
Julia tests don't pass #13

Open
timholy opened this issue Feb 10, 2019 · 20 comments
Labels
help wanted Extra attention is needed

Comments

@timholy
Member

timholy commented Feb 10, 2019

Here's a status list for progress with JuliaInterpreter, running Julia's own test suite. It's organized by the number of passes, failures (ideally 0), errors (ideally 0), broken tests (these are not JuliaInterpreter's problem), and aborted blocks (tests that took too long, given the settings). Some tests error outside a @test (marked by "X" in the table below), and others cause Julia itself to exit (marked by ☠️).
The tests below were run on a multiprocessor server from the Linux command line with

$ JULIA_CPU_THREADS=8 julia --startup-file=no juliatests.jl --nstmts 1000000 --skip compiler

The --nstmts 1000000 option lets you control the maximum number of interpreter statements per lowered block of code; tests that require more than this are marked as "aborted." The default setting is 10000 (10^4). In general, the higher this number, the more tests should finish, but the longer the suite will take to run. On my laptop, running with 2 worker processes, the entire suite completes in less than 5 minutes with the default settings.

The remaining arguments are the same as those given to Julia's own test/runtests.jl: you can either provide a list of tests you want to run (e.g., julia --startup-file=no juliatests.jl ambiguous), or you can list some to skip (here, all the compiler/* tests). Leaving the list blank runs all the tests, so the line above runs everything except those in compiler/*.

The key point of having a status list is that it allows us to discover issues with JuliaInterpreter; consequently, the next step is to use these results to fix those problems. Help is very much wanted! Here are good ways to help out:

  • (moderate) investigate failures and file an issue with an MWE. Highest priority should probably go to the ones that caused errors or process exit (note: with the possible exceptions of channels, worlds, and arrayops, it appears that most such errors were due to a single cause, MWE of char crash #28; deleting this block and rebuilding Julia fixes them) (EDIT: all of these appear to be fixed now). Next in priority are errors that occur outside of tests (the Xs), then errors that occur inside a @test (those marked as Errors by the test suite), then failures, and lowest in priority, the aborted blocks. Note that aborted blocks can lead to test failures due to repeated work (see Compiled resumers #44), so many of these may go away if you increase nstmts. However, an aborted block could also indicate that the interpreter has incorrectly gotten itself stuck in an infinite loop (yes, the author has seen that happen), so it's possible that some of these too are actually errors.
  • (hard) fix the bugs.

A good way to get started is to pick one test that's exhibiting problems and uncomment these lines. Then the easiest way to dive in is to run tests in a REPL session, e.g.,

include("utils.jl")
const juliadir = dirname(dirname(Sys.BINDIR))
const testdir = joinpath(juliadir, "test")
configure_test()
nstmts = 10000
run_test_by_eval("ambiguous", joinpath(testdir, "ambiguous.jl"), nstmts)  # replace ambiguous with whatever test you want to run

from within JuliaInterpreter's test/ directory. If you get failures, make sure you first check whether they go away if you increase nstmts (typically by 10x or more).

When you see test errors, the expression printed right above it is the one causing the problem. Go into the source code and copy the relevant source lines into a quote block. Once you have a minimal expression ex that triggers a problem, do this:

modexs, _ = JuliaInterpreter.split_expressions(m, ex)  # one (module, expression) pair per block
for modex in modexs
    frame = JuliaInterpreter.prepare_thunk(modex)
    nstmtsleft = nstmts
    while true
        ret, nstmtsleft = evaluate_limited!(frame, nstmtsleft, true)
        if isa(ret, Aborted)       # statement budget exhausted: fall back to compiled mode
            run_compiled(frame)
            break
        elseif isa(ret, Some)      # evaluation finished normally
            break
        end
    end
end

where m is the module you want to execute this in. You may want to do

module JuliaTests
using Test, Random
end
m = JuliaTests

to isolate the tests from your current session.
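Putting the pieces together, a complete session run from JuliaInterpreter's test/ directory (after include("utils.jl") and configure_test() as above) might look like the following; the ex here is a hypothetical stand-in for whatever minimal expression you extracted:

```julia
# Isolated module so the test code doesn't pollute your session.
module JuliaTests
using Test, Random
end
m = JuliaTests

# Hypothetical minimal expression; replace with the lines copied from a test file.
ex = quote
    x = 1 + 1
    @test x == 2
end

nstmts = 10_000
modexs, _ = JuliaInterpreter.split_expressions(m, ex)
for modex in modexs
    frame = JuliaInterpreter.prepare_thunk(modex)
    nstmtsleft = nstmts
    while true
        ret, nstmtsleft = evaluate_limited!(frame, nstmtsleft, true)
        if isa(ret, Aborted)
            run_compiled(frame)
            break
        elseif isa(ret, Some)
            break
        end
    end
end
```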

To diagnose problems in greater detail, uncommenting these lines can be a great first step.

Without further ado, here's the current list (note the time of the run to determine how current this is):

Julia Version 1.1.1-pre.0
Commit a84cf6f56c (2019-01-22 04:33 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, haswell)
Test run at: 2019-02-26T11:36:49.456

Maximum number of statements per lowered expression: 1000000

Test file Passes Fails Errors Broken Aborted blocks
ambiguous 62 0 0 2 0
subarray 281 0 0 0 1
strings/basic 87293 0 0 0 3
strings/search 522 0 0 0 0
strings/util 449 0 0 0 0
strings/io 12749 0 0 0 1
strings/types 2302688 0 0 0 3
unicode/utf8 19 0 0 0 0
core X X X X X
worlds X X X X X
keywordargs 126 0 1 0 0
numbers 1387242 0 0 0 9
subtype X X X X X
char 1522 0 0 0 0
triplequote 28 0 0 0 0
intrinsics 44 0 0 0 0
dict X X X X X
hashing X X X X X
iobuffer 200 0 0 0 0
staged 55 5 0 0 0
offsetarray 341 11 0 0 1
arrayops 1833 0 0 0 7
tuple 483 0 1 0 0
reduce 292 0 0 0 2
reducedim 689 0 0 0 1
abstractarray 1791 0 0 0 1
intfuncs 4410 0 0 0 0
simdloop X X X X X
vecelement X X X X X
rational 97522 0 0 0 2
bitarray 897826 0 0 0 9
copy 511 0 1 0 1
math X X X X X
fastmath 907 3 3 0 0
functional 95 0 0 0 0
iterators 1555 0 0 0 2
operators 12922 0 0 0 1
path 274 0 0 12 2
ccall X X X X X
parse 10303 0 0 0 1
loading 2272 289 4 0 9
bigint 2156 0 0 0 4
sorting 4864 0 0 0 4
spawn X X X X X
backtrace 5 9 12 1 0
exceptions 27 19 6 0 0
file X X X X X
read X X X X X
version 2468 0 0 0 1
namedtuple 152 0 8 1 0
mpfr 932 0 0 0 0
broadcast 418 0 5 0 2
complex 8250 0 0 2 1
floatapprox 49 0 0 0 0
reflection X X X X X
regex 29 0 0 0 0
float16 124 0 0 0 0
combinatorics 98 0 0 0 1
sysinfo 2 0 0 0 0
env 53 0 0 0 0
rounding 112720 0 0 0 2
ranges 12109069 2 0 327755 7
mod2pi 80 0 0 0 0
euler 12 0 0 0 5
show X X X X X
errorshow X X X X X
sets 773 0 0 1 1
goto X X X X X
llvmcall X X X X X
llvmcall2 6 0 0 0 0
grisu 683 1 0 0 1
some 64 0 0 0 0
meta X X X X X
stacktraces X X X X X
docs X X X X X
misc X X X X X
threads X X X X X
enums 88 0 0 0 0
cmdlineargs X X X X X
int 10727 0 0 0 0
checked 1219 0 0 0 0
bitset 192 0 0 0 0
floatfuncs 134 0 0 0 1
boundscheck X X X X X
error 30 0 0 0 0
cartesian 7 0 0 0 0
osutils 42 0 0 0 0
channels X X X X X
iostream 6 0 2 0 0
secretbuffer 16 0 0 0 0
specificity X X X X X
reinterpretarray 118 0 0 0 1
syntax X X X X X
logging 117 2 0 0 0
missing 406 0 0 1 1
asyncmap 292 0 0 0 0
SparseArrays/higherorderfns 7000 79 0 73 7
SparseArrays/sparse 2184 0 0 0 19
SparseArrays/sparsevector 9921 0 0 0 5
Pkg/resolve 182 0 0 0 3
LinearAlgebra/triangular 33194 0 0 0 2
LinearAlgebra/qr 3120 0 0 0 1
LinearAlgebra/dense 7720 0 0 0 7
LinearAlgebra/matmul 711 0 0 0 3
LinearAlgebra/schur 390 0 0 0 1
LinearAlgebra/special 1068 0 0 0 3
LinearAlgebra/eigen 406 0 0 0 2
LinearAlgebra/bunchkaufman 5145 0 0 0 1
LinearAlgebra/svd 412 0 0 0 1
LinearAlgebra/lapack 778 2 0 0 3
LinearAlgebra/tridiag 1222 0 0 0 2
LinearAlgebra/bidiag 1946 0 0 0 1
LinearAlgebra/diagonal 1607 0 0 0 2
LinearAlgebra/cholesky 2194 0 0 0 1
LinearAlgebra/lu 1191 0 0 0 3
LinearAlgebra/symmetric 1982 0 0 0 1
LinearAlgebra/generic 430 0 0 0 3
LinearAlgebra/uniformscaling 338 0 0 0 0
LinearAlgebra/lq 1253 0 0 0 1
LinearAlgebra/hessenberg 40 0 0 0 0
LinearAlgebra/blas 628 0 0 0 1
LinearAlgebra/adjtrans 253 0 0 0 1
LinearAlgebra/pinv 288 0 0 0 1
LinearAlgebra/givens 1840 0 0 0 1
LinearAlgebra/structuredbroadcast 408 0 0 0 2
LibGit2/libgit2 219 2 59 1 0
Dates/accessors X X X X X
Dates/adjusters X X X X X
Dates/query 988 0 0 0 0
Dates/periods 681 0 0 0 0
Dates/ranges 349123 0 0 0 5
Dates/rounding 296 0 0 0 0
Dates/types 171 0 0 0 0
Dates/io 258 0 0 0 1
Dates/arithmetic 318 0 0 0 0
Dates/conversions 160 0 0 0 0
Base64 1015 0 0 0 1
CRC32c 658 0 6 0 0
DelimitedFiles 80 0 1 0 1
FileWatching X X X X X
Future 0 0 0 0 0
InteractiveUtils 104 3 2 0 4
Libdl X X X X X
Logging 35 1 0 0 1
Markdown 232 0 0 0 0
Mmap 131 0 0 0 1
Printf 701 38 0 0 0
Profile 10 0 0 0 2
REPL 990 0 0 5 0
Random 203081 4 0 0 7
SHA X X X X X
Serialization 105 1 1 0 0
Sockets X X X X X
Statistics 606 0 0 0 4
SuiteSparse 770 0 0 0 0
Test X X X X X
UUIDs 22 0 0 0 0
Unicode 752 0 0 0 0
@KristofferC
Member

For strings/basic we get stuck at:

ex = quote
            let b, n
                for T = (UInt8, Int8, UInt16, Int16, UInt32, Int32, UInt64, Int64, UInt128, Int128, BigInt), b = 2:62, _ = 1:10
                    n = if T != BigInt
                            rand(T)
                        else
                            BigInt(rand(Int128))
                        end
                end
            end
        end

I think we will have some problem finishing tests that do this type of combinatorial testing.

@timholy
Member Author

timholy commented Feb 13, 2019

Thanks! Agreed this is an issue. Plans here: #12 (comment)

@timholy
Copy link
Member Author

timholy commented Feb 14, 2019

The numbers seem substantially better on a relatively recent build of Julia's master branch. I found that Julia 1.0.x doesn't even finish, at least on an earlier version of the script (though I didn't investigate why).

I also ran with a much higher nstmts threshold, so this is much more comprehensive than what I first posted. And it wasn't even that slow, maybe 10 minutes on an 8-worker run.

@timholy
Member Author

timholy commented Feb 14, 2019

Now that #24 is merged, the OP has been updated to provide helpful links about how to debug problems.

@KristofferC
Member

KristofferC commented Feb 14, 2019

One worker for me seems to get stuck at something and consumes a pretty hefty amount of memory:
[screenshot: memory usage graph]

I'll try to figure out what expression it is getting stuck at.

@KristofferC
Member

With the stdlibs now running, it seems the REPL tests deadlock. Perhaps this has something to do with the tasks in the REPL tests (which are notorious for deadlocking when something goes wrong). In this case nstmts doesn't save us, so perhaps another option is to add a timer that aborts a test once it has been running for too long?
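For reference, one possible shape for such a timer (a sketch with assumed semantics, not something the harness currently does; interrupting a genuinely deadlocked task may still fail, since the exception is only delivered at a yield point):

```julia
# Race f() against a wall-clock limit; return nothing on timeout.
function run_with_timeout(f, seconds)
    t = @async f()
    if timedwait(() -> istaskdone(t), float(seconds)) === :timed_out
        schedule(t, InterruptException(); error=true)  # best-effort abort
        return nothing
    end
    return fetch(t)
end
```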

@KristofferC
Member

FWIW, here are the results from running all stdlibs (except REPL):

Julia Version 1.2.0-DEV.321
Commit a03da7312e* (2019-02-14 05:43 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Maximum number of statements per lowered expression: 1000000

Test file Passes Fails Errors Broken Aborted blocks
SparseArrays/higherorderfns X X X X X
SparseArrays/sparse X X X X X
SparseArrays/sparsevector X X X X X
Pkg/resolve 128 0 0 0 1
LinearAlgebra/triangular 219 0 0 0 1
LinearAlgebra/qr 20 0 0 0 0
LinearAlgebra/dense 237 0 0 0 5
LinearAlgebra/matmul 283 0 0 0 4
LinearAlgebra/schur 5 0 0 0 0
LinearAlgebra/special 571 0 0 0 5
LinearAlgebra/eigen X X X X X
LinearAlgebra/bunchkaufman 14 0 0 0 0
LinearAlgebra/svd 16 0 0 0 0
LinearAlgebra/lapack 240 0 0 0 1
LinearAlgebra/tridiag 11 0 0 0 1
LinearAlgebra/bidiag 35 0 0 0 0
LinearAlgebra/diagonal 228 0 0 0 1
LinearAlgebra/cholesky 121 0 0 0 0
LinearAlgebra/lu 9 0 0 0 2
LinearAlgebra/symmetric 109 0 0 0 1
LinearAlgebra/generic 192 0 0 0 2
LinearAlgebra/uniformscaling 325 0 0 0 0
LinearAlgebra/lq 36 0 0 0 0
LinearAlgebra/hessenberg 3 0 0 0 0
LinearAlgebra/blas 167 0 0 0 0
LinearAlgebra/adjtrans X X X X X
LinearAlgebra/pinv 0 0 0 0 1
LinearAlgebra/givens 4 0 0 0 0
LinearAlgebra/structuredbroadcast 40 0 0 0 2
LibGit2/libgit2 X X X X X
Dates/accessors 13695 0 0 0 4
Dates/adjusters 2757 0 0 0 3
Dates/query 988 0 0 0 0
Dates/periods X X X X X
Dates/ranges 1609 0 0 0 5
Dates/rounding 198 0 0 0 0
Dates/types 122 0 0 0 0
Dates/io 77 0 0 0 0
Dates/arithmetic 287 0 0 0 0
Dates/conversions 66 0 0 0 0
Base64 X X X X X
CRC32c 652 0 0 0 0
DelimitedFiles X X X X X
FileWatching X X X X X
Future 0 0 0 0 0
InteractiveUtils X X X X X
Libdl 78 0 0 0 0
Logging 3 0 0 0 0
Markdown X X X X X
Mmap X X X X X
Printf 586 0 0 0 0
Profile 5 0 0 0 2
Random 1191 10 0 0 8
Serialization X X X X X
SHA X X X X X
Sockets 55 0 0 0 0
Statistics 288 0 0 0 3
SuiteSparse 778 0 0 0 0
Test X X X X X
Unicode 741 0 0 0 0
UUIDs 19 0 0 0 0

@timholy
Member Author

timholy commented Feb 15, 2019

We are really getting somewhere! I am loving this collaboration.

@KristofferC
Member

I took the liberty of updating the table with the latest stdlib run on master.

@timholy
Member Author

timholy commented Feb 18, 2019

How should we handle "commentary"? For example, it looks like all failures in offsetarray are due to tests like @allocated(minimum!(R, B)) <= 400 or are a consequence of an Abort.

I'm still not sure why that test is shown as X.

@GunnarFarneback

A number of tests (at least simdloop, vecelement, ccall, and llvmcall) run into

ERROR: this intrinsic must be compiled to be called

Are there any plans for how to handle that?

@timholy
Member Author

timholy commented Feb 19, 2019

Interesting, I hadn't gotten that far yet.

Given that there have already been quite a few bugs related to ccall, I'm beginning to wonder whether we should change strategy. What would people think about handling this in optimize!: any time you encounter a foreigncall or llvmcall statement, compile a #handle_foreigncall##327(args...) function, then replace the foreigncall expression with something like

Expr(:foreigncall, QuoteValue(#handle_foreigncall##327), args_expressed_as_SSAValues_etc...)

Then when we handle this expression in evaluate_foreigncall!, we simply dispatch on it directly rather than building up an Expr and calling Core.eval. EDIT: we might want to mangle the lib and function symbols into the name, just so it's clearer for people inspecting these frames when debugging.

Some subtleties:

  • obviously this is not a canonical :foreigncall expression, but since all frames that get evaluated have passed through optimize!, then this seems like a valid abuse. Alternatively, we could make this a :call expression and somehow mark it to be run in Compiled mode.
  • I'm not sure, but I'm hoping this isn't so weird as to break things like find_used; this design deliberately hews to what's permitted in lowered AST, with the obvious exception of being completely unconventional for :foreigncall itself.
  • it might improve performance; that Core.eval is not cheap.

Any thoughts? This is just spitballing, I haven't tried any of this yet.
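To make the sketch concrete, here is one way the wrapper generation could look. Everything below is hypothetical spitballing in code form, not the actual optimize! implementation, and it ignores details a real :foreigncall carries (library, calling convention, GC roots):

```julia
# Generate and compile a function that performs one specific foreign call,
# so the interpreter can dispatch to it directly instead of eval'ing an Expr.
# Hypothetical sketch: assumes a symbol resolvable in the default library.
function make_foreigncall_wrapper(fname::Symbol, RetType, argtypes)
    args = [gensym("arg") for _ in argtypes]
    name = gensym(string("handle_foreigncall_", fname))  # mangle the function name in
    @eval $name($(args...)) =
        ccall($(QuoteNode(fname)), $RetType, ($(argtypes...),), $(args...))
end

strlen_wrapper = make_foreigncall_wrapper(:strlen, Csize_t, (Cstring,))
strlen_wrapper("hello")  # returns 5 (as a Csize_t)
```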

KristofferC pushed a commit that referenced this issue Feb 20, 2019
@timholy
Member Author

timholy commented Feb 26, 2019

It seems all the "kills Julia" items have been resolved. There are, however, a bunch of (smallish?) issues that cause runtime errors in the tests (the Xs above). I've been cranking through a few of them, and a pattern is emerging: just blacklist the affected methods. You can do that either by module (see the example in #75) or by method (#77, #72, #68).

Now I'm going to shift my attention to implementing breakpoints. If anyone else wants to join the fun, these are pretty easy fixes, if you just follow the instructions at the top and look at the models in those recent linked PRs. (For anything that looks like a Julia bug, it's also good to report that.)

EDIT: if I were to guess blindly, tests like core and subtype will be much harder to figure out than, say, dict, so be somewhat judicious in what you start with 😉.

@macd
Contributor

macd commented Mar 4, 2019

Are you still looking for help here? I've got some cycles, but I'm relatively inexperienced with Julia at this level of detail. I did notice that the calls to handle_err in test/utils.jl should be new_pc = handle_err(stack, frame, pc, err) instead of new_pc = handle_err(frame, err)?? I ran the hashing module following the instructions above (using today's master); most of the tests pass, but I saw the following:
mod = Main.JuliaTests
ex = quote
#= /home/macd/julia/test/hashing.jl:188 =#
let a = Expr(:block, Core.TypedSlot(1, Any)), b = Expr(:block, Core.TypedSlot(1, Any)), c = Expr(:block, Core.TypedSlot(3, Any))
#= /home/macd/julia/test/hashing.jl:191 =#
#= /home/macd/julia/test/hashing.jl:191 =# @test a == b && hash(a) == hash(b)
#= /home/macd/julia/test/hashing.jl:192 =#
#= /home/macd/julia/test/hashing.jl:192 =# @test a != c && hash(a) != hash(c)
#= /home/macd/julia/test/hashing.jl:193 =#
#= /home/macd/julia/test/hashing.jl:193 =# @test b != c && hash(b) != hash(c)
end
end
hashing: Error During Test at /home/macd/julia/test/hashing.jl:191
Test threw exception
Expression: a == b && hash(a) == hash(b)
syntax: Slot objects should not occur in an AST
hashing: Error During Test at /home/macd/julia/test/hashing.jl:192
Test threw exception
Expression: a != c && hash(a) != hash(c)
syntax: Slot objects should not occur in an AST
hashing: Error During Test at /home/macd/julia/test/hashing.jl:193
Test threw exception
Expression: b != c && hash(b) != hash(c)
syntax: Slot objects should not occur in an AST

@timholy
Member Author

timholy commented Mar 5, 2019

Thanks @macd! I moved this to its own issue, #92.

Best to quote code using backticks, or user "test" will become annoyed (sorry).

@timholy
Member Author

timholy commented Mar 12, 2019

We have started the 3-day release train. @KristofferC, @pfitzseb, and I are going to be quite busy getting dependent packages (Debugger.jl, LoweredCodeUtils.jl, Revise.jl, Rebugger.jl, and of course Juno itself) in shape. At the same time, a late-breaking change c5c3ec8 is a bit scary in terms of the potential implications for this issue. If any users want to pitch in to find out, we busy developers would be most grateful! It doesn't have to be perfect, but we do want to iron out the most obvious issues before it becomes fully registered.

CC some of our former heroes, @GunnarFarneback and @macd.

@macd
Contributor

macd commented Mar 12, 2019

Wow, it looks like a lot has changed in such a short time! Just a note to others: you do need to rebuild the package after doing a git pull (maybe obvious, but I forgot).

@timholy
Member Author

timholy commented Mar 12, 2019

Yes, it needs to be rebuilt. Not obvious. Manual management is needed only for people who have this on dev; once it becomes easier to have it on add, it should rebuild automatically every time a new version is released.

You also need to build separately for each different minor Julia release.

@KristofferC
Member

KristofferC commented Mar 12, 2019

I wonder if the file should just be generated automatically during precompilation if it doesn't exist, instead of having this as a build step. If you add the package on 1.1 and then add the same version on 1.0, it will not be rebuilt (packages are only built when they are downloaded) and will error when loading.

The current build system doesn't work well when you need to build multiple times for different julia versions.
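A sketch of that alternative, with hypothetical file and function names (generate_builtins is an assumed generator, not an actual API here), creating the version-specific file from the package's top-level code at precompile time rather than in deps/build.jl:

```julia
# In the package's main source file, executed during precompilation.
# All paths and the generator script are hypothetical stand-ins.
const builtins_file = joinpath(@__DIR__, "..", "deps",
                               "builtins-$(VERSION.major).$(VERSION.minor).jl")
if !isfile(builtins_file)
    include("generate_builtins.jl")   # hypothetical: defines generate_builtins
    generate_builtins(builtins_file)  # writes the file for this Julia version
end
include(builtins_file)
```

Because the generated file name encodes the Julia minor version, adding the same package version on a different Julia would regenerate the file rather than error.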

@timholy
Member Author

timholy commented Mar 14, 2019

Oops, I had some broken stuff in test/utils.jl. Should be fixed now. The OP has also been updated to the new API (I hope).
