Inference failure with 0.6 + StaticArrays hcat #21244

Closed
tkoolen opened this issue Mar 31, 2017 · 16 comments · Fixed by #21933

@tkoolen (Contributor) commented Mar 31, 2017

With StaticArrays latest master,

using StaticArrays

struct Foo
    bar::SMatrix{3, 3, Float64}
    Foo() = new(eye(SMatrix{3, 3, Float64}))
end

function f(foo::Foo)
    hcat(foo.bar[:, SVector(1, 2)], zeros(SMatrix{3, 3, Float64}))
end

foo = Foo()
f(foo)

sometimes results in

WARNING: An error occurred during inference. Type inference is now partially disabled.

signal (11): Segmentation fault
while loading /home/twan/code/RigidBodyDynamics/v0.6/RigidBodyDynamics/segfault.jl, in expression starting on line 13
_ZN4llvm17ScheduleDAGMILive19updatePressureDiffsENS_8ArrayRefINS_16RegisterMaskPairEEE at /home/twan/code/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
_ZN4llvm17ScheduleDAGMILive15initRegPressureEv at /home/twan/code/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
_ZN4llvm17ScheduleDAGMILive8scheduleEv at /home/twan/code/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
unknown function (ip: 0x7fd185333fd0)
unknown function (ip: 0x7fd18533c83f)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at /home/twan/code/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/twan/code/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/twan/code/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/twan/code/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
operator() at /home/twan/code/julia/src/jitlayers.cpp:442 [inlined]
_M_invoke at /usr/include/c++/5/functional:1857
operator() at /usr/include/c++/5/functional:2267 [inlined]
addModuleSet<llvm::SmallVector<std::unique_ptr<llvm::Module>, 1u>, llvm::RTDyldMemoryManager*, std::unique_ptr<llvm::orc::LambdaResolver<JuliaOJIT::addModule(std::unique_ptr<llvm::Module>)::<lambda(const string&)>, JuliaOJIT::addModule(std::unique_ptr<llvm::Module>)::<lambda(const string&)> >, std::default_delete<llvm::orc::LambdaResolver<JuliaOJIT::addModule(std::unique_ptr<llvm::Module>)::<lambda(const string&)>, JuliaOJIT::addModule(std::unique_ptr<llvm::Module>)::<lambda(const string&)> > > > > at /home/twan/code/julia/usr/include/llvm/ExecutionEngine/Orc/IRCompileLayer.h:73 [inlined]
addModule at /home/twan/code/julia/src/jitlayers.cpp:569
jl_add_to_ee at /home/twan/code/julia/src/jitlayers.cpp:792 [inlined]
jl_finalize_function at /home/twan/code/julia/src/jitlayers.cpp:803
getAddressForFunction at /home/twan/code/julia/src/codegen.cpp:1360
jl_generate_fptr at /home/twan/code/julia/src/codegen.cpp:1457
jl_compile_method_internal at /home/twan/code/julia/src/julia_internal.h:313 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:341 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
println at ./boot.jl:378 [inlined]
println at ./boot.jl:382
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
typeinf_loop at ./inference.jl:2657
typeinf_frame at ./inference.jl:2467
typeinf_code at ./inference.jl:2537
unknown function (ip: 0x7fd18128c5dd)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
typeinf_ext at ./inference.jl:2576
unknown function (ip: 0x7fd18126bfe2)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
jl_apply at /home/twan/code/julia/src/julia.h:1416 [inlined]
jl_type_infer at /home/twan/code/julia/src/gf.c:269
jl_compile_for_dispatch at /home/twan/code/julia/src/gf.c:1629
jl_compile_method_internal at /home/twan/code/julia/src/julia_internal.h:294 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:341 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
f at /home/twan/code/RigidBodyDynamics/v0.6/RigidBodyDynamics/segfault.jl:9
unknown function (ip: 0x7fcf53f60792)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
do_call at /home/twan/code/julia/src/interpreter.c:75
eval at /home/twan/code/julia/src/interpreter.c:242
jl_interpret_toplevel_expr at /home/twan/code/julia/src/interpreter.c:34
jl_toplevel_eval_flex at /home/twan/code/julia/src/toplevel.c:577
jl_parse_eval_all at /home/twan/code/julia/src/ast.c:873
jl_load at /home/twan/code/julia/src/toplevel.c:616
include_from_node1 at ./loading.jl:539
unknown function (ip: 0x7fd1813edf3b)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
include at ./sysimg.jl:14
unknown function (ip: 0x7fd18129167b)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
process_options at ./client.jl:305
_start at ./client.jl:371
unknown function (ip: 0x7fd18141b578)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
jl_apply at /home/twan/code/julia/ui/../src/julia.h:1416 [inlined]
true_main at /home/twan/code/julia/ui/repl.c:127
main at /home/twan/code/julia/ui/repl.c:264
__libc_start_main at /build/glibc-9tT8Do/glibc-2.23/csu/../csu/libc-start.c:291
unknown function (ip: 0x401668)
Allocations: 3910719 (Pool: 3909327; Big: 1392); GC: 5
Segmentation fault (core dumped)

other times in

WARNING: An error occurred during inference. Type inference is now partially disabled.
StackOverflowError()
forall_exists_subtype at /home/twan/code/julia/src/subtype.c:881
jl_subtype_env at /home/twan/code/julia/src/subtype.c:948
abstract_eval at ./inference.jl:1902
unknown function (ip: 0x7efefe0b0df6)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
copy! at ./abstractarray.jl:565
abstract_eval_call at ./inference.jl:1866
abstract_eval at ./inference.jl:1915
unknown function (ip: 0x7efefe0b0df6)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234
copy! at ./abstractarray.jl:565
abstract_eval_call at ./inference.jl:1866
abstract_eval at ./inference.jl:1915
unknown function (ip: 0x7efefe0b0df6)
jl_call_fptr_internal at /home/twan/code/julia/src/julia_internal.h:326 [inlined]
jl_call_method_internal at /home/twan/code/julia/src/julia_internal.h:345 [inlined]
jl_apply_generic at /home/twan/code/julia/src/gf.c:2234

<repeating>

versioninfo():

Julia Version 0.6.0-pre.alpha.329
Commit cded1db (2017-03-31 00:28 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6950X CPU @ 3.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

StaticArrays SHA: bd3236ac39372aa84891bea066c71d63559af29e

cc: @andyferris

@Keno (Member) commented Mar 31, 2017

StaticArrays is doing bad things with its _index_shape method. I had that fixed locally, but haven't put it up yet.

@tkoolen (Contributor, Author) commented Mar 31, 2017

I don't see an _index_shape method in StaticArrays master or 0.4.0, but I look forward to a PR!

@Keno (Member) commented Mar 31, 2017

I misremembered the name: https://github.com/JuliaArrays/StaticArrays.jl/blob/master/src/indexing.jl#L55-L66

@andyferris (Member):

I'm curious - what is the problem with these methods?

@Keno (Member) commented Apr 1, 2017

Your @pure increment bypasses inference's termination heuristics, and you have no terminating case. The methods can be written either using a decrement or by recursing over the tuple argument.
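
For illustration, a minimal sketch of the "recurse over the tuple argument" variant; the names and the plain-tuple size representation here are hypothetical, not StaticArrays' actual API. The only thing that changes between recursive calls is a shrinking varargs tuple, so no @pure counter is involved and inference's usual argument-widening heuristics apply.

# Sketch only: collect one entry per index without a Val{N} counter.
# `dims` stands in for the remaining static dimensions, `inds` for the
# remaining indices; both shrink by one element per recursive call.
collect_sizes(dims::Tuple, acc::Tuple) = acc
collect_sizes(dims::Tuple, acc::Tuple, ::Int, inds...) =
    collect_sizes(Base.tail(dims), (acc..., 1), inds...)
collect_sizes(dims::Tuple, acc::Tuple, ::Colon, inds...) =
    collect_sizes(Base.tail(dims), (acc..., first(dims)), inds...)

collect_sizes((3, 4), (), 2, :)   # -> (1, 4): the Colon picks up dimension 4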

@martinholters (Member):

To expand a bit: inference ensures termination by limiting the complexity of types in terms of nesting depth, tuple length, etc. But if the argument to index_sizes cannot be inferred, then there is no termination criterion for the recursion of _index_sizes: the only type that changes from one recursive call to the next is Val{1}, Val{2}, ..., which stays at very low complexity. Hence, inference tries to infer an effectively unbounded number of specializations of _index_sizes, and due to the recursive nature of inference, the result is a stack overflow. (I hope I got that right?)
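
As a contrast, a minimal self-contained sketch (illustrative only, not from StaticArrays) of a recursion that these existing limits do catch: the argument that grows is a tuple, so its inferred type is eventually widened and inference converges.

# Sketch: the accumulator tuple grows on every call, so inference widens its
# type (toward something like Tuple{Vararg{Int}}) instead of tracking ever
# longer tuple types, and the analysis terminates.
grow(acc::Tuple, n::Int) = n == 0 ? acc : grow((acc..., 1), n - 1)

grow((), 3)   # -> (1, 1, 1)
# code_warntype(grow, Tuple{Tuple{}, Int}) completes; the growing Val{N} counter
# above has no analogous widening, which is exactly the problem described here.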

But while for this particular case a fix to StaticArrays seems in order, we might want to keep this issue open as a reminder that the inference termination heuristics are far from perfect. (Or do we have a dedicated issue for that?)

@andyferris (Member):

While increment does indeed "continue forever" (it's a generator for the natural numbers), used in conjunction with _index_sizes it seems it could only ever iterate a small number of times, namely the number of input arguments to _index_sizes; there is a terminating case for _index_sizes once the varargs run out.

In the exceedingly rare case that there is a very large number of input arguments, the varargs would be so long that inference gives up and widens, and increment would then be called dynamically.

Sorry, I'm just a bit confused what the problem is here, and what the solution is. Do I have a bug that makes _index_sizes non-terminating? Is inference getting confused about something that it shouldn't be confused about?

@martinholters (Member):

While _index_sizes is indeed terminating, inference might fail to find the exact type of the inds parameter, because the call site splats something of unknown (or un-inferable) length (a call site of this kind is sketched below). Say inds is inferred as Tuple. Then, since its first element might be an Int, inference of _index_sizes(Val{1}, (), inds...) needs to consider the method _index_sizes(v::Type{Val{N}}, x::Tuple, ::Int, inds...), specialized as _index_sizes(v::Type{Val{1}}, x::Tuple{}, ::Int, inds...) with inds::Tuple again. That, in turn, needs to consider (among others) the same method once more, specialized as _index_sizes(v::Type{Val{2}}, x::Tuple{Size}, ::Int, inds...), again with inds::Tuple. This iterates, and while the type of x will at some point be limited to Tuple (or maybe Tuple{Vararg{Size}}, not 100% sure), the Val parameter just keeps increasing, producing a different specialization every time.

Oh, I should mention that I haven't checked whether this is the actual cause of the OP problem.
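
A self-contained sketch (with hypothetical names) of the kind of call site meant by "splats something of unknown length": splatting from untyped storage hides both the number and the types of the arguments from inference.

# Sketch: `boxed` is a Vector{Any}, so inference sees neither the length nor
# the element types of the splat; inside `describe`, inds is only known as
# Tuple{Vararg{Any}}.
describe(inds...) = length(inds)

const boxed = Any[1, :]
call_site() = describe(boxed...)

call_site()   # returns 2 at runtime; inference only sees inds::Tuple{Vararg{Any}}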

@vtjnash (Member) commented Apr 3, 2017

Yes. At this time, it is assumed that an @pure function will never be called in a recursive context. While that is not invalid, inference currently lacks the logic to detect this situation and abort on the recursion before it can cause the crash observed here.

@martinholters (Member):

Ok, confirmed insofar as replacing all _index_sizes methods with the specializations needed for the OP case

@inline _index_sizes(s::Size, v::Type{Val{1}}, x::Tuple, a::Colon, inds...) = _index_sizes(s, increment(v), (x..., Size(s[1])), inds...)
@inline _index_sizes(s::Size, v::Type{Val{2}}, x::Tuple, a::StaticArray, inds...) = _index_sizes(s, increment(v), (x..., Size(a)), inds...)
@inline _index_sizes(s::Size, ::Type{Val{3}}, x::Tuple) = x

fixes it.

So let me ask again, should we keep this issue open as a reminder that the inference termination heuristics are far from perfect, or do we have a dedicated issue for that?

@JeffBezanson added the bug label Apr 3, 2017
@JeffBezanson added this to the 0.6.x milestone Apr 4, 2017
martinholters added a commit to martinholters/StaticArrays.jl that referenced this issue Apr 7, 2017
@martinholters (Member):

Anticipating this to be fixed in StaticArrays, here is a self-contained, somewhat reduced repro of the underlying problem for future reference:

immutable Size{s}; end

Base.@pure increment(::Type{Val{N}}) where {N} = Val{N+1}

@inline index_sizes(s::Size, inds...) = _index_sizes(s, Val{1}, (), inds...)
@inline _index_sizes(s::Size, ::Type{Val{N}}, x::Tuple) where {N} = x
@inline _index_sizes(s::Size, v::Type{Val{N}}, x::Tuple, ::Int, inds...) where {N} = _index_sizes(s, increment(v), (x..., Size()), inds...)
@inline _index_sizes(s::Size, v::Type{Val{N}}, x::Tuple, a::Colon, inds...) where {N} = _index_sizes(s, increment(v), (x..., Size(s[N])), inds...)

code_warntype(index_sizes, Tuple{Vararg{Any}})

@martinholters (Member) commented Apr 7, 2017

While working on fixing it in StaticArrays, I found another case that is also quite entertaining; a reduced, self-contained repro:

immutable Size{s}; end

# EDIT: The @pure here is not necessary to trigger the problem
#=Base.@pure=# tail(::Type{Size{S}}) where {S} = Size{Base.tail(S)} 
@inline tail(::S) where {S<:Size} = tail(S)()

@inline index_sizes(s::Size, inds...) = _index_sizes(s, (), inds...)
@inline _index_sizes(s::Size, x::Tuple) = x
@inline _index_sizes(s::Size, x::Tuple, ::Int, inds...) = _index_sizes(tail(s), (x..., Size()), inds...)
@inline _index_sizes(s::Size, x::Tuple, a::Colon, inds...) = _index_sizes(tail(s), (x..., Size(s[1])), inds...)

for n in 1:16
    t = @elapsed code_warntype(DevNull, index_sizes, NTuple{n, Any})
    println("$n: $t")
end
@elapsed code_warntype(DevNull, index_sizes, Tuple{Vararg{Any}})

Exponential growth in run-time, similar to #20848 (comment), but even more pronounced.

@vtjnash (Member) commented Apr 25, 2017

We will need to refactor the inference code somewhat substantially to be able to track this information correctly. For now, I simply recommend avoiding marking functions as @pure whenever possible (the inference fix for this example will necessarily be to ignore the @pure declaration anyway).

@martinholters (Member):

For the OP case, where the recursion is infinitely deep, active will hold the same method a great many times, which could be detected in order to bail out, e.g. like so:

--- a/base/inference.jl
+++ b/base/inference.jl
@@ -1282,6 +1282,18 @@ function abstract_call_gf_by_type(f::ANY, atype::ANY, sv::InferenceState)
             fullmatch = true
         end
 
+        method_active_count = 0
+        for infstate in active
+            infstate === nothing && continue
+            infstate = infstate::InferenceState
+            if isdefined(infstate.linfo, :def) && method === infstate.linfo.def
+                method_active_count += 1
+            end
+        end
+        if method_active_count > 200 # or whatever limit makes most sense
+            sig = method.sig
+        end
+
         # limit argument type tuple growth
         msig = unwrap_unionall(m[3].sig)
         lsig = length(msig.parameters)

This fixes the case from #21244 (comment) (which should be equivalent to the OP case) and does not seem to have any adverse effects at first sight. I don't know how kosher this is: for a recursion looping through multiple functions/methods, where this heuristic kicks in will depend on the entry function. Might be better than nothing, still?

@martinholters (Member):

A much tougher problem seems to be a bifurcating recursion, e.g. the following nonsensical code:

function foo(y, ::Any, xs...)
    foo((y..., Val{0}()), xs...)
    foo((y..., Val{1}()), xs...)
    foo((y..., Val{2}()), xs...)
    foo((y..., Val{3}()), xs...)
    foo((y..., Val{4}()), xs...)
    foo((y..., Val{5}()), xs...)
    foo((y..., Val{6}()), xs...)
    foo((y..., Val{7}()), xs...)
    # add more for even greater fun
end
foo(y) = y

Here, code_warntype(foo, Tuple{Tuple{}, Vararg{Any,3}}) takes 2.8 seconds on my system, while code_warntype(foo, Tuple{Tuple{}, Vararg{Any,4}}) needs 212 seconds, and I didn't dare try higher values.

I think for cases like this, we would indeed need to track how often a certain method has been inferred (in the current inference run) to figure out that something funny is going on.

@tkoolen (Contributor, Author) commented Sep 24, 2017

For reference, the lines in StaticArrays @Keno was referring to are
https://github.com/JuliaArrays/StaticArrays.jl/blob/bd3236ac39372aa84891bea066c71d63559af29e/src/indexing.jl#L55-L66

(Keno's link was referring to master, which has since changed).

tkoolen added a commit to tkoolen/StaticArrays.jl that referenced this issue May 29, 2018
This set was introduced to test an inference issue (JuliaLang/julia#21244) in an old version of Julia.

It currently fails on Julia master because of JuliaLang/julia#27276.

In my opinion, it's currently not really testing anything valuable, and the Julia PR that fixed the issue has associated test cases that should cover this.