Instrumentation markers for logging inference? #47902

timholy · 2022-12-14T22:24:51Z

TL;DR

The drive to minimize TTFX may benefit from the ability to send "instrumenting commands" to inference directly from Julia code.

For those who may remember a discussion (somehow I'm not finding it...) about the order in which compilation and timer instrumentation occur in @time, this has the same flavor.

Background

With native code caching (#47184), SnoopPrecompile, and the ability to fix invalidations, in principle we should often be able to drive "excess" TTFX (that beyond the workload itself) nearly to zero. So far, we're only rarely so successful: for example, #47889 seems to eliminate all invalidations that affect the demonstrated workloads, and yet the TTFX is more than a second for each. (It's milliseconds or less on the second execution of the workload.) While one can be happy about the gains we have, I was initially puzzled about why it's not at least an order of magnitude better.

Analysis of causes

From what I can tell, the answer appears to be a subtle interaction between the interpreter, scope, and tracking of inference. Here's the basic idea: currently SnoopPrecompile attempts to take a precompilation directive like this (a lightly modified version of CSV.jl's actual precompile file):

using SnoopPrecompile

@precompile_setup begin
    # Putting some things in `setup` can reduce the size of the
    # precompile file and potentially make loading faster.
    PRECOMPILE_DATA = "int,float,date,datetime,bool,null,str,catg,int_float\n1,3.14,2019-01-01,2019-01-01T01:02:03,true,,hey,abc,2\n2,NaN,2019-01-02,2019-01-03T01:02:03,false,,there,abc,3.14\n"
    @precompile_all_calls begin
        # all calls in this block will be precompiled, regardless of whether
        # they belong to your package or not (on Julia 1.8 and higher)
        CSV.File(IOBuffer(PRECOMPILE_DATA))
        CSV.File(Vector{UInt8}(PRECOMPILE_DATA))
        CSV.File(joinpath(pkgdir(CSV), "test", "testfiles", "precompile.csv"))
    end
end

and turn it into something like this:

if ccall(:jl_generating_output, Cint, ()) == 1       # run this only if precompiling
    let
        # USER SETUP CODE START
        PRECOMPILE_DATA = "int,float,date,datetime,bool,null,str,catg,int_float\n1,3.14,2019-01-01,2019-01-01T01:02:03,true,,hey,abc,2\n2,NaN,2019-01-02,2019-01-03T01:02:03,false,,there,abc,3.14\n"
        # USER SETUP CODE STOP
        Core.Compiler.Timings.reset_timings()       # this + next line start tracking everything that gets inferred
        Core.Compiler.__set_measure_typeinf(true)
        try
            # USER WORKLOAD START
            Base.Experimental.@force_compile        # this block *must* be passed to inference, the interpreter is not allowed to run it
            CSV.File(IOBuffer(PRECOMPILE_DATA))
            CSV.File(Vector{UInt8}(PRECOMPILE_DATA))
            CSV.File(joinpath(pkgdir(CSV), "test", "testfiles", "precompile.csv"))
            # USER WORKLOAD STOP
        finally
            Core.Compiler.__set_measure_typeinf(false)     # stop the instrumentation
            Core.Compiler.Timings.close_current_timer()
        end
        SnoopPrecompile.precompile_roots(Core.Compiler.Timings._timings[1].children)   # collect the list of things we inferred
    end
end

Seems sensible, right? But the problem is this: the @force_compile seems to end up forcing compilation of the entire contents of the let block, and that means some of the items get inferred before we turn on inference logging. That's OK for methods owned by the package, but if not it results in omission from the pkgimage. For this example, one of ~16 things that don't get cached is pkgdir(::Module); of course you can argue that I should have put that in the setup code, and that its omission is actually a good thing (I wouldn't disagree). But the more important point is that @precompile_all_calls is supposed to guarantee caching of everything inside, and it's definitely not living up to that promise; in some cases that hurts TTFX in undesirable ways.

Possible solutions

Separate toplevel expressions

If we didn't have to worry about setup code and hiding temporary variables inside the let, we could just have top-level expressions like

start_logging()
try
    @force_compile
    # user workload here
finally
    stop_logging()
end

But I think the ability to not pollute the module namespace and to have the option of not force-caching setup code is pretty desirable.

Use of `@eval`

Perhaps the next easiest is to turn the user workload into

    @eval begin
            Base.Experimental.@force_compile        # this block *must* be passed to inference, the interpreter is not allowed to run it
            CSV.File(IOBuffer($PRECOMPILE_DATA))
            CSV.File(Vector{UInt8}($PRECOMPILE_DATA))
            CSV.File(joinpath(pkgdir(CSV), "test", "testfiles", "precompile.csv"))
      end

but the need to escape the PRECOMPILE_DATA is annoying (presumably, package devs would have to do that manually, and it seems error-prone).

Instrumenting inference directly from user code

A final alternative is for us to have some special expressions, kind of like :meta works now, that essentially implements this logic within inference itself: that is, we basically send an expression to inference that toggles logging. In other words, it might translate into something like

using SnoopPrecompile

if ccall(:jl_generating_output, Cint, ()) == 1
    let
        PRECOMPILE_DATA = "int,float,date,datetime,bool,null,str,catg,int_float\n1,3.14,2019-01-01,2019-01-01T01:02:03,true,,hey,abc,2\n2,NaN,2019-01-02,2019-01-03T01:02:03,false,,there,abc,3.14\n"
        $(Expr(:meta, :track_inference, quote # this implies @force_compile for the block inside, but *not* for the containing `let` as a whole
            CSV.File(IOBuffer(PRECOMPILE_DATA))
            CSV.File(Vector{UInt8}(PRECOMPILE_DATA))
            CSV.File(joinpath(pkgdir(CSV), "test", "testfiles", "precompile.csv"))
        end)
        SnoopPrecompile.precompile_roots(Core.Compiler.Timings._timings[1].children)
    end
end

and Expr(:meta, :track_inference, expr_workload) is just a marker that gets processed during inference and triggers it to start logging before inferring expr_workload.

The text was updated successfully, but these errors were encountered:

Keno · 2022-12-15T00:17:06Z

Use of @eval

Use a closure with an invokelatest barrier? That seems to work around the primary scoping drawback here.

timholy · 2022-12-15T10:25:19Z

Great suggestion, thanks!

timholy · 2022-12-15T13:02:42Z

Hmm, turns out it had one issue: JuliaRegistries/General#74176 (can't define new methods or types inside the thunk).

timholy · 2022-12-15T14:14:33Z

It occurs that there's one more thing that favors the @eval solution: anyone who wants to precompile code executed by macros. Of course that only affects precompilation times, not TTFX, but as long as reducing the first doesn't hinder the second, it seems like a worthy goal.

timholy added compiler:inference Type inference compiler:latency Compiler latency julep Julia Enhancement Proposal labels Dec 14, 2022

timholy closed this as completed in timholy/SnoopCompile.jl@d11c578 Dec 15, 2022

timholy reopened this Dec 15, 2022

This was referenced Dec 15, 2022

latest version of SnoopPrecompile breaks RecipesBase timholy/SnoopCompile.jl#317

Closed

revert broken commit - add test timholy/SnoopCompile.jl#318

Merged

timholy mentioned this issue Dec 21, 2022

Re-inference where there shouldn't be #47960

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Instrumentation markers for logging inference? #47902

Instrumentation markers for logging inference? #47902

timholy commented Dec 14, 2022

Keno commented Dec 15, 2022

Use of `@eval`

timholy commented Dec 15, 2022

timholy commented Dec 15, 2022 •

edited

Loading

timholy commented Dec 15, 2022

Instrumentation markers for logging inference? #47902

Instrumentation markers for logging inference? #47902

Comments

timholy commented Dec 14, 2022

TL;DR

Background

Analysis of causes

Possible solutions

Separate toplevel expressions

Use of @eval

Instrumenting inference directly from user code

Keno commented Dec 15, 2022

Use of @eval

timholy commented Dec 15, 2022

timholy commented Dec 15, 2022 • edited Loading

timholy commented Dec 15, 2022

Use of `@eval`

Use of `@eval`

timholy commented Dec 15, 2022 •

edited

Loading