Use SnoopPrecompile
#2441
Conversation
Try this Pull Request! Open Julia and type:

```julia
julia> import Pkg

julia> Pkg.activate(temp=true)

julia> Pkg.add(url="https://github.com/fonsp/Pluto.jl", rev="rh/snoopprecompile")

julia> using Pluto
```
src/precompile.jl
Outdated
```julia
)
end
expr = Expr(:toplevel, :(1 + 1))
Pluto.PlutoRunner.run_expression(__Foo, expr, __TEST_NOTEBOOK_ID, uuid1(), nothing);
```
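For context, the dummy cell above can be exercised without any Pluto machinery. A minimal sketch of evaluating such a `:toplevel` expression (using only Base, not Pluto's actual evaluation pipeline):

```julia
# A :toplevel Expr wraps one or more top-level expressions; evaluating it
# runs each child in order and returns the value of the last one. This
# mirrors the dummy `1 + 1` cell used in the precompile workload above.
expr = Expr(:toplevel, :(1 + 1))
result = Core.eval(Main, expr)
# result == 2
```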
It would be nice if this precompilation could benefit notebook processes too (it is only called for simple markdown cells on the Pluto process), maybe with #1881 ?
I just tried to add the following to src/precompile.jl:

```julia
session = Pluto.ServerSession()
session.options.evaluation.workspace_use_distributed = false
basic = joinpath(pkgdir(Pluto), "sample", "Basic.jl")
nb = load_notebook(basic)
Pluto.WorkspaceManager.make_workspace((session, nb))
```
This grinds my PC to a halt. Even Firefox becomes unresponsive, even though the compilation occurs on only one thread. I suspect that the calls to `Distributed`, or some of the other async calls, are the problem.
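If the `Distributed` machinery is indeed the culprit, one option (a sketch of a common pattern, not something this PR implements) is to detect precompilation and stay single-process; `jl_generating_output` is the check packages commonly use for this:

```julia
# Returns true while Julia is building a package image (i.e. during
# precompilation), and false during a normal interactive session.
is_precompiling() = ccall(:jl_generating_output, Cint, ()) == 1

# Hypothetical guard: pick a single-process workspace while precompiling
# so that no Distributed workers are ever spawned at image-build time.
workspace_mode() = is_precompiling() ? :single_process : :distributed
```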
The benchmark after 86de0d4 looks as follows:

It looks like reducing the time for …
Thanks @rikhuijzer! Awesome, this is giving a 2x speedup in loading times? 😮 About the balance: in the past, we always tried to do as much loading as possible during …. How much compilation time can be moved from …?
The most aggressive compilation is by using …
So, compared to before this PR, this reduces the compilation by about (1.54 GiB - 1.44 GiB) / 1.54 GiB ≈ 6%. I'm using allocations as an estimate here because the running time is affected by how quickly the packages are updated. Maybe this is good enough for now? If we really want to reduce compilation, we should go to …
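As a side note, the allocations-as-a-proxy-for-compilation idea is easy to reproduce on any freshly defined method with `@allocated`; a small illustrative sketch (the `workload` function is hypothetical, not Pluto's actual benchmark):

```julia
# The first call of a fresh method includes compilation, which shows up
# as extra allocations; the second call measures only the work itself.
workload(x) = sum(abs2, x)          # hypothetical stand-in workload

xs = [1.0, 2.0, 3.0]
first_call  = @allocated workload(xs)
second_call = @allocated workload(xs)
# first_call is much larger than second_call because of compilation.
```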
src/precompile.jl
Outdated
```julia
basic = joinpath(pkgdir(Pluto), "sample", "Basic.jl")
nb = load_notebook(basic)

# Compiling PlutoRunner is very beneficial because it saves time in each notebook.
```
But PlutoRunner is used by the notebook process, not the server process. Are we still getting a benefit?
Maybe we need #1881 ?
> But PlutoRunner is used by the notebook process, not the server process. Are we still getting a benefit?

Ah! Because `Pluto` is not loaded in the notebook process, so the precompilation cache is also not loaded? Yes, maybe we need #1881 then indeed.
Yes, it looks like precompiling `PlutoRunner` has no effect currently. This is in the server process:

```julia
julia> using Pluto

julia> @time @eval Pluto.PlutoRunner.show_richest(IOBuffer(), (; n=1));
  0.009845 seconds (2.36 k allocations: 139.334 KiB, 85.56% compilation time)
```

and then, after starting Pluto, running the following in a new notebook:

```julia
PlutoRunner.show_richest(IOBuffer(), (; n=1))
```

takes more than 1 second on the first run.
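This first-call cost can be demonstrated with plain Base code too; a minimal sketch (`fresh_show` is a hypothetical stand-in, not Pluto's `show_richest`):

```julia
# The first call of a freshly defined method pays for compilation; the
# second call runs already-compiled code, so it is much cheaper.
fresh_show(io, x) = print(io, x)

io = IOBuffer()
t_first  = @elapsed fresh_show(io, (; n = 1))
t_second = @elapsed fresh_show(io, (; n = 1))
# t_first is dominated by compilation time; t_second is just the call.
```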
Awesome! I'm calling with Rik about compilation time this afternoon at 3pm CET if anyone wants to join!
I made a notebook that shows how to trigger lots of Pluto's functionality without starting a process or opening a socket. https://htmlview.glitch.me/?https://gist.github.com/fonsp/e005fa2c11d9dc92d21bfbc6f3780394
Very nice!! Really nice how you avoided some IO while still hitting lots of code. I've added the suggestions in 98ec719 and ran the benchmark with Julia 1.9.0-beta3. First, another run based on …
Then, with the stuff that I wrote (commit 13de63f):
and then including what you suggested (commit 98ec719):
Generally, it looks like the switch from 1.9.0-beta2 to 1.9.0-beta3 reduced the …. I've also benchmarked Julia 1.8.5. This is on …
and this is this PR (commit 98ec719):
I didn't test 1.6 because, well, if you use that you will be waiting on Julia anyway. Overall, it looks like this PR is a slight improvement thanks to …
Hmmmm, I wanted to measure how long precompilation takes, and I found out that this takes 4x longer on 1.9 😭 roughly 40 seconds vs 10 seconds. Pluto …
I removed the …

Julia 1.8.3, no precompile help at all:

Julia 1.9.0-beta3, no precompile help at all:
Should we open an issue at Julia? Before 1.9 gets released 😬
I just tested merging #1881 into this PR, but it has no effect on these benchmarks. :(
That makes sense. These benchmarks do not test how responsive the notebook process (…) is.
Exactly, I think we should add some things to the benchmarks.
Now that caching of LLVM code is available in Julia 1.9, `PrecompileSignatures.jl` is likely not the most useful anymore. It's better to run code in a `SnoopPrecompile.@precompile_all_calls` block, which is what this PR does.

This used to be the compilation time when running the tests with `ENV["PLUTO_TEST_ONLY_COMPILETIMES"] = true` on Julia 1.9.0-beta2:

After removing `PrecompileSignatures.jl` and adding `SnoopPrecompile.@precompile_all_calls` (this PR), the benchmark shows:

From this we can conclude that `import Pluto` is way faster for some reason and `SessionActions.open` is way slower. Notice especially that `PlutoRunner.run_expression` is way faster. So the potential is there if only we can avoid side effects. Overall, `SnoopPrecompile` should make things faster. `SnoopPrecompile` is fundamentally better than `PrecompileSignatures` because it ensures that all functions in the call can be inferred and compiled, whereas the inference in `PrecompileSignatures` will give up at some point, which, in turn, causes not everything to be compiled.

In general, the TTFX can probably be reduced further by finding more things to call during the `SnoopPrecompile.@precompile_all_calls` phase. Finding these things is very tricky because it requires calling exactly the right code: code which (1) compiles as much as possible and (2) doesn't cause unwanted side effects such as network requests or file writes. I've tried to run a full server inside the precompilation phase, but that caused a SIGKILL on my 16 GB RAM machine, so we probably have to be very careful with that.
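To illustrate the difference between the two approaches, here is a minimal sketch using only Base's `precompile`; the `render` function is hypothetical, and `@precompile_all_calls` does considerably more on top of the plain call shown here:

```julia
render(x::Int) = "rendered: " * string(x)

# Signature-based (the PrecompileSignatures style): request compilation of
# one explicit signature; callees can still be missed if inference gives up.
precompile(render, (Int,))

# Workload-based (the SnoopPrecompile style): actually run representative
# code, so everything reachable from the call gets inferred and compiled.
render(1)
```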