-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Optimization] Cache contract generation and pre-compile some match expression #2013
Conversation
After the introduction of a proper node for types in the AST, we switched to a lazy way of generating contract: we keep the type as it is, and convert it to a contract only once it's actually applied. While simpler, this approach has the drawback of potentially wasting computations by running the contract generation code many times for the same contract, as some metrics showed (up to 1.6 thousand times on large codebases). This commit add a `contract` field to `Term::Type`, and statically generates - in the parser - the contract corresponding to a type. This is what we used to do before the introduction of a type node. Doing so, we only generate at most one contract per user annotation, regardless of how many times the contract is applied at runtime.
This commit saves some re-compilation of the match expression generated by contracts converted from enum types by generating the compiled version directly instead of a match.
9345420
to
9191bd6
Compare
What are the estimate performance gains from this PR? |
As a disclaimer, the improvement is hard to measure properly with the micro benchmarks (aka it doesn't show), in particular because we don't really benchmark using the same contract again and again (which does happen in real life as measurements show, probably because of merging, array contracts, etc.), and I feel a micro benchmark on that would be rather artificial. The only example where it can show measurable effects is a 120kLoC real code base, which takes around 20sec to evaluate, which starts to be long running enough to see some differences. I see a reduction of around 2% on several runs in user time (but system time, which is the predominant cost, is varying quite a bit, making the final result potentially less, at least on my very volatile laptop). Could be interesting to ask the original users, which are moreover using OsX, if they see something different in their setup. What motivated this PR originally is:
I intend to do a larger rework of the AST after the next (imminent) release of 1.8 anyway, so this change might not be very impactful on the long term. It's more of a "back to 1.7 perf" band-aid. However it's not hugely impactful even now, so I'm fine ditching it if others think it's not worth it. |
I'm going to merge this for 1.8, if that's fine by everyone. It's not fundamentally very difficult to revert, and I plan to rework quite a bit the AST and introduce multiple IRs for 1.9 anyway, so this change might become obsolete. |
This PR is guided by the metrics gathered for some run of Nickel on large codebases, which showed that some contracts were generated again and again, while coming from one initial annotation. While it's expected that an annotation might give raise to many runtime evaluations (a contract in the body of a function, typically), it's wasteful that we generate the code at runtime many times (contract generation is a form of compilation from a static type to a Nickel expression, and it shouldn't happen all the time during execution).
The issue is that we now store types directly in the AST and consider them weak head normal form. So, each time we apply such a contract,
contract/apply
get atyp::Type
, and need to make a contract out of it: generation happens again and again. This PR adds a new field toTerm::Type
,contract: RichTerm
, which is populated at parsing time with the contract generated from the type (which is close to the behavior before the introduction ofTerm::Type
). Doing so, we should only generate one contract for each user annotation in the code.Metrics indeed show that in some big examples, the worst case was generating ~1700 times the same contract, which is brought down to one. We also tried to closurize the generated contract, so that it doesn't have to be evaluated several time, but this showed no improvement whatsoever, while it increases the number of allocated thunks, so it's been reverted.
Additionally, in the same spirit of "generating more statically, and don't do it again at runtime", we precompile the match expression generated when converting an enum contract, so that it's not recompiled again and again each time the contract is applied. This showed a reduction of around 25 to 30% of the total number of compilation of match expressions on the private benchmark.
The concrete time again aren't spectacular, as it's saving mostly Rust function calls, and not Nickel, the latter being of course the dominant cost, but there is still a small improvement.