-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New deriving
that generates O(n) code for PartialOrd, Ord, etc.
#15503
New deriving
that generates O(n) code for PartialOrd, Ord, etc.
#15503
Conversation
r? @huonw |
I am generally in favor of this kind of rewrite. But more concretely, do you have any information on what practical effect this has? The two metrics of interest are
|
Awesome effort, @pnkfelix. |
I wouldn't be too concerned about keeping the old |
@kballard I'll address your questions in reverse order. :)
It might. But only because it hasn't been quite as tightly optimized as it should be. I was thinking of the particular case of C-like enums on the way home today. Those cases are covered by this (not yet implemented) optimization: I really should be letting comparisons of two enums that both have no subparts (e.g. Once such an optimization is in place, then for C-like enums all of the cases would just fall through to the catch-all (i.e. there would be no preceding arms). The other piece of the story there would be speeding up the mapping from the value to its discriminant index. Currently I am just emitting that code in place as another match statement, but in principle that could (and should?) be replaced by a compiler intrinsic, as discussed here: #15375 (comment) Anyway, I haven't done any benchmarking yet of the generated code. I will try to whip some up in parallel with the rest of the review. As for the compile times and code size measures, I have informally benchmarked that; I'll post those results in a follow-up comment. |
@kballard okay here is a gist with some semi-formal benchmark results. https://gist.github.com/pnkfelix/3dcc567222d0906f1c42 The gist has both the program I used to generate the test data and the benchmarking driver script, but the important thing are the two "files" at the bottom, which are transcripts comparing my stage0 rustc behavior with my stage1 rustc behavior. The main points to note are:
Since we're talking about quadratic code size generation here, the problem note above only gets worse and worse as the number of variants in the enum grows. E.g for a 200 variant enum, the stage0 expanded code size I observe for this generator program is 4.6M, versus 200K for stage1. (There is also a difference in the code size of the final output binary, which you can see at the gist as well, but I think the number of enum variants has to grow quite large, e.g. > 200 variants, before those get so drastically bad as what I'm reporting above for the expanded code size, presumably because much of the time that we are spending in compilation is LLVM getting the generated code back into something semi-reasonable. Still, the generated code for the new deriving is smaller when the number of variants is large, and I believe with the optimizations I outlined in my previous comment, it should compete favorably with our current deriving system even for enums with small numbers of variants.) |
Those numbers look good @pnkfelix! (I will review later today.) |
@pnkfelix Awesome! |
first tuple, for other1 are in the second tuple, etc.) | ||
non-matching variants of the enum, but with all state hidden from | ||
the consequent code. The first component holds Idents for all of | ||
the self arguments; the second component is a slice of all of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idents of the Self
arguments, right? (i.e. the Self type, not just the self value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is indeed meant to refer to all arguments of Self type, not just the self
value.
I think I used lowercase self
in some places because the pre-existing file sometimes used lowercase self, e.g. phrases like "nonself args" that are referring to all arguments of non Self type.
But I agree uppercase Self
is clearer. I will at least update my own new/changed comments to use Self
, and see if I can rephrase any other occurrences of self args
as well.
Thinking about it now, I don't think any generic |
I still want to try to adopt this refactoring suggested by huon:
After I do that then I'll have a go at pushing. |
Okay the refactoring @huonw suggested is probably doable, but doing it well (i.e. not just a local transformation, but rather propagating its good effects outward to the client code) requires wider revision to the code than I want to take on right now. So I am just going to rebase atop master and push. |
…ent. In particular, I want authors of deriving modes to understand what they are opting into (namely quadratic code size or worse) when they select NonMatchesExplode.
In the above formulas, `n` is the number of variants, and `k` is the number of self-args fed into deriving. In the particular case of interest (namely `PartialOrd` and `Ord`), `k` is always 2, so we are basically comparing `O(n)` versus `O(n^2)`. Also, the stage is set for having *all* enum deriving codes go through `build_enum_match_tuple` and getting rid of `build_enum_match`. Also, seriously attempted to clean up the code itself. Added a bunch of comments attempting to document what I learned as I worked through the original code and adapted it to this new strategy.
Remove the `NonMatchesExplode` variant now that no deriving impl uses it. Removed `EnumNonMatching` entirely. Remove now irrelevant `on_matching` field and `HandleNonMatchingEnums` type. Removed unused `EnumNonMatchFunc` type def. Drive-by: revise `EnumNonMatchCollapsedFunc` doc. Made all calls to `expand_enum_method_body` go directly to `build_enum_match_tuple`. Alpha-rename `enum_nonmatch_g` back to `enum_nonmatch_f` to reduce overall diff noise. Inline sole call of `some_ordering_const`. Inline sole call of `ordering_const`. Removed a bunch of code that became dead after the above changes.
…r=huonw Instead of generating a separate case (albeit trivial) for each of the N*N cases when comparing two instances of an enum with N variants, this `deriving` uses the strategy outlined here: #15375 (comment) In particular, it generates code that looks more like this: ```rust match (this, that, ...) { (Variant1, Variant1, Variant1) => ... // delegate Matching on Variant1 (Variant2, Variant2, Variant2) => ... // delegate Matching on Variant2 ... _ => { let index_tup = { let idx_this = match this { Variant1 => 0u, Variant2 => 1u, ... }; let idx_that = match that { Variant1 => 0u, Variant2 => 1u, ... }; ... (idx_this, idx_that, ...) }; ... // delegate to catch-all; it can inspect `index_tup` for its logic } } ``` While adding a new variant to the `const_nonmatching` flag (and renaming it to `on_nonmatching`) to allow expressing the above (while still allowing one to opt back into the old `O(N^2)` and in general `O(N^K)` (where `K` is the number of self arguments) code generation behavior), I had two realizations: 1. Observation: Nothing except for the comparison derivings (`PartialOrd`, `Ord`, `PartialEq`, `Eq`) were even using the old `O(N^K)` code generator. So after this hypothetically lands, *nothing* needs to use them, and thus that code generation strategy could be removed, under the assumption that it is very unlikely that any `deriving` mode will actually need that level of generality. 2. Observation: The new code generator I am adding can actually be unified with all of the other code generators that just dispatch on the variant tag (they all assume that there is only one self argument). These two observations mean that one can get rid of the `const_nonmatching` (aka `on_nonmatching`) entirely. So I did that too in this PR. The question is: Do we actually want to follow through on both of the above observations? I'm pretty sure the second observation is a pure win. But there *might* be someone out there with an example that invalidates the reasoning in the first observation. That is, there might be a client out there with an example of hypothetical deriving mode that wants to opt into the `O(N^K)` behavior. So, if that is true, then I can revise this PR to resurrect the `on_nonmatching` flag and provide a way to access the `O(N^K)` behavior. The manner in which I choose to squash these commits during a post-review rebase depends on the answer to the above question. Fix #15375.
SCIP: Report the correct version of rust-analyzer in the metadata Previously this was hard coded to "0.1". The SCIP protocol allows this to be an arbitrary string: ``` message ToolInfo { // Name of the indexer that produced this index. string name = 1; // Version of the indexer that produced this index. string version = 2; // Command-line arguments that were used to invoke this indexer. repeated string arguments = 3; } ``` so use the same string reported by `rust-analyzer --version`.
Instead of generating a separate case (albeit trivial) for each of the N*N cases when comparing two instances of an enum with N variants, this
deriving
uses the strategy outlined here: #15375 (comment)In particular, it generates code that looks more like this:
While adding a new variant to the
const_nonmatching
flag (and renaming it toon_nonmatching
) to allow expressing the above (while still allowing one to opt back into the oldO(N^2)
and in generalO(N^K)
(whereK
is the number of self arguments) code generation behavior), I had two realizations:PartialOrd
,Ord
,PartialEq
,Eq
) were even using the oldO(N^K)
code generator. So after this hypothetically lands, nothing needs to use them, and thus that code generation strategy could be removed, under the assumption that it is very unlikely that anyderiving
mode will actually need that level of generality.These two observations mean that one can get rid of the
const_nonmatching
(akaon_nonmatching
) entirely. So I did that too in this PR.The question is: Do we actually want to follow through on both of the above observations? I'm pretty sure the second observation is a pure win. But there might be someone out there with an example that invalidates the reasoning in the first observation. That is, there might be a client out there with an example of hypothetical deriving mode that wants to opt into the
O(N^K)
behavior. So, if that is true, then I can revise this PR to resurrect theon_nonmatching
flag and provide a way to access theO(N^K)
behavior.The manner in which I choose to squash these commits during a post-review rebase depends on the answer to the above question.
Fix #15375.