
New deriving that generates O(n) code for PartialOrd, Ord, etc. #15503

Merged (3 commits) on Jul 11, 2014

Conversation

@pnkfelix (Member) commented Jul 7, 2014

Instead of generating a separate case (albeit trivial) for each of the N*N cases when comparing two instances of an enum with N variants, this `deriving` uses the strategy outlined here: #15375 (comment)

In particular, it generates code that looks more like this:

```rust
    match (this, that, ...) {
      (Variant1, Variant1, Variant1) => ... // delegate Matching on Variant1
      (Variant2, Variant2, Variant2) => ... // delegate Matching on Variant2
      ...
      _ => {
        let index_tup = {
          let idx_this = match this { Variant1 => 0u, Variant2 => 1u, ... };
          let idx_that = match that { Variant1 => 0u, Variant2 => 1u, ... };
          ...
          (idx_this, idx_that, ...)
        };
        ... // delegate to catch-all; it can inspect `index_tup` for its logic
      }
    }
```
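
For concreteness, here is a hand-written sketch in modern Rust of that shape, specialized to `partial_cmp` with two self arguments. This is not the exact code the deriving emits; the `Shape` enum and `variant_index` helper are illustrative names only.

```rust
use std::cmp::Ordering;

enum Shape {
    Circle(f64),
    Square(f64),
    Point,
}

// Stand-in for the emitted "map a value to its variant index" match.
fn variant_index(s: &Shape) -> usize {
    match *s {
        Shape::Circle(..) => 0,
        Shape::Square(..) => 1,
        Shape::Point => 2,
    }
}

// Shape of the generated comparison: one arm per variant plus one catch-all,
// rather than one arm per pair of variants.
fn partial_cmp_shape(this: &Shape, that: &Shape) -> Option<Ordering> {
    match (this, that) {
        // Matching variants: delegate to the fields.
        (Shape::Circle(a), Shape::Circle(b)) => a.partial_cmp(b),
        (Shape::Square(a), Shape::Square(b)) => a.partial_cmp(b),
        (Shape::Point, Shape::Point) => Some(Ordering::Equal),
        // Non-matching variants: compare the variant indices.
        _ => {
            let index_tup = (variant_index(this), variant_index(that));
            index_tup.0.partial_cmp(&index_tup.1)
        }
    }
}

fn main() {
    assert_eq!(
        partial_cmp_shape(&Shape::Circle(1.0), &Shape::Square(1.0)),
        Some(Ordering::Less)
    );
}
```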

While adding a new variant to the `const_nonmatching` flag (and renaming it to `on_nonmatching`) so that the above could be expressed, while still allowing one to opt back into the old `O(N^2)` code-generation behavior (in general `O(N^K)`, where `K` is the number of self arguments), I had two realizations:

  1. Observation: Nothing except for the comparison derivings (`PartialOrd`, `Ord`, `PartialEq`, `Eq`) was even using the old `O(N^K)` code generator. So after this hypothetically lands, *nothing* needs to use it, and thus that code-generation strategy could be removed, under the assumption that it is very unlikely that any `deriving` mode will actually need that level of generality.
  2. Observation: The new code generator I am adding can actually be unified with all of the other code generators that just dispatch on the variant tag (they all assume that there is only one self argument).

These two observations mean that one can get rid of the `const_nonmatching` (aka `on_nonmatching`) flag entirely. So I did that too in this PR.

The question is: Do we actually want to follow through on both of the above observations? I'm pretty sure the second observation is a pure win. But there *might* be someone out there with an example that invalidates the reasoning in the first observation. That is, there might be a client out there with an example of a hypothetical deriving mode that wants to opt into the `O(N^K)` behavior. If that is true, then I can revise this PR to resurrect the `on_nonmatching` flag and provide a way to access the `O(N^K)` behavior.

The manner in which I choose to squash these commits during a post-review rebase depends on the answer to the above question.

Fix #15375.

@pnkfelix (Member, Author) commented Jul 7, 2014

r? @huonw

@lilyball (Contributor) commented Jul 7, 2014

I am generally in favor of this kind of rewrite. But more concretely, do you have any information on what practical effect this has? The two metrics of interest are

  1. Does this affect compile-time? I hope it speeds it up. Especially in the pathological 500-variant enum that @chris-morgan had issues with.
  2. Does this affect runtime performance?

@brson (Contributor) commented Jul 7, 2014

Awesome effort, @pnkfelix.

@sfackler (Member) commented Jul 7, 2014

I wouldn't be too concerned about keeping the old O(n^k) code around just in case somebody may potentially want to use it someday. It'll still be in the history and we can resurrect it if needed, or it can get pulled out into a separate library.

@pnkfelix (Member, Author) commented Jul 7, 2014

@kballard I'll address your questions in reverse order. :)

> 2. Does this affect runtime performance?

It might. But only because it hasn't been quite as tightly optimized as it should be.

I was thinking of the particular case of C-like enums on the way home today.

Those cases are covered by this (not yet implemented) optimization: I really should be letting comparisons of two enum values whose variants both have no subparts (e.g. `B` and `D` in `enum E { A(int), B, C(f64), D }`) fall through to the catch-all as well, since the catch-all will have all of the information it needs to give the answers in such cases.

Once such an optimization is in place, then for C-like enums all of the cases would just fall through to the catch-all (i.e. there would be no preceding arms).
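
As a hedged illustration of what that would look like for a fully C-like enum (illustrative names, not code from this PR): every comparison goes straight to the index-based catch-all, with no per-variant arms at all.

```rust
use std::cmp::Ordering;

enum Color { Red, Green, Blue }

// The same "value to discriminant index" mapping the catch-all relies on.
fn index_of(c: &Color) -> usize {
    match *c {
        Color::Red => 0,
        Color::Green => 1,
        Color::Blue => 2,
    }
}

fn cmp_color(this: &Color, that: &Color) -> Ordering {
    // Fieldless variants carry no extra data, so comparing the indices
    // already gives the full answer, including the equal case.
    index_of(this).cmp(&index_of(that))
}

fn main() {
    assert_eq!(cmp_color(&Color::Green, &Color::Blue), Ordering::Less);
    assert_eq!(cmp_color(&Color::Red, &Color::Red), Ordering::Equal);
}
```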


The other piece of the story there would be speeding up the mapping from the value to its discriminant index. Currently I am just emitting that code in place as another match statement, but in principle that could (and should?) be replaced by a compiler intrinsic, as discussed here: #15375 (comment)
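
As an aside about today's standard library (not something this PR adds): `std::mem::discriminant` now exposes this mapping as an opaque value that supports equality but not ordering, so an ordering catch-all would still need a numeric index. A minimal sketch:

```rust
use std::mem::discriminant;

// Illustrative enum, echoing the example above.
#[allow(dead_code)]
enum E { A(i32), B, C(f64), D }

// Equivalent to "index of this == index of that" without writing the
// per-variant match by hand.
fn same_variant(this: &E, that: &E) -> bool {
    discriminant(this) == discriminant(that)
}

fn main() {
    assert!(same_variant(&E::B, &E::B));
    assert!(!same_variant(&E::A(1), &E::C(2.0)));
}
```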


Anyway, I haven't done any benchmarking yet of the generated code. I will try to whip some up in parallel with the rest of the review.

As for the compile times and code size measures, I have informally benchmarked that; I'll post those results in a follow-up comment.

@pnkfelix (Member, Author) commented Jul 7, 2014

@kballard okay here is a gist with some semi-formal benchmark results.

https://gist.github.com/pnkfelix/3dcc567222d0906f1c42

The gist has both the program I used to generate the test data and the benchmarking driver script, but the important things are the two "files" at the bottom, which are transcripts comparing my stage0 rustc behavior with my stage1 rustc behavior.

The main points to note are:

  • Regarding expanded code size (which is one thing that affects overall compile times): stage0 on line 28 shows that for an enum with 150 variants, the generated source code size is 2.6 megabytes. Compare that with stage1 on line 28, where the same 150 variant enum has a generated source code size of 148K.
  • Regarding compile times themselves, stage0 on line 55 shows that for that enum of 150 variants, compiling to object code took 11 seconds. The analogous result for stage1 on line 55 is 0.55 seconds.

Since we're talking about quadratic code-size generation here, the problem noted above only gets worse as the number of variants in the enum grows. E.g. for a 200-variant enum, the stage0 expanded code size I observe for this generator program is 4.6M, versus 200K for stage1.
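
A back-of-envelope arm count makes that scaling concrete (assuming `K = 2` self arguments, as for `PartialOrd`); the figures below are purely illustrative, not measurements from the gist:

```rust
fn main() {
    for n in [150u64, 200] {
        let old_arms = n * n; // one arm per (variant, variant) pair
        let new_arms = n + 1; // one arm per variant, plus a single catch-all
        println!("{n} variants: {old_arms} arms (old) vs {new_arms} arms (new)");
    }
}
```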

(There is also a difference in the code size of the final output binary, which you can see in the gist as well, but I think the number of enum variants has to grow quite large, e.g. > 200 variants, before those numbers get as drastically bad as what I'm reporting above for the expanded code size, presumably because much of the time we spend in compilation is LLVM getting the generated code back into something semi-reasonable. Still, the generated code for the new deriving is smaller when the number of variants is large, and I believe that with the optimizations I outlined in my previous comment, it should compete favorably with our current deriving even for enums with small numbers of variants.)

@huonw (Member) commented Jul 7, 2014

Those numbers look good @pnkfelix! (I will review later today.)

@lilyball (Contributor) commented Jul 7, 2014

@pnkfelix Awesome!

@huonw (Member) commented on the diff, quoting this doc-comment context:

> first tuple, for other1 are in the second tuple, etc.)
> non-matching variants of the enum, but with all state hidden from
> the consequent code. The first component holds Idents for all of
> the self arguments; the second component is a slice of all of the

idents of the Self arguments, right? (i.e. the Self type, not just the self value)

@pnkfelix (Member, Author) replied:

The comment is indeed meant to refer to all arguments of `Self` type, not just the `self` value.

I think I used lowercase self in some places because the pre-existing file sometimes uses lowercase self, e.g. in phrases like "nonself args" that refer to all arguments of non-`Self` type.

But I agree uppercase `Self` is clearer. I will at least update my own new/changed comments to use `Self`, and see if I can rephrase any other occurrences of self args as well.

@huonw (Member) commented Jul 8, 2014

> That is, there might be a client out there with an example of a hypothetical deriving mode that wants to opt into the `O(N^K)` behavior.

Thinking about it now, I don't think any generic deriving-like mode can do anything meaningful with the contained data when the variants don't match: there's rarely a non-trivial relationship between the fields, e.g. types will only match by chance or in rare cases. So I'm 100% happy with the old code disappearing; the new code is so much nicer. (Far less `O(N^K)` recursion, while retaining the entirety of the generally useful information. ❤️)

@pnkfelix (Member, Author) commented:
I still want to try to adopt this refactoring suggested by huon:

> (Theoretically, I guess those two `Vec`s could be combined into one as a `Vec<(Ident, Ident)>` (or with a struct with informative field names, heh), since these tag-idents correspond directly to each Self arg ...)

After I do that then I'll have a go at pushing.
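
A minimal sketch of the suggested shape, using plain `String`s in place of the compiler's `Ident` type; the struct and function names here are hypothetical:

```rust
// Two parallel Vecs collapsed into one Vec of a small named-field struct,
// so each Self argument travels together with the ident of its variant index.
struct TaggedSelfArg {
    self_arg: String,  // stands in for the Ident of a Self argument
    tag_ident: String, // stands in for the Ident bound to its variant index
}

fn combine(self_args: Vec<String>, tag_idents: Vec<String>) -> Vec<TaggedSelfArg> {
    self_args
        .into_iter()
        .zip(tag_idents)
        .map(|(self_arg, tag_ident)| TaggedSelfArg { self_arg, tag_ident })
        .collect()
}

fn main() {
    let combined = combine(
        vec!["this".to_string(), "that".to_string()],
        vec!["idx_this".to_string(), "idx_that".to_string()],
    );
    assert_eq!(combined.len(), 2);
    assert_eq!(combined[1].self_arg, "that");
    assert_eq!(combined[0].tag_ident, "idx_this");
}
```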

@pnkfelix (Member, Author) commented:
Okay the refactoring @huonw suggested is probably doable, but doing it well (i.e. not just a local transformation, but rather propagating its good effects outward to the client code) requires wider revision to the code than I want to take on right now. So I am just going to rebase atop master and push.

…ent.

In particular, I want authors of deriving modes to understand what
they are opting into (namely quadratic code size or worse) when they
select NonMatchesExplode.
In the above formulas, `n` is the number of variants, and `k` is the
number of self-args fed into deriving.  In the particular case of
interest (namely `PartialOrd` and `Ord`), `k` is always 2, so we are
basically comparing `O(n)` versus `O(n^2)`.

Also, the stage is set for having *all* enum deriving codes go through
`build_enum_match_tuple` and getting rid of `build_enum_match`.

Also, seriously attempted to clean up the code itself.  Added a bunch
of comments attempting to document what I learned as I worked through
the original code and adapted it to this new strategy.
Remove the `NonMatchesExplode` variant now that no deriving impl uses it.
Removed `EnumNonMatching` entirely.
Remove now irrelevant `on_matching` field and `HandleNonMatchingEnums` type.
Removed unused `EnumNonMatchFunc` type def.

Drive-by: revise `EnumNonMatchCollapsedFunc` doc.

Made all calls to `expand_enum_method_body` go directly to
`build_enum_match_tuple`.

Alpha-rename `enum_nonmatch_g` back to `enum_nonmatch_f` to reduce overall diff noise.
Inline sole call of `some_ordering_const`.
Inline sole call of `ordering_const`.

Removed a bunch of code that became dead after the above changes.
bors added a commit that referenced this pull request Jul 11, 2014
…r=huonw

@bors bors closed this Jul 11, 2014
@bors bors merged commit 5cee578 into rust-lang:master Jul 11, 2014
Successfully merging this pull request may close these issues.

#[deriving(PartialOrd)] is O(N^2) code size for N enum variants