Inlined symbols #74554
I was getting some slightly odd measurements for local runs, so let's see what CI says. @bors try @rust-timer queue
Awaiting bors try build completion |
⌛ Trying commit f1c885b0d6ee42b0a966bcf7702f04a062751f75 with merge 0d5bd95f53618d3e3f1de22edfc7a99bc144ccff... |
src/librustc_span/symbol.rs
Outdated
let n = if len == 4 && s[3] != 0 && s[3] < 0x80 {
    s[0] as u32 | ((s[1] as u32) << 8) | ((s[2] as u32) << 16) | ((s[3] as u32) << 24)
} else if len == 3 && s[2] != 0 {
    s[0] as u32 | ((s[1] as u32) << 8) | ((s[2] as u32) << 16)
} else if len == 2 && s[1] != 0 {
    s[0] as u32 | ((s[1] as u32) << 8)
} else if len == 1 && s[0] != 0 {
    s[0] as u32
} else if len == 0 {
    0u32
} else {
    return None;
};
Rewriting this as a `match` would be nicer.
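A minimal sketch of what such a `match` could look like, using slice patterns. The function name `encode_inline` and its signature are hypothetical and not taken from the PR; the conditions mirror the `if`/`else` chain quoted above, assuming `s` holds the symbol's bytes.

```rust
/// Hypothetical helper: returns the inline encoding of `s`, or `None` if the
/// string has to go through the interner's table instead.
fn encode_inline(s: &[u8]) -> Option<u32> {
    let n = match *s {
        // The empty string encodes as zero.
        [] => 0u32,
        // The last byte must be non-zero so the length can be recovered later.
        [a] if a != 0 => a as u32,
        [a, b] if b != 0 => (a as u32) | ((b as u32) << 8),
        [a, b, c] if c != 0 => (a as u32) | ((b as u32) << 8) | ((c as u32) << 16),
        // For four bytes the last one must also be below 0x80, which keeps
        // the top bit of the resulting u32 clear.
        [a, b, c, d] if d != 0 && d < 0x80 => {
            (a as u32) | ((b as u32) << 8) | ((c as u32) << 16) | ((d as u32) << 24)
        }
        _ => return None,
    };
    Some(n)
}
```

The guards keep each arm's precondition next to its encoding, and the catch-all arm covers both strings that are too long and short strings that fail a guard.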
☀️ Try build successful - checks-actions, checks-azure |
Queued 0d5bd95f53618d3e3f1de22edfc7a99bc144ccff with parent 05630b0, future comparison URL. |
With inlined storage, does it potentially make sense to try making this value a |
Finished benchmarking try commit (0d5bd95f53618d3e3f1de22edfc7a99bc144ccff): comparison url. |
Going from |
The performance results look decent but the biggest improvements all accrue to the shortest-running benchmarks, which are mostly artificial. By the time we get to real programs, such as @rust-lang/wg-compiler-performance: any thoughts about whether this perf improvement is worth the additional complexity? (The big new comment at the top of |
Another possibility is to try packing more into the u32. If we constrain ourselves to something like That's a much more complex encoding, and playing with alternatives (e.g., including upper case letters and using 6 bits still lets us pack 5 characters) may be interesting too.

I suspect though that any scheme like this is unlikely to yield significant performance wins; it's only really useful during parsing when we're actively interning lots of strings, right? In that case it might be worth trying to eliminate the lock (or take it and stash the guard into e.g. ParseSess or something, potentially).

Do you have a table of common length strings? Instead of coming up with encoding schemes like this, would it be worthwhile to avoid hashing for some really common things by hard-coding them? e.g., I could imagine |
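To make the 6-bit alternative concrete: 5 characters at 6 bits each need 30 bits, which fits in a u32 with 2 bits to spare. The sketch below is purely illustrative. The alphabet (lower case, upper case, digits, underscore), the use of code 0 as "no character", and the names `ALPHABET`, `pack6` and `unpack6` are all assumptions, not anything proposed in this thread in detail.

```rust
/// Hypothetical 63-character alphabet. Code 0 is reserved to mean "no
/// character", so a character's code is its index here plus one.
const ALPHABET: &[u8] = b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";

/// Packs up to 5 characters into the low 30 bits of a u32, 6 bits each.
/// Returns `None` for longer strings or characters outside the alphabet.
fn pack6(s: &str) -> Option<u32> {
    if s.len() > 5 {
        return None;
    }
    let mut n = 0u32;
    for (i, b) in s.bytes().enumerate() {
        let code = ALPHABET.iter().position(|&c| c == b)? as u32 + 1;
        n |= code << (6 * i as u32);
    }
    Some(n)
}

/// Inverse of `pack6`: reads 6-bit codes until the reserved code 0 appears.
fn unpack6(mut n: u32) -> String {
    let mut out = String::new();
    while n & 0x3f != 0 {
        out.push(ALPHABET[(n & 0x3f) as usize - 1] as char);
        n >>= 6;
    }
    out
}
```

Compared with storing raw bytes, this fits one more character but pays for a table lookup per character on both encode and decode, which is the extra complexity mentioned above.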
The most common chars are As for common symbols, we have the static symbol list at the top of Stashing the lock is difficult. It's extremely easy to call |
Yes, I meant that when we run into these symbols in code (i.e., during parsing), we're currently hashing them to figure out the index to put into the Symbol, rather than e.g. directly comparing against a (much smaller) subset. Maybe there's no wins to be had here -- though it would entirely bypass the lock, it'd be a fairly high constant cost.
Yeah, I thought this would be the case. I'm actually fairly surprised that taking the lock is expensive -- I'd expect it to be much cheaper than it seems to be from what you've said (and we've seen when removing it). In today's rustc, it's essentially just incrementing/decrementing a single integer. I'm personally feeling like we could land this but I'm pretty ambivalent. It is a fairly significant increase in complexity, I think, for not too much in the way of gains. |
Accessing the table requires taking a lock (really just a |
One argument in favour of landing is that the complexity is well-contained. Users of |
Thinking some more, it feels like we should land this, but make sure to try and do so with an implementation that is somewhat abstract in the sense that I'd like it to be not too hard to experiment with other inline storage methods (e.g., my ideas). I'm happy to review that work. |
I wish we didn't have to do this, it's just too damn ugly. |
If the majority of overhead comes from TLS lookup, then providing a direct access to the string interner (and other globals) through |
cc @rust-lang/compiler on evaluating the performance vs complexity tradeoff. |
Not sure if this idea is nonsense: We could have a separate (non-tls) table for the pre-interned symbols and look up the string of all the pre-interned symbols in a different table. This would require an additional comparison operation per lookup and only give us a potential speed up for preinterned symbols, but it may give us the same speedups as this PR since most of the short symbols we're looking at are builtin ones I'd think? |
I strongly recommend doing some profiling to confirm this kind of assumption :) |
Zulip discussion from the previous meeting - https://rust-lang.zulipchat.com/#narrow/stream/238009-t-compiler.2Fmeetings/topic/.5Bweekly.20meeting.5D.202020-07-30.20.2354818/near/205485071. |
@nnethercote In the meantime could you move all the |
@petrochenkov: thank you for doing #75309. I had thought that a proc macro could probably improve the keyword categorization, but I didn't have the gumption to do it myself. I am on PTO for the next two weeks so I won't get to this until after that. I'm still ambivalent about this, particularly because of my uncertainty in #74554 (comment). If I had to choose between eliminating |
2adb229 to 6d3c11a
☔ The latest upstream changes (presumably #74862) made this pull request unmergeable. Please resolve the merge conflicts. |
The check in rustdoc using it is artificial and not helpful.
6d3c11a to 23888ae
@petrochenkov: I have incorporated your commits from #75309. @bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 23888ae with merge 946d2e6f55e986fc0427f659bb269031b255369e... |
☀️ Try build successful - checks-actions, checks-azure |
Queued 946d2e6f55e986fc0427f659bb269031b255369e with parent 022e1fe, future comparison URL. |
Finished benchmarking try commit (946d2e6f55e986fc0427f659bb269031b255369e): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
The latest perf results are much worse -- small wins on |
Perhaps due to #75813 "Lazy decoding of DefPathTable from crate metadata (non-incremental case)"? The previous result (#74554 (comment)) showed improvements mostly for small crates, it means that they could be related to decoding symbols from metadata, but #75813 skips that metadata decoding entirely. |
Sounds plausible. I think we can close this. |
Remove `SymbolStr` This was originally proposed in rust-lang#74554 (comment). As well as removing the icky `SymbolStr` type, it allows the removal of a lot of `&` and `*` occurrences. Best reviewed one commit at a time. r? `@oli-obk`
The idea here is to encode symbols that are 4 bytes or shorter directly in the `u32`, and only use the hash table for longer symbols. Avoiding the hash table accesses should speed things up.

r? @ghost
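A toy sketch of the overall scheme, for illustration only. It assumes the top bit of the `u32` distinguishes a table index from inline bytes (consistent with the `s[3] < 0x80` check in the reviewed snippet, but not confirmed by anything quoted here). `Sym`, `Interner`, `TABLE_TAG` and `encode_inline` are hypothetical names, and the real interner in `rustc_span` is organized differently.

```rust
use std::collections::HashMap;

/// Assumption: a set top bit marks a table index rather than inline bytes.
const TABLE_TAG: u32 = 0x8000_0000;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Sym(u32);

#[derive(Default)]
struct Interner {
    names: HashMap<String, u32>,
    strings: Vec<String>,
}

/// Inline encoding: up to 4 little-endian bytes, last byte non-zero, top bit clear.
fn encode_inline(s: &[u8]) -> Option<u32> {
    match s.last() {
        None => Some(0),
        Some(&last) if s.len() <= 4 && last != 0 && (s.len() < 4 || last < 0x80) => {
            let mut bytes = [0u8; 4];
            bytes[..s.len()].copy_from_slice(s);
            Some(u32::from_le_bytes(bytes))
        }
        _ => None,
    }
}

impl Interner {
    fn intern(&mut self, s: &str) -> Sym {
        // Short strings never touch the hash table.
        if let Some(n) = encode_inline(s.as_bytes()) {
            return Sym(n);
        }
        if let Some(&idx) = self.names.get(s) {
            return Sym(idx | TABLE_TAG);
        }
        let idx = self.strings.len() as u32;
        self.strings.push(s.to_owned());
        self.names.insert(s.to_owned(), idx);
        Sym(idx | TABLE_TAG)
    }

    fn resolve(&self, sym: Sym) -> String {
        if sym.0 & TABLE_TAG != 0 {
            return self.strings[(sym.0 & !TABLE_TAG) as usize].clone();
        }
        // Inline case: the length is one past the last non-zero byte.
        let bytes = sym.0.to_le_bytes();
        let len = bytes.iter().rposition(|&b| b != 0).map_or(0, |i| i + 1);
        String::from_utf8_lossy(&bytes[..len]).into_owned()
    }
}
```

In this sketch, interning `"fn"` returns a value computed from the bytes alone, while `"while"` gets a tagged index into `strings`; resolving an inline symbol likewise needs no table access, which is where the hoped-for speedup would come from.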