1100% increase (!) in x86_64-unknown-linux release compile times #104842

Closed
maxburke opened this issue Nov 24, 2022 · 10 comments
Labels
C-bug (Category: This is a bug.) · T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)

Comments

maxburke commented Nov 24, 2022

Hi,

We've noticed a significant performance regression when compiling our top-level crate in release mode. It seems to have started with nightly-2022-09-23:

$ touch src/main.rs && cargo build --release
   Compiling ulv2 v0.1.0 (/home/max/src/ul/services/ulv2)
    Finished release [optimized] target(s) in 17m 10s

With nightly-2022-09-22 the builds are significantly faster:

$ touch src/main.rs && cargo build --release
   Compiling ulv2 v0.1.0 (/builds/UrbanLogiq/ul/services/ulv2)
    Finished release [optimized] target(s) in 1m 22s

While trying to identify where this issue began, we found that it still occurs as far forward as nightly-2022-11-14. Unfortunately other issues with toolchains have hindered us testing any further forward.

Compilation tests were done on an AMD 5950x (Ubuntu 22.04) with the following config.toml options:

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "target-feature=+avx2,+fma", "-C", "link-arg=-fuse-ld=lld"]

We've seen a similar slowdown ratio in our CI builds (Azure F8s instances building on Ubuntu 20.04), but Apple M1 (aarch64 darwin) builds are unaffected.

maxburke added the C-bug label Nov 24, 2022
Mark-Simulacrum (Member)

Bisecting to a particular commit with https://github.com/rust-lang/cargo-bisect-rustc would help, and providing a GitHub repository (or something similar) that reproduces the issue would also help.
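
For a slowdown rather than a hard build failure, a bisection over this range might look roughly like the sketch below; the 300-second timeout threshold (chosen here to separate the ~1.5-minute builds from the ~17-minute ones) is an assumption, not something from this thread:

$ cargo install cargo-bisect-rustc
$ # treat any build that takes longer than 300 seconds as regressed
$ cargo bisect-rustc --start=2022-09-22 --end=2022-09-23 --timeout 300 -- build --release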

Mark-Simulacrum added the T-compiler and regression-untriaged labels Nov 24, 2022
rustbot added the I-prioritize label Nov 24, 2022
Noratrieb (Member)

You could also profile the build with the -Ztime-passes or -Zself-profile rustc flags to find the cause of the slowness (which I assume will be in some specific place).
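
For reference, these are unstable flags, so they need a nightly toolchain; one way to pass them through cargo is sketched below (the output directory name is a placeholder, not from this thread):

$ RUSTFLAGS="-Ztime-passes" cargo +nightly build --release
$ RUSTFLAGS="-Zself-profile=./self-profile" cargo +nightly build --release

-Ztime-passes prints per-pass timings to stderr, while -Zself-profile writes .mm_profdata files into the given directory.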

Noratrieb (Member)

Also, simple profiling with something like perf could help (and if you can build rustc from source with debug symbols, that would be even more useful).
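
A rough perf invocation for this kind of comparison (a sketch; the output file name is a placeholder):

$ perf record --call-graph=dwarf -o slow-build.perf.data -- cargo +nightly-2022-09-23 build --release
$ perf report -i slow-build.perf.data

Without debug symbols in rustc the call stacks will be much less informative, which is why building rustc from source with debug info helps.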

maxburke (Author)

@Nilstrieb I've created a repository here with the profiling data obtained with -Ztime-passes and -Zself-profile: https://github.com/maxburke/rustc-104842

There are two directories: one for 2022-09-22, which doesn't exhibit the performance issue, and one for 2022-09-23, which does.

@Mark-Simulacrum Unfortunately I'm not able to share the repository, but I will try bisecting to the offending commit.
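
For anyone digging into those files: the .mm_profdata output of -Zself-profile can be summarized with the summarize tool from the measureme repository, roughly like this (the file name is a placeholder):

$ cargo install --git https://github.com/rust-lang/measureme summarize
$ summarize summarize ulv2-<pid>.mm_profdata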

maxburke (Author)

@Mark-Simulacrum I bisected the performance regression to commit 8ab71ab59fd17a1c51d23b68eced935b92431b70.

Mark-Simulacrum (Member)

Cc #100980 @compiler-errors

Note that this means this regression will hit stable with 1.66, so rather soon, unless we end up with a backportable fix.

@Mark-Simulacrum Mark-Simulacrum added regression-from-stable-to-beta Performance or correctness regression from stable to beta. and removed regression-untriaged Untriaged performance or correctness regression. labels Nov 25, 2022
compiler-errors (Member) commented Nov 25, 2022

This was reverted and the revert was backported to beta via #103509 -- if the current beta contains that PR and it's still slow, then I can investigate, but this seems to be dealt with.

Mark-Simulacrum (Member)

Oh, hm, that PR should be reverted on both beta and nightly by now. It would be great to verify that the regression doesn't reproduce on beta. (Testing a more recent nightly would also be great.)

Mark-Simulacrum removed the regression-from-stable-to-beta label Nov 25, 2022
compiler-errors (Member) commented Nov 25, 2022

@maxburke do you mind checking on the latest nightly?

edit: oh "Unfortunately other issues with toolchains have hindered us testing any further forward."

Can you check that your beta toolchain is up to date and re-test?
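
A minimal way to update and re-test, reusing the reproduction command from above (a sketch):

$ rustup update beta nightly
$ touch src/main.rs && cargo +beta build --release
$ touch src/main.rs && cargo +nightly build --release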

maxburke (Author)

@compiler-errors I resolved the issues that were blocking us from moving to newer toolchains, and after trying the latest nightly (2022-11-24), performance seems to be back to normal.

apiraino removed the I-prioritize label Nov 29, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 29, 2023
…th-late-bound-vars-again, r=jackh726

Normalize opaques with late-bound vars again

We have a hack in the compiler where if an opaque has escaping late-bound vars, we skip revealing it even though we *could* reveal it from a technical perspective. First of all, this is weird, since we really should be revealing all opaques in `Reveal::All` mode. Second of all, it causes subtle bugs (linked below).

I attempted to fix this in rust-lang#100980, which was unfortunately reverted due to perf regressions on codebases that used really deeply nested futures in some interesting ways. The worst of these was rust-lang#103423, which caused the project to hang on build. Another one was rust-lang#104842, which was just a slow-down, not a hang. I took some time afterwards to investigate how to rework `normalize_erasing_regions` to take advantage of better caching, but that effort kinda fizzled out (rust-lang#104133).

However, recently, I was made aware of more bugs whose root cause is not revealing opaques during codegen. That made me want to fix this again -- in the process, interestingly, I took the minimized example from rust-lang#103423 (comment), and it doesn't seem to hang any more...

Thinking about this harder, there have been some changes to the way we lower and typecheck async futures that may have reduced the pathologically large number of outlives obligations (see description of rust-lang#103423) that we were encountering when normalizing opaques with bound vars the last time around:
* rust-lang#104321 (lower `async { .. }` directly as a generator that implements `Future`, removing the `from_generator` shim)
* rust-lang#104833 (removing an `identity_future` fn that was wrapping desugared future generators)

... so given that I can see:
* No significant regression on rust perf bot (rust-lang#107620 (comment))
* No timeouts in crater run I did (rust-lang#107620 (comment), rechecked failing crates in rust-lang#107620 (comment))

... and given that this PR:
* Fixes rust-lang#104601
* Fixes rust-lang#107557
* Fixes rust-lang#109464
* Allows us to remove a `DefiningAnchor::Bubble` from codegen (75a8f68)

I'm inclined to give this another shot at landing. Best case, it just works -- worst case, we get more examples to study how we need to improve the compiler to make this work.

r? types