-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std: Switch from libbacktrace to gimli #73441
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
|
I think the process to follow here is to file a major change proposal (https://github.com/rust-lang/compiler-team/blob/master/.github/ISSUE_TEMPLATE/major_change.md) on the compiler-team repository, and then we can go from there. I suspect that the body of that proposal can basically just link to this PR's description. I would personally be on board with this change, I think I agree with the proposal for how to integrate into std via the submodule -- I don't like it, but I don't think we have a better approach today -- and, more broadly, that we should be looking to migrate std away from libbacktrace. One thing that I don't see addressed in the PR description is the licensing of these libraries we're adding -- are they all MIT or Apache 2.0, same as the Rust repository, or do we need to do some legwork in that respect? Also, it would be good to get some sense of what kind of review you think would be appropriate here -- e.g., should we be trying to find some person or persons to try and review at least the unsafe code we're bringing in (if any) or perhaps more broadly the various libraries we're adding in? I presume that backtrace isn't going to be relying on any unstable code, right, just using stabilized APIs in std? Otherwise a submodule may be a bit painful with bootstrap bumping and such (though nothing we can't deal with if necessary). |
Excellent point on licenses! The license are:
The For review, that's also a very good question. Here's the various big "chunks"
Also, to make sure that everything's within context, we added libbacktrace to the standard library with practically zero review. The |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
I suppose that's gimli's |
Switch over to the new Gimli-based backtrace symbolicator. Gimli is a pure-Rust reimplementation of libbacktrace that will soon be the default backtrace implementation in the Rust standard library [0]. Switching to it now means we can help shake out problems, like [1], before it ships in the standard library, where fixing problems will require a full six week release cycle. This switch has the side effect of fixing backtraces on macOS, which seem to be broken at the moment because the backtrace crate is picking libbacktrace over coresymbolication. There isn't a good way of fixing that because Cargo doesn't support per-platform features [2]; using neither libbacktrace or coresymbolicator is a kind of stupid but effective way of sidestepping the problem. [0]: rust-lang/rust#73441 [1]: rust-lang/backtrace-rs#342 [2]: rust-lang/cargo#1197
@cuviper hm yes indeed, you wouldn't happen to know off-hand how to fix that would you? |
addr2line = { version = "0.12.0", optional = true, default-features = false } | ||
rustc-demangle = { version = "0.1.4", optional = true } | ||
miniz_oxide = { version = "0.3.7", optional = true } | ||
[dependencies.object] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[dependencies.object] | |
[dependencies.object] |
Sorry, no -- I guess you should consult some rustdoc folks? |
Ok I made some rustc/rustdoc changes to fix the issue about suggestions and broken links pointing to internal crates libstd depends on. |
// Don't issue suggestions for unstable traits since they're | ||
// unlikely to be implementable anyway | ||
.filter(|info| match self.tcx.lookup_stability(info.def_id) { | ||
Some(attr) => attr.level.is_stable(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this keep suggesting unstable traits on nightly?
I guess we also want some legal attention on miniz_oxide as well? We'll want both MIT and Apache 2.0 I think for standard library deps -- I don't think it matters as much for the compiler? I've posted a topic on the licensing stream on Zulip as well about this: https://rust-lang.zulipchat.com/#narrow/stream/231349-t-core.2Flicensing/topic/backtrace.20crate.20inclusion.20in.20std Also, to be up front with some of the expectations on my end as to when I can get around to reviewing, I can't be sure, but I'll try to allocate some time to do so for the things you noted as needing review. I'll make a note on the MCP as well.
|
I'm thinking that it would be good to perhaps land parts of this separately as well -- say, the unstable trait changes etc I don't know a lot about so probably not the best reviewer -- and then we can follow up with a PR that mostly just adds in the submodule and doesn't have ~any compiler changes. |
Thanks for following up on the license bits! I'm not really sure how to go about that and am happy to follow whatever recommendation is necessary. For splitting off changes, I don't mind doing so. I'm not sure if it's too beneficial though. This is a large-ish PR since it changes how libstd uses the |
To be clear this is because I feel pretty comfortable reviewing the std/backtrace bits but pretty much not at all so for the typeck and rustdoc, and splitting them out makes that review easier to make happen naturally I think (rather than trying to cc folks into this PR and identify what they need to review). |
Ok no worries! I'll look to splitting that out tomorrow. |
☔ The latest upstream changes (presumably #73746) made this pull request unmergeable. Please resolve the merge conflicts. |
Currently dependency crates of the standard library can sometimes leak into error messages such as when traits to import are suggested. Additionally they can leak into documentation such as in the list of "all traits implemented by `u32`". The dependencies of the standard library, however, are intended to be private. The dependencies of the standard library can't actually be stabl-y imported nor is the documentation that relevant since you can't import them on stable either. This commit updates both the compiler and rustdoc to ignore unstable traits in these two scenarios. Specifically the suggestion for traits to import ignore unstable traits, and similarly the list of traits implemented by a type excludes unstable traits. This commit is extracted from rust-lang#73441 where the addition of some new dependencies to the standard library was showed to leak into various error messages and documentation. The intention here is to go ahead and land these changes ahead of that since it will likely take some time to land.
I'll leave the changes here for now to ensure CI stays green, but I've also extracted them out for independent review to #73771 |
44529ec
to
145abef
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
☔ The latest upstream changes (presumably #72978) made this pull request unmergeable. Please resolve the merge conflicts. |
Thanks for the heads up, it's definitely not intended that this regresses anything. As mentioned a bit on the Zulip thread I think perf data has filled in and the precise regression range for just this PR is at https://perf.rust-lang.org/compare.html?start=7d31ffc1ac9e9ea356e896e63307168a64501b9d&end=1fa54ad9680cc82e7301f8ed4e9b7402dfd6ce0e&stat=instructions%3Au. Still bad, but wanted to make sure it didn't accidentally include other PRs. |
@alexcrichton: Unfortunately a regression this large is unacceptable. Can you file a PR to revert the change? Thank you. |
This was also a max-rss regression, with lots of benchmarks in the 5-10% range, including plenty of real-world crates. |
Ok I'm looking at the perf data this morning trying to figure out what's best to do next. I want to be clear that I'm very wary of simply reverting this change based solely on the numbers. I don't think that makes sense to do unless there's actually a feasible route to re-landing this, but that does imply further understanding of the numbers as well. From what I can tell, the regressions fall into these categories:
Looking into the details of regressions the lion's share (e.g. almost all of it) of time regressions (especially those >5ms) are entirely within the My understanding of this is that the linker is slower because it has more work to do. #74560 shows that removing debuginfo from these crates can reduce the amount of work the linker has to do, but it's still got to resolve all the symbols and probably gc a good deal of object code out of the final object. Additionally the compiler simply fundamentally has more work to do because there's more crates in play that it needs to load and inspect. There are presumably a number of loops in the compiler that query all crates loaded, and those loops are all a little bit longer now. And finally memory regressions are basically guaranteed because we're mmaping rlibs and metadata into the process, so since we're mmap'ing more things then memory usage is basically guaranteed to go up. With my understanding of all this I don't really see a bug here. These regressions are all defined by how rustc works with loading external crates and what it means to add more dependencies to a crate. I don't think there's any way to remove these regressions unless we simply never land this feature. Put that way, I would personally prefer to not revert this. There's definitely a lot of noise in the measurements, and if we'd like to get more precise measurements I think it makes sense to post a PR that's a revert and run perf on that (but not for the intention of landing). Perf gains would translate to perf losses if we were to revert all this and re-land it. In general though if the attitude is "one of those numbers is too negative, back out full-stop" then that, I believe, provides very little means to land this. I do not have the time to investigate how we can rearchitect rustc to lessen the impact of adding dependencies to a crate, and I'm not sure whether that's even possible. It's extremely hard to justify spending time on that when the practical impact of this PR is that crate compilations linking to libstd are about 5ms slower and if you run the linker it's about 30-40ms slower. This PR is also not simply a random PR out of the weeds, but rather one that's been intended since basically 1.0 to fix a number of longstanding bugs:
I understand how the regression numbers in this PR seem alarming. They're very surprising to me because I wasn't expecting them as well. I would like to, however, push back against automatic reactions that we should back this out because of these numbers. I think it's very worthwhile to analyze and understand what happened, but I think it's harmful to require that rustc's perf monotonically improves for the rest of time at the cost of feature development. |
I think It's fine to regress this as long as we track progress on its root cause instead of reverting a symptom like this PR. It seems like a problem to me if merging crates into one crate manually will lead to speed ups and we should look into addressing this where possible. This kind of work will benefit all crates, and with growing numbers of dependencies it seems like the effects could really be felt at some point. |
The comment above lays most of the blame on linking.
Here's my view.
The comments above suggest that a backout is a draconian response. I disagree. Perhaps this is because I come from Firefox where the landing process is different and backouts are frequent and no big deal. To me, it just says "hey, this didn't work out like we expected, and we should undo things, take a minute, and try again". Finally, I'm not saying something extreme like "no perf regressions are ever allowed", and I don't think anyone else is saying that. But the above circumstances indicate there is a great deal of surprise and uncertainty surrounding the effects of this PR. I recommend that the PR be backed out, and that the follow-ups be added to this PR so clear perf measurements can be made. If there are still lots of double-digit regressions, then perhaps it's worth a discussion at a compiler meeting. |
This was not the case. |
You are right. Apologies, there were numerous regressions this week and I mixed this one up with some of the others. Nonetheless, my basic argument still holds, which is that a backout is a reasonable thing to do here. |
#74613 is the reversion. |
Could we benchmark gimili with LLD to see if linker regression could be related to some inefficiency in BFD? |
My point I was trying to explain is that if this is backed out because of performance it has no recourse for landing again without seemingly major architectural changes in rustc. The performance will not change the next time around it's landed. If the purpose is to get reliable performance numbers backing out is not required, as I explained we can benchmark the reversion. |
I think to some extent we'll probably want to just land it regardless of the performance, but it's easier to have that conversation (amongst compiler team or so) when we're talking about the regression without needing to invert numbers and such. |
Intentionality is key. To me, there is a large difference between "whoops this regressed, let's try to improve it afterwards if we can" and "we know in advance that this will regress by x%, we have taken measures to minimize this, and we think it's worth it because of $Y, do we have agreement?". The latter gets agreement up front on a known performance regression amount. |
… r=Mark-Simulacrum std: Switch from libbacktrace to gimli (take 2) This is the second attempt to land rust-lang#73441 after being reverted in rust-lang#74613. Will be gathering precise perf numbers here in this take. Closes rust-lang#71060
This commit is a proof-of-concept for switching the standard library's
backtrace symbolication mechanism on most platforms from libbacktrace to
gimli. The standard library's support for
RUST_BACKTRACE=1
requiresin-process parsing of object files and DWARF debug information to
interpret it and print the filename/line number of stack frames as part
of a backtrace.
Historically this support in the standard library has come from a
library called "libbacktrace". The libbacktrace library seems to have
been extracted from gcc at some point and is written in C. We've had a
lot of issues with libbacktrace over time, unfortunately, though. The
library does not appear to be actively maintained since we've had
patches sit for months-to-years without comments. We have discovered a
good number of soundness issues with the library itself, both when
parsing valid DWARF as well as invalid DWARF. This is enough of an issue
that the libs team has previously decided that we cannot feed untrusted
inputs to libbacktrace. This also doesn't take into account the
portability of libbacktrace which has been difficult to manage and
maintain over time. While possible there are lots of exceptions and it's
the main C dependency of the standard library right now.
For years it's been the desire to switch over to a Rust-based solution
for symbolicating backtraces. It's been assumed that we'll be using the
Gimli family of crates for this purpose, which are targeted at safely
and efficiently parsing DWARF debug information. I've been working
recently to shore up the Gimli support in the
backtrace
crate. As of afew weeks ago the
backtrace
crate, by default, uses Gimli when loadedfrom crates.io. This transition has gone well enough that I figured it
was time to start talking seriously about this change to the standard
library.
This commit is a preview of what's probably the best way to integrate
the
backtrace
crate into the standard library with the Gimli featureturned on. While today it's used as a crates.io dependency, this commit
switches the
backtrace
crate to a submodule of this repository whichwill need to be updated manually. This is not done lightly, but is
thought to be the best solution. The primary reason for this is that the
backtrace
crate needs to do some pretty nontrivial filesysteminteractions to locate debug information. Working without
std::fs
isnot an option, and while it might be possible to do some sort of
trait-based solution when prototyped it was found to be too unergonomic.
Using a submodule allows the
backtrace
crate to build as a submoduleof the
std
crate itself, enabling it to usestd::fs
and such.Otherwise this adds new dependencies to the standard library. This step
requires extra attention because this means that these crates are now
going to be included with all Rust programs by default. It's important
to note, however, that we're already shipping libbacktrace with all Rust
programs by default and it has a bunch of C code implementing all of
this internally anyway, so we're basically already switching
already-shipping functionality to Rust from C.
object
- this crate is used to parse object file headers andcontents. Very low-level support is used from this crate and almost
all of it is disabled. Largely we're just using struct definitions as
well as convenience methods internally to read bytes and such.
addr2line
- this is the main meat of the implementation forsymbolication. This crate depends on
gimli
for DWARF parsing andthen provides interfaces needed by the
backtrace
crate to turn anaddress into a filename / line number. This crate is actually pretty
small (fits in a single file almost!) and mirrors most of what
dwarf.c
does for libbacktrace.miniz_oxide
- the libbacktrace crate transparently handlescompressed debug information which is compressed with zlib. This crate
is used to decompress compressed debug sections.
gimli
- not actually used directly, but a dependency ofaddr2line
.adler32
- not used directly either, but a dependency ofminiz_oxide
.The goal of this change is to improve the safety of backtrace
symbolication in the standard library, especially in the face of
possibly malformed DWARF debug information. Even to this day we're still
seeing segfaults in libbacktrace which could possibly become security
vulnerabilities. This change should almost entirely eliminate this
possibility whilc also paving the way forward to adding more features
like split debug information.
Some references for those interested are:
need to carry - Update libbacktrace #50955
Switching to Rust will not make us immune to all of these issues. The
crashes are expected to go away, but correctness and performance may
still have bugs arise. The gimli and
backtrace
crates, however, areactively maintained unlike libbacktrace, so this should enable us to at
least efficiently apply fixes as situations come up.
I want to note that my purpose for creating a PR here is to start a conversation about this. I think that all the various pieces are in place that this is compelling enough that I think this transition should be talked about seriously. There are a number of items which still need to be addressed before actually merging this PR, however:
gimli
needs to be published to crates.ioaddr2line
needs a publishminiz_oxide
needs a publishgimli
crate's traits for implementingbacktrace
crate's branch changes need to be merged to the master branch (Prepare for this crate to go into libstd backtrace-rs#349)libbacktrace
on some platforms needs to be audited to see if we should support more strategies in the gimli implementation - Gimli doesn't support FreeBSD backtrace-rs#325, Gimli doesn't support OpenBSD backtrace-rs#326, Gimli doesn't support iOS backtrace-rs#350, Gimli doesn't support Android backtrace-rs#351Most of the merging/publishing I'm not actively pushing on right now. It's a bit wonky for crates to support libstd so I'm holding off on pulling the trigger everywhere until there's a bit more discussion about how to go through with this. Namely rust-lang/backtrace-rs#349 I'm going to hold off merging until we decide to go through with the submodule strategy.
In any case this is a pretty major change, so I suspect that the compiler team is likely going to be interested in this. I don't mean to force changes by dumping a bunch of code by any means. Integration of external crates into the standard library is so difficult I wanted to have a proof-of-concept to review while talking about whether to do this at all (hence the PR), but I'm more than happy to follow any processes needed to merge this. I must admit though that I'm not entirely sure myself at this time what the process would be to decide to merge this, so I'm hoping others can help me figure that out!