RFC: Implicit caller location (third try to the unwrap/expect line info problem) #2091
I think the macros should always resolve to the special lang item, and the lang items should also work in non-inlined functions (but you'd get the location of the lang item itself, not the caller). And maybe the lang items could be annotated such that they can be used from within the macros but not written manually. |
Hmm. Turning

    const FILE: &str = file!();
    const LINE: u32 = line!();
    macro_rules! my_file { () => (FILE) }
    macro_rules! my_line { () => (LINE) }
    fn main() {
        println!("{}", concat!(file!(), ":", line!()));
        //^ Fine, prints src/main.rs:6
        println!("{}", concat!(my_file!(), ":", my_line!()));
        //~^ ERROR: expected a literal
    }
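As a side note to the snippet above: `concat!` only accepts literals, so values held in constants have to be joined at run time. A minimal sketch of that fallback, assuming ordinary `const` items named `FILE` and `LINE` as in the comment (the helper `const_location` is a hypothetical name, not part of any proposal here):

```rust
// Ordinary constants initialised from the built-in macros at their definition site.
const FILE: &str = file!();
const LINE: u32 = line!();

// `concat!` rejects non-literal arguments, so the constants are joined with
// `format!` at run time instead of at compile time.
fn const_location() -> String {
    format!("{}:{}", FILE, LINE)
}

fn main() {
    println!("{}", const_location());
}
```

The cost is a run-time allocation, which is one reason expanding to a literal was being considered.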
Uh oh. That might be unintended. At least the docs say it expands to an expression of type |
@jethrogb Actually expanding to a literal would be far easier than a special const item. We just add a new However, doing so would make |
Is that optimization worth it? It only means the value is not duplicated between crates. |
text/0000-inline-semantic.md
Outdated
## Caller location

The core crate provides three magic constants `core::caller::{FILE, LINE, COLUMN}` which resolves to
the caller's location one the function is inlined.
If you have multiple layers of `#[inline(semantic)]`, this might end up being unclear. How about a compile-time backtrace getting built up with all of that info?
@oli-obk The panic handler will need to be updated to accept an inline stack instead of just the location...
fn panic_impl(fmt: fmt::Arguments, inline_stack: &[(&'static str, u32, u32)]) -> !;
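To make the inline-stack idea concrete, here is a rough sketch of how a handler could render such a slice. It assumes the `(file, line, column)` tuple layout from the signature above; `render_inline_stack` and its output format are hypothetical and only illustrate the shape of the data, not how a real panic handler would be written:

```rust
/// One frame of the hypothetical inline stack: (file, line, column).
type Frame = (&'static str, u32, u32);

/// Renders the message followed by one line per frame, innermost frame first.
fn render_inline_stack(message: &str, inline_stack: &[Frame]) -> String {
    let mut out = format!("panicked: {}", message);
    for (depth, (file, line, col)) in inline_stack.iter().enumerate() {
        // The first entry is where the panic was raised; the rest are the
        // call sites it was semantically inlined into.
        let label = if depth == 0 { "at" } else { "inlined into" };
        out.push_str(&format!("\n  {} {}:{}:{}", label, file, line, col));
    }
    out
}

fn main() {
    let stack: [Frame; 2] = [("src/option.rs", 10, 5), ("src/main.rs", 6, 13)];
    println!("{}", render_inline_stack("kaboom", &stack));
}
```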
I just want to say that this is one of the most well-written RFCs I've ever seen. Amazing job @kennytm ! |
text/0000-inline-semantic.md
Outdated
## Why do we use semantic-inlining

If you are learning Rust alongside other languages, you may wonder why Rust obtain the caller
s/obtain/obtains
text/0000-inline-semantic.md
Outdated
Rust allows developers to use the `#[inline]` attribute to *hint* the optimizer that a function
should be inlined. However, if we want the precise caller location, a hint is not enough, it needs
to be a requirement. Therefore, the `#[inline(semantic)]` attribute is introduced.
I think the description here is slightly off - we already have `#[inline(always)]`, and `#[inline(semantic)]` behaves fundamentally differently than traditional inlining. This feature doesn't even necessarily need to be thought about as an inlining at all, really - you could also phrase it in terms of parameterizing the function over its invocation locations, for example.
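One way to picture "parameterizing the function over its invocation locations" is to pass the location explicitly and let a macro fill it in at each call site. This is only an illustrative sketch: `CallSite`, `unwrap_at` and `unwrap_here!` are hypothetical names, and the RFC's attribute would do this implicitly rather than through a macro.

```rust
/// Hypothetical explicit form of the location parameter.
#[derive(Debug, Clone, Copy)]
struct CallSite {
    file: &'static str,
    line: u32,
    column: u32,
}

/// The function body is written once and takes the call site as an ordinary argument.
fn unwrap_at<T>(value: Option<T>, site: CallSite) -> Result<T, String> {
    value.ok_or_else(|| {
        format!("unwrap on None at {}:{}:{}", site.file, site.line, site.column)
    })
}

/// The macro supplies the argument: `file!()`, `line!()` and `column!()`
/// resolve to the place where `unwrap_here!` is invoked.
macro_rules! unwrap_here {
    ($value:expr) => {
        unwrap_at(
            $value,
            CallSite { file: file!(), line: line!(), column: column!() },
        )
    };
}

fn main() {
    let missing: Option<u32> = None;
    match unwrap_here!(missing) {
        Ok(v) => println!("got {}", v),
        Err(msg) => println!("{}", msg),
    }
}
```

Each call site gets its own location value while the callee stays a single, non-inlined function.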
text/0000-inline-semantic.md
Outdated
assert_eq!(get_caller_loc(), (file!(), line!())); | ||
``` | ||
|
||
There is also a `caller_location!()` macro to return all three information as a single tuple. |
This doesn't need to be a macro, right? Could it be a semantically inlined function instead?
@sfackler Yes it can be. `core::caller::location()`, or `core::???::Location::caller()` if `Location` is moved into libcore.
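For readers following this thread later: the function form can be tried with `std::panic::Location::caller()` together with the `#[track_caller]` attribute that is mentioned further down in this conversation. The sketch below uses those names rather than the `core::caller` ones discussed in this comment, and `get_caller_loc` simply mirrors the helper name in the RFC excerpt:

```rust
use std::panic::Location;

/// Returns the file and line of whoever called this function.
#[track_caller]
fn get_caller_loc() -> (&'static str, u32) {
    // With `#[track_caller]`, `Location::caller()` reports the call site
    // of `get_caller_loc`, not this line.
    let loc = Location::caller();
    (loc.file(), loc.line())
}

fn main() {
    let (file, line) = get_caller_loc();
    println!("called from {}:{}", file, line);
}
```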
text/0000-inline-semantic.md
Outdated
`#[inline(with_caller_location)]`?
* Use a different attribute, instead of piggybacking on `#[inline]`?
* `***_at_source_location` is too long?
* Should we move `std::panic::Location` into `core`, and not use a 3-tuple to represent the
👍
In fact this is what RFC #2070 is suggesting as well
text/0000-inline-semantic.md
Outdated
- [Rationale and alternatives](#rationale-and-alternatives)
- [Rationale](#rationale)
- [Alternatives](#alternatives)
- [🚲 Name of everything 🚲](#🚲-name-of-everything-🚲)
The link is broken on Github, the anchor is #-name-of-everything-
@ariasuni Thanks. This is a bug in the Markdown TOC generator (alanwalk/markdown-toc#31).
Thanks @kennytm for writing such a well written RFC! It is definitely important to fix this issue of the unwrap function.

First on a matter of coordination. @japaric has opened RFC #2070 recently which concerns enabling

Semantic inlining vs traditional inlining

I want to emphasize what @sfackler has pointed out above:

Due to the point above, the If the location info lang items

As the lang items get treated specially by the compiler, so they are always more than just what you call them. You still need to find some category in the existing language where a feature fits in really well, which you then use to expose it to the remainder of the language. I think the category chosen right now, constants, is not really suitable. For me, an important property of constants is that they are a never changing value, everywhere the same. There are mutable statics, but this is a different thing obviously. You'll have to add an exception to constant folding to specifically avoid them, and you need to define dummy values at the declaration site that might be mistaken by the reader for actual values. And, you have to introduce a new concept of "magic" constants to explain them. A much greater way of categorizing the lang items would be intrinsic functions marked with

Forwards compatibility

When I added column info to panic printing, I was pleasantly surprised that panic info printing was forwards compatible to a great degree; I didn't have to avoid breaking any stable API. This property should be preserved by the RFC IMO. Possible future changes include e.g. more information from the span, like the ending line and column, or crate names, or similar. Of course you can argue how useful these additions are, but I think regardless of whether one or another is very useful, keeping forwards compat is important!

The panic hook mechanism has the opaque

I haven't found any other places where this RFC is incompatible with future extensions, but if there are any, it would be great to get rid of them.
I certainly agree this is annoying problem, and one worth solving. It seems to me that the big decision to be made is indeed whether to assume that debuginfo is not a viable path -- I believe that other languages have solved this by having the ability to annotate functions as being "implementation details", and hence these functions get omitted or overlooked in a backtrace (i.e., by the backtrace machinery). But I can't find any citations for this second approach. In a way, the two are similar. The idea is to label functions that users probably don't want to see. I think one could argue that implementing this via inlining is an 'implementation detail', right? The important bit is that the various new macros in question will give the position info from the caller, right? (Or are there other things I am overlooking where it is crucial that the function is inlined at the MIR level?)
It is important that the function is inlined in MIR instead of waiting for LLVM because the location info is separate for each time a function gets inlined. We want to expand the location info to a literal at compile time and not read it from debuginfo at runtime so that this works with builds whose debuginfo is stripped. IMO we should still offer an option to strip even this info (I think right now it's possible to do that via custom panic_fmt lang items, not sure though), but there should be a mode where main debuginfo is being stripped, but the panic location is still being printed.
Inlining is so painful to debug, I'd like to avoid it here if at all possible.

I also don't really agree with your rationale that relying on debug info is not an option. In my experience, the location information from a panic message is rarely useful (at all) in trying to figure out what went wrong - you need at least a backtrace for that. I just don't see the value in trying to provide this information even when there's no debug info. It's certainly of no immediate use to end users, but I can imagine it being relevant for bug reports. However as I said, a bug report without a backtrace is rarely useful anyways.

From this perspective, I would suggest a very different approach: Rely on debug info entirely. Then, the solution to this problem can be as simple as just always printing more than a single stack frame (say 3) by default (having to re-run from the beginning with

Release builds are of course an issue but I believe there's a much better solution for that: Build with debug symbols but save them externally, in their own file. This also means that bug reports from end users are now much more useful overall as the backtrace can be decoded using the symbol file (even core dumps should work, right?).

Anyways, that's just what I'd like to see from my personal experience.
Can inlining be replaced with a special ABI |
I like this idea of MIR inlining a lot. As for another use case: It would be nice if the callee could be able to verify, at compile-time, that the caller is eligible, given some policy or otherwise. I'd also like to see an implementation of this panicking business where this source location data could be stripped from the binary. Points of possible panic are rather prevalent. |
@petrochenkov It's not as bad as it is involving in the backend. Unless you mean it's all in the Rust compiler, which could make it a MIR pass, which, uhh, would be vaguely equivalent to inlining. |
Yeah, I meant normal calling conventions (in low level sense), just an extra argument implicitly added during translation. |
Thanks for the feedback 🙂 Long post below.

@est31 on Forwards compatibility

The current PoC did not change how panic works besides relaxing the lifetime requirement of

@sfackler @est31 on Traditional inlining vs Semantic inlining, and Location info lang item

This RFC itself did not propose changing

    #[inline(semantic)]
    fn boom() {
        panic!("kaboom"); // line 3
    }
    fn main() {
        boom(); // line 6
    }

with the current implementation, the panic will still point to line 3. So this is still somewhat like traditional inlining. To entirely blur the line between caller/callee, we need to change

Assuming we don't care about panic's optimization, the only hazard is that,
The static/const should trigger an error something similar to E0401. There should also be

I'm open to keeping

@nikomatsakis @main-- on Debuginfo

Disclaimer: I've never used a debugger on rust generated program for serious debugging because (1) printf-debugging is much more accurate and easier (2) lldb doesn't seem to work properly[a] (3) the only time I've used lldb is to obtain a more correct backtrace, since RUST_BACKTRACE=1 does not work at all on macOS (rust-lang/rust#24346). So I am certainly biased here 😆.

The biggest reason debuginfo is not reliable is that it is often absent. And it is absent because it is not generated by default, is huge (>100% size of the executable) and leaks sensitive source code info for proprietary programs. Look, even

    $ find ~/.rustup -name '*.dSYM'
    $

There is also the issue that RUST_BACKTRACE=1 is not enabled by default, partly because it is slow (rust-lang/rust#42295), and also because sometimes your system just doesn't have

And I just feel that a systems language with thin runtime shouldn't require a runtime solution to get the caller info.

@nikomatsakis on Omitting backtrace (a.k.a. #1744)

Yes implementing this via inlining is an implementation detail. And yes the important bit is the various lang items will work. But I disagree that the idea is to label functions that users don't want to see. In fact, if possible I want

This could be done by cloning the function's MIR + rewrite + redirect the caller, instead of inlining them. That is more aligned to @sfackler's comment that

Omitting part of backtrace inside an IDE is not a new idea, Xcode supported this very long time ago, even with dynamic granularity setting. But I'm not aware of any languages besides Swift that allow programs to annotate functions as unimportant themselves, at least not in Python or Java. The problem isn't that the backtrace is too verbose, but it is not precise enough at the right place due to (traditional) inlining, or that the backtrace does not include the line number and you used more than one

@petrochenkov on Special ABI

It won't work because
I have used gdb on Rust a few times and I agree that it could use some improvement. The seemingly random jumps you're seeing are probably due to the fact that the debug symbols emitted by rustc tend to associate almost every instruction with the original statement that caused it to be generated. This is rather problematic as optimizations shuffle things around quite a bit. But I consider this a bug that should be fixed one day (C/C++ compilers have a decent workaround). I'm well aware that debug info is rarely shipped in releases. I even believe storing debug info externally (like Microsoft have been doing for ages with their pdb files) is very unconventional in the Unix world today (FWIW Ubuntu do provide symbol files for some of their packages). I wonder why? Is there any big drawback I'm not aware of?
Yes, this is certainly an issue today. My suggestion was not "just use debuginfo", it was "make sure debug symbols exist (throughout the ecosystem), then just use that". Most notably:
👍 It should! And again, my core argument here is basically that in a world where rustc generates symbol files for every release build by default, not having these symbols always means that showing line info is undesirable. |
First of all, thank you for raising this issue and providing an avenue of discussion; it seems that whenever I write tests I have to have at least this one instance of I think that to solve the optimization issue it could be useful to have a special flag/setting which strips column information from debug information and source locations (making them 0, or even removing them altogether). This is something that gcc and clang have, with With regard to your proposal, I am not a fan of inlining. It seems to me inlining is limited and may cause issues:
The latter can be worked around by having a lint advising users to create a private function with the bulk of the implementation and then using the current public function as a forwarding shell (which can then be inlined without bloat). However it does mean shifting the burden onto the users. As a result, much as did @petrochenkov, I too wonder whether a different ABI/magic parameter would not be a better solution. As a strawman proposal, I'll go with the "magic parameter" approach:
The essence here is that if we are to be magical, I don't see why we cannot have simili default parameters. Calling a
The latter is fully decidable at compile-time. There is one question: how should specialization/implementation of a trait method work. There are two alternatives as far as I can see:
I am not sure which is best; I would recommend starting with the first one to accumulate experience and check whether this is really an issue to start with. |
@matthieu-m I have thought of magic parameter and tried to implement that 3 times but it involves too many parts of the compiler to make it work 😄 (AST/HIR to split a function into two, and MIR to rewrite the call, probably something more which I've forgotten). Essentially

    #[ghost]
    fn x(args: T) -> R { ... }
    // =>
    #[ghost(redirect="def_id of x$real")]
    fn x(args: T) -> R {
        x$real(args, &Location::current())
    }
    fn x$real(args: T, location: &Location) -> R {
        ...
    }

then, after type-checking, any direct call to
Some questions for this:
After thinking a bit of these, I find that the effect is very similar to inlining in terms of what I want to solve ( |
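The `#[ghost]` desugaring sketched in the comment above can be approximated by hand in today's Rust. This is only a rough illustration: `x_real` stands in for the generated `x$real`, and `#[track_caller]` with `std::panic::Location::caller()` stands in for the hypothetical `Location::current()`; no compiler-side redirect of call sites is modelled.

```rust
use std::panic::Location;

// Stand-in for the generated `x$real`: the original body, with the
// location passed as an ordinary extra argument.
fn x_real(arg: u32, location: &Location<'static>) -> String {
    format!("x({}) called from line {}", arg, location.line())
}

// Stand-in for the generated shim: it captures the caller's location
// and forwards it to the real function.
#[track_caller]
fn x(arg: u32) -> String {
    x_real(arg, Location::caller())
}

fn main() {
    println!("{}", x(1));
}
```

The body lives in one ordinary function, so nothing has to be inlined; only the thin shim depends on the call site.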
ELF-based systems like Linux and BSD do embed DWARF section in the binary. But macOS store them externally in a |
They traditionally do so in debug builds. And it makes sense there. But moving them to a separate file is perfectly possible:
This is what I would suggest for release builds on ELF systems. |
@kennytm thanks for your reply. Could you also reply to these suggestions of mine, you seem to have missed them:
|
@pornel |
I think that the language feature requested in this RFC is both too invasive and too specific. I feel like it's a kludge that, if accepted, will haunt the Rust spec for eternity. I'd rather see a general solution that allows you to include code in the caller's context that surround the called function (AOP-style but inline), or more work on guaranteeing not just the type of an argument but also its value. Being able to guarantee values statically (or semi-statically) at the type level is a lot more general and powerful, and something that we would eventually like to tackle. Think of Ada-like value ranges or an effect system encoded in the types of objects and functions. I'd like to see this proposal postponed until evaluated against long-term goals of the Rust language -- i.e. evolved static guarantees and so on. |
So, we decided to accept this RFC, but in the meantime there was some more feedback. Let me try to summarize the concerns: I think these concerns can be summarized as "This RFC doesn't do enough to justify its complexity/cost":
I'm nominating for @rust-lang/lang team discussion to decide if we feel like we ought to back up or can proceed towards implementation. |
I think we should continue. We don't always want full backtraces, and just knowing the caller location is useful. |
I agree that just knowing the caller is extremely useful. On an embedded device I currently work on, there is no working unwinder right now, and an Option::unwrap() panic might mean... anything. And it is actually not trivial to get a working unwinder, e.g. you need a linker script that properly preserves DWARF sections, and even less trivial to debug an unwinder when for some reason it misbehaves. Not to mention that the unwinder, and especially the DWARF tables, are heavy space-wise. I don't think that everyone who works with embedded Rust must pay the memory cost of having an unwinder, not to mention the mental cost of getting one to work. Also, there's no pure-Rust unwinders right now, so if one wants to do without the C compiler for the target device, backtraces are just impossible. |
In the lang team meeting, the overall feeling was that we should stay the course: an improvement here is badly needed, and this RFC continues to be a highly plausible approach. The concerns that have been raised since FCP closed, while legitimate, are best assessed when we have a working implementation to learn from and iterate on. Thanks again @kennytm for the RFC! |
This corrects RFC rust-lang#2091’s file name.
The link to the rendered RFC in OP is broken by the way. |
@KasMA1990 Thanks, I've fixed it now. |
…l, r=davidtwco Implement -Z location-detail flag This PR implements the `-Z location-detail` flag as described in rust-lang/rfcs#2091 . `-Z location-detail=val` controls what location details are tracked when using `caller_location`. This allows users to control what location details are printed as part of panic messages, by allowing them to exclude any combination of filenames, line numbers, and column numbers. This option is intended to provide users with a way to mitigate the size impact of `#[track_caller]`. Some measurements of the savings of this approach on an embedded binary can be found here: rust-lang#70579 (comment) . Closes rust-lang#70580 (unless people want to leave that open as a place for discussion of further improvements). This is my first real PR to rust, so any help correcting mistakes / understanding side effects / improving my tests is appreciated :) I have one question: RFC 2091 specified this as a debugging option (I think that is what -Z implies?). Does that mean this can never be stabilized without a separate MCP? If so, do I need to submit an MCP now, or is the initial RFC specifying this option sufficient for this to be merged as is, and then an MCP would be needed for eventual stabilization?
…=davidtwco `-Z location-detail`: provide option to disable all location details As reported [here](rust-lang#89920 (comment)), when I first implemented the `-Z location-detail` flag there was a bug, where passing an empty list was not correctly supported, and instead rejected by the compiler. This PR fixes that such that passing an empty list results in no location details being tracked, as originally specified in rust-lang/rfcs#2091 . This PR also adds a test case to verify that this option continues to work as intended.
Current (Implicit caller location)
Summary
Enable accurate caller location reporting during panic in `{Option, Result}::{unwrap, expect}` with the following changes:

* `#[blame_caller]` function attribute, which guarantees a function has access to the caller information.
* `caller_location()` (safe wrapper: `Location::caller()`) to retrieve the caller's source location.
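As a rough illustration of the summary, the sketch below shows an `expect`-like helper whose failure message points at the user's call rather than at the helper. It uses the attribute name `#[track_caller]` that appears later in this thread instead of `#[blame_caller]`; `my_expect` and `expect_failed` are hypothetical names, and the sketch returns a `Result` instead of panicking so the location is easy to inspect.

```rust
use std::panic::Location;

// Inner helper: because its direct caller is also tracked, the location
// it observes is the call site of the outermost tracked function.
#[track_caller]
fn expect_failed(msg: &str) -> String {
    let loc = Location::caller();
    format!("{} at line {}", msg, loc.line())
}

// Outer, user-facing function.
#[track_caller]
fn my_expect<T>(value: Option<T>, msg: &str) -> Result<T, String> {
    match value {
        Some(v) => Ok(v),
        None => Err(expect_failed(msg)),
    }
}

fn main() {
    let r: Result<u32, String> = my_expect(None, "value missing");
    println!("{:?}", r);
}
```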
Legacy (Semantic inlining):