three-point error messages in borrowck #45988
Comments
cc @spastorino @Nashenas88 -- one of you might be the right ones to take this on. |
I'd love to work on this. Whenever I get back to the visualizer work, this would be extremely useful to be familiar with. |
One question about the UI here: If the borrow is moved between many variables and data structures, would it be useful to explain that movement? One of my biggest frustrations with some existing errors is when two seemingly unrelated types cause lifetime issues with each other. In those cases the relationship isn't immediately clear, and the error is very vague. We are much more verbose when it comes to explaining unsatisfied trait bounds. Here's an example taken from Faraday, reproduced below:
If we use a modified version of one of the existing tests as an example:
#[derive(Debug)]
struct Wrap<'p> { p: &'p mut i32 }
impl<'p> Drop for Wrap<'p> {
    fn drop(&mut self) {
        *self.p += 1;
    }
}
#[derive(Debug)]
struct Foo<'p> { a: String, b: Wrap<'p> }
fn main() {
    let mut x = 0;
    let wrap = Wrap { p: &mut x };
    let s = String::from("str");
    let foo = Foo { a: s, b: wrap };
    x = 1; //~ ERROR
    println!("{:?}", foo);
}
I can imagine a more helpful error message saying something along the lines of:
I think that if we're going to write another inference pass to handle this, it might not be so challenging to extend to this given that we already have all the data necessary. Though, I can also see this rabbit holing into tons of special cases 😅 . Would this be beyond the scope of the RFC? |
@Nashenas88 I've wondered the same thing. I did not propose tracing the full path, because I didn't want to overwhelm the user, but I imagine it might be useful, yes. Maybe a good idea would be to design the system so that we have the information, then we can separately tweak when we think it makes sense to display it. |
Some notes from a conversation with @spastorino: https://gist.github.com/nikomatsakis/ed89304174dd607d887bc10a35e30ab7 |
OK, now that the basic causal work has landed, let me try to explain the next steps. Let's start with this example:
#![allow(warnings)]
fn read(_: &i32) { }
fn main() {
    let mut x = 22;
    let p = &x;
    x += 1; // point A
    read(p); // point B
}
Currently this will give an error like:
as you can see, the error already identifies the point A, which is the point where the assignment occurs, and it identifies the borrow of
There is a function in the source that has this purpose, called
Now, the region checker has been augmented to track causal information, and in fact
We're now almost ready to print out the extra information we wanted -- but not quite! We know that
The dfs.rs module in region inference may be of interest, since it performs a simple depth-first search -- I'm not saying you want to re-use the code per se, more that it could be used as a model. Though it occurs to me now that there also exists various iteration code in
The other thing you will need is some code to decide when a particular statement uses a local variable. The liveness analysis has a visitor that it uses for that. We may want to extract that code so it can be readily re-used; and in fact we probably want to respect the distinction between "regular" use and drops. If the cause is
There are some other root causes too, of course, but for now we can just ignore those and not try to add extra explanatory information. For example, if the cause is |
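To make the search step concrete, here is a minimal sketch of a depth-first walk that starts after the point of access, stays inside the borrow region, and stops at the first statement using a given local. Everything in it is an assumption for illustration: `Stmt`, `find_later_use`, and the integer encoding of points and locals are hypothetical toy stand-ins, not the rustc MIR types or the code in dfs.rs.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a MIR statement.
struct Stmt {
    /// Locals mentioned by this statement (toy encoding: plain indices).
    uses: Vec<usize>,
    /// Indices of successor statements in the control-flow graph.
    successors: Vec<usize>,
}

/// Depth-first search from the successors of `access`, restricted to points
/// inside `region`, for the first statement found that uses `local`.
fn find_later_use(
    cfg: &[Stmt],
    region: &HashSet<usize>,
    access: usize,
    local: usize,
) -> Option<usize> {
    let mut visited = HashSet::new();
    let mut stack: Vec<usize> = cfg[access].successors.clone();
    while let Some(point) = stack.pop() {
        // Skip points outside the region and points already seen.
        if !region.contains(&point) || !visited.insert(point) {
            continue;
        }
        if cfg[point].uses.contains(&local) {
            return Some(point);
        }
        stack.extend(cfg[point].successors.iter().copied());
    }
    None
}

/// Toy encoding of the example above: `x` is local 0, `p` is local 1.
fn example() -> Option<usize> {
    const X: usize = 0;
    const P: usize = 1;
    let cfg = vec![
        Stmt { uses: vec![X], successors: vec![1] },    // 0: let mut x = 22;
        Stmt { uses: vec![X, P], successors: vec![2] }, // 1: let p = &x;
        Stmt { uses: vec![X], successors: vec![3] },    // 2: x += 1;  (point A)
        Stmt { uses: vec![P], successors: vec![] },     // 3: read(p); (point B)
    ];
    // Assumed borrow region: from just after the borrow to the last use of `p`.
    let region: HashSet<usize> = vec![2, 3].into_iter().collect();
    find_later_use(&cfg, &region, 2, P)
}
```

In this toy encoding, `example()` searches from point A (statement 2) for a use of `p` and finds point B (statement 3). The sketch treats every mention of a local the same way; distinguishing regular uses from drops would need an extra field on `Stmt`.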
Just so we don't step on each other: I'm working on this and it's close to being finished. |
The NLL RFC goes to some lengths to discuss how to phrase NLL error messages. In particular, we want to adopt the so-called "three point" style:
This is going to require a bit of work. Right now, we easily have two of those points: the borrow checker has identified a borrow, and it knows where it occurred (that's the "borrow occurs here" point). It has also identified an access, and it knows where that occurs (that's the "write occurs here" point). What it does not know is the "borrow is later used here" point. That's a bit harder for us to find.
Right now, what we have readily available is just the region of the borrow. This indicates all the points in the control-flow graph where the borrow is still "live" (potentially in use), but it does not indicate why the borrow is potentially in use. It's the why (i.e., what use caused us to consider this point as part of the region) that we are interested in here.
Still, we have all the information we theoretically need. We have the region inference context, which contains the full set of constraints that we used to find the regions, and we have the MIR itself. (This is intentional: in the lexical checker, we only kept the final results of inference, and not the constraints that led to those results, which sometimes hobbled our ability to issue errors we wanted.)
However, it's still a bit of an open question how best to make use of those constraints to find the point in question. We could probably do some kind of graph-search heuristics that would lead us to the right point most of the time. I've also been considering another idea: we could re-run inference, but this time use different data structures. Rather than treating regions as simple sets of points, we would consider a region to be a set of (P, U) tuples. Here P is the liveness point, but U is some use of a variable (i.e., probably itself a pair (Pu, X) of a point of use and a variable name). The idea is that the region R contains the point P because of the use U. We'd have to extend liveness to track not only which variables are live, but what use U makes them live at that point.
The rest of inference is effectively unchanged, except that we track these pairs, so that whenever we add a point to a region, we also track the variable that made that point live (ultimately, every point in any region stems from some live use). This would be more expensive -- more data! -- but we're only doing it in the case of error. We could also do a more targeted form of inference, limiting ourselves to the regions that are in some way connected to the borrow region.
If we did that, then ultimately the borrow region would be inferred to contain some number of (P, U) tuples where P is the point of access and the use U is the third point we want to highlight.
What is appealing about this is that it is a very precise, very general analysis. What is unappealing is that it requires reproducing a lot of code. But we might be able to factor out most of it with generics so that it's not so much re-use.
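As a rough illustration of the (P, U) idea, here is a toy fixed-point propagation in which each region is a set of (point, use) pairs instead of a set of points. The names `Use`, `Outlives`, `infer`, and `uses_explaining` are hypothetical, and the outlives constraints are simplified to plain "sup contains everything in sub" (the real NLL constraints also carry a location), so treat this as a sketch under those assumptions rather than the actual inference code.

```rust
use std::collections::BTreeSet;

type Point = usize;
type RegionId = usize;

/// The use `U`: a point of use plus the variable used there (the `(Pu, X)` pair).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Use {
    point: Point,
    local: &'static str,
}

/// Simplified outlives constraint: `sup` must contain everything in `sub`.
struct Outlives {
    sup: RegionId,
    sub: RegionId,
}

/// Seeds each region from liveness facts `(region, P, U)`, then propagates
/// the `(P, U)` pairs along the outlives constraints until nothing changes.
fn infer(
    num_regions: usize,
    liveness: &[(RegionId, Point, Use)],
    outlives: &[Outlives],
) -> Vec<BTreeSet<(Point, Use)>> {
    let mut regions = vec![BTreeSet::new(); num_regions];
    for &(r, p, u) in liveness {
        regions[r].insert((p, u));
    }
    let mut changed = true;
    while changed {
        changed = false;
        for c in outlives {
            let sub: Vec<(Point, Use)> = regions[c.sub].iter().copied().collect();
            for elem in sub {
                if regions[c.sup].insert(elem) {
                    changed = true;
                }
            }
        }
    }
    regions
}

/// All uses that explain why `region` contains the point `access`.
fn uses_explaining(region: &BTreeSet<(Point, Use)>, access: Point) -> Vec<Use> {
    region
        .iter()
        .filter(|(p, _)| *p == access)
        .map(|&(_, u)| u)
        .collect()
}

/// Toy encoding: region 0 is the borrow region, region 1 is the region in the
/// type of `p`. `p` is live at points 2 and 3 because of its use at point 3.
fn example() -> Vec<Use> {
    let read_p = Use { point: 3, local: "p" };
    let liveness = [(1, 2, read_p), (1, 3, read_p)];
    let outlives = [Outlives { sup: 0, sub: 1 }];
    let regions = infer(2, &liveness, &outlives);
    uses_explaining(&regions[0], 2)
}
```

Here the borrow region ends up containing the access point 2 paired with the use of `p` at point 3, which is the "later used here" point. The cost is visible even in the toy: every region element carries its cause, which is why running this only on the error path seems attractive.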
If we don't do this, I'm not quite sure what else we would do. I can imagine some heuristic search -- basically making a graph that shows which regions are connected to what and then searching for a use of some variable X that contains a region P that is connected in some way to the borrow region -- but I haven't thought of anything else that yields the right results without essentially reproducing the analysis in some way. That heuristic search might or might not be good enough; I'd really prefer not to give wrong results to the user.
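For comparison, the heuristic alternative could look something like the sketch below: treat the region constraints as an undirected graph, collect every region reachable from the borrow region, and report uses of locals whose type mentions one of those regions. `LocalUse` and `connected_uses` are hypothetical names, and the edge list is an assumed simplification of the real constraint set.

```rust
use std::collections::{HashSet, VecDeque};

type RegionId = usize;

/// Hypothetical record: local `name` has `region` in its type and is used at
/// `use_point`.
struct LocalUse {
    name: &'static str,
    region: RegionId,
    use_point: usize,
}

/// Breadth-first search over the (undirected) region graph, then a filter
/// for uses whose region is connected "in some way" to the borrow region.
fn connected_uses<'a>(
    edges: &[(RegionId, RegionId)],
    borrow_region: RegionId,
    uses: &'a [LocalUse],
) -> Vec<&'a LocalUse> {
    let mut reached = HashSet::new();
    reached.insert(borrow_region);
    let mut queue = VecDeque::new();
    queue.push_back(borrow_region);
    while let Some(r) = queue.pop_front() {
        for &(a, b) in edges {
            let next = if a == r {
                b
            } else if b == r {
                a
            } else {
                continue;
            };
            if reached.insert(next) {
                queue.push_back(next);
            }
        }
    }
    uses.iter().filter(|u| reached.contains(&u.region)).collect()
}

/// Toy data: region 0 is the borrow region and is linked to region 1 (the
/// type of `p`); region 2 (the type of `q`) is unrelated.
fn example() -> Vec<(&'static str, usize)> {
    let uses = vec![
        LocalUse { name: "p", region: 1, use_point: 3 },
        LocalUse { name: "q", region: 2, use_point: 5 },
    ];
    connected_uses(&[(0, 1)], 0, &uses)
        .into_iter()
        .map(|u| (u.name, u.use_point))
        .collect()
}
```

Because connectivity ignores direction and ignores which points are involved, this can report uses that are not actually responsible for the borrow being live at the access, which is exactly the risk of wrong results mentioned above.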