Separate fn signature from fn body in HIR #35078

nikomatsakis · 2016-07-27T20:59:41Z

Currently, getting access to a HIR item gives you access to both the function signature and its body. This makes the incremental compilation graph more coarse than it should be. To fix this, the best thing would be to separate things out so that the item is just the external signature, and you have to separately load (e.g., from the HIR map) the body for a fn item. This way we could record that you actually accessed the fn body and track that separately.

eddyb · 2016-07-31T00:06:57Z

The body is a block and those are in the HIR map already, so I think the only missing thing is keeping an ID instead of a reference, and looking up that ID every time.
Not being able to use references without having implicit dependencies does seem like a bit of a limitation, which will impact the future design of HIR.

flodiebold · 2016-10-16T17:20:37Z

I'm currently looking into this, and I think I've got the general idea, but I've got a few questions to make sure I'm on the right track:

I've found three places where functions or methods are defined in the HIR: TraitItem_::MethodTraitItem, ImplItemKind::Method, and Item_::ItemFn. The plan would basically be to replace the P<Block> in each of these by a NodeId, right? (By the way, is there a reason ImplItemKind isn't named ImplItem_?)
We'll need some other place to store the top-level blocks. Would the Crate be right for this, like for items?
Or maybe removing the P<Block> from the item is the wrong approach. Could we instead just make it private, and make sure that accesses are recorded that way?
Should I wrap the NodeId of the block in a BlockId, similar to the ItemId?
How should visitors be handled? I could add a method visit_fn_block similar to visit_nested_item, though I'm not sure if it should by default visit the block or not.
Anything else I'm completely missing?

nikomatsakis · 2016-10-17T22:13:59Z

@flodiebold great! Sorry I haven't had a chance to respond yet. Let me start by giving a few quick answers, I only have a few minutes just now.

The plan would basically be to replace the P in each of these by a NodeId, right?

I would expect it to be a different kind of id (e.g., MethodBodyId or DefId), but a NodeId might also be fine.

By the way, is there a reason ImplItemKind isn't named ImplItem_?

The _ is just an older naming convention. I prefer Kind.

We'll need some other place to store the top-level blocks. Would the Crate be right for this, like for items?

Yes, I think so.

Could we instead just make it private, and make sure that accesses are recorded that way?

Well, we could, but we don't use privacy much for LIR; I would be inclined to move it into separate array myself.

Should I wrap the NodeId of the block in a BlockId, similar to the ItemId?

See above. I think it probably makes sense, just so that BodyId (or whatver) can be a small integer that directly indexes a vector.

How should visitors be handled? I could add a method visit_fn_block similar to visit_nested_item, though I'm not sure if it should by default visit the block or not.

Yeah, that's a tricky one. I'd say add the visitor and -- to start -- visit the block by default. That should preserve the existing behavior, and you can test and make sure everything is working. Then we can debate changing the default; it'd require going through the visitors a bit to see what they do.

eddyb · 2016-10-17T22:28:41Z

If I may make a suggestion, there is a more useful abstraction here than function bodies, that I keep needing time and time again, that is, definitions that have associated MIR (just "bodies"?).
This includes all function definitions, closure expressions, array lengths in types and enum discriminants.
It's a bit annoying to visit them, and you end up with ugliness like this.

nikomatsakis · 2016-10-18T19:47:14Z

@eddyb interesting. It seems like all the same benefits for incr comp would apply to most of those cases too. I wonder if the right type for these "bodies" would be expressions and not blocks, so that they can be used everywhere.

eddyb · 2016-10-20T16:31:28Z

So I got stuck in a place where I really really needed an unified way to refer to those bodies, so I tried moving all of them to hir::Expr. Which went well (yay deduplication) except for middle::region.

For some reason, if I have |...| expr instead of |...| {expr} (which is what the parser always did), there's a missing Misc CodeExtent (for expr.id?) so rustc_mir::build::scope ICEs sometimes.

It's not clear to me as to why this would be the case, since expr.id is marked as a terminating scope, and then resolve_expr calls visitor.new_node_extent_with_dtor(expr.id) which for terminating scopes it creates a DestructionScope - ooooh. I think I got it? The MIR builder is looking for Misc instead.

I might be able to fix this now, although I'm still not sure what the correct behavior should be, really.
For the record: at the time of the writing, the commit I'm talking about is eddyb@cc18ef0.

nikomatsakis · 2016-10-20T18:00:55Z

@eddyb I think I expect there to be both a Misc and a DestructionScope, one nested inside the other

eddyb · 2016-10-20T20:11:21Z

Update: it was these few lines which are both no longer necessary and require a Block for each function (but closures now don't have one so it kept going up reaching the outer function).

nikomatsakis · 2016-10-21T20:28:27Z

@flodiebold (so, I want to clarify something. I think that we should consider factoring out statics, constants, closure bodies, etc as a second step -- it makes sense to just focus on fn bodies first, we can refactor as a follow-up if we choose.)

eddyb · 2016-10-21T20:53:50Z

This is more complex than I expected, but I hope to have one or two commits soon that can be cherry-picked.

eddyb · 2016-10-23T03:18:35Z

@flodiebold Here you go: eddyb@203ccaf - should apply relatively cleanly on master.

I wanted to add a visit_body method to intravisit::Visitor but I have to do other things first.

flodiebold · 2016-10-23T10:20:14Z

Great! I'm taking a look.

flodiebold · 2016-10-23T19:26:43Z

@eddyb Ok, I've cherry-picked the commit without big conflicts. But I'm now getting two spurious 'unreachable statement' warnings in libsyntax when compiling stage 2, failing the build :( The lines the compiler says are unreachable are here and here. check-stage1-rpass works, and I haven't managed to make a small reproduction so far. So this is rather hard to debug for me, and I hope you have an idea... It seems the problem also occurs in your branch, so it's not just a merge problem.

eddyb · 2016-10-23T19:30:32Z

@flodiebold Oops, I never tried building stage2. The crater run I just started will probably catch it.

Yeah, those cases are almost unreachable, looks like match and if "diverges" checks are a bit wrong.
EDIT: to be clear, I'm working on it, hoping to have something in the next hour.

eddyb · 2016-10-23T20:55:55Z

@flodiebold Alright, that was only a leak from "then" to "else". Fixed commit is eddyb@75269d2.

EDIT: still broken (this time in rustc_borrowck), investigating... (although you shouldn't be affected)
EDIT2: Another problem in rustc_driver (this time a correct warning, yay). Fix is in eddyb@d23057e.

petrochenkov · 2016-10-24T10:43:59Z

Can patterns from fn parameters be moved into the function body somehow? (Probably in HIR?)
They are implementation details and not parts of a function interface and dependent code shouldn't recompile when they change.
Something like:

fn f((a, b): (u8, u8)) {
    println!("{} {}", a, b);
}

=>

fn f(__arg0: (u8, u8)) {
    let (a, b) = __arg0;
    println!("{} {}", a, b);
}

eddyb · 2016-10-24T10:50:35Z

@petrochenkov Yeah, the argument types and the patterns can be split into two lists.
In fact, I needed this! Since the types are part of the signature and thus "outside" the body, while the patterns are "inside" (e.g. per-item type tables include their types, they're inferred as part of the body, etc.).

eddyb · 2016-10-26T00:19:05Z

@flodiebold @nikomatsakis Good news! You can just rebase on top of #37412 now.

@eddyb

Prohibit patterns in trait methods without bodies They are not properly type checked ```rust trait Tr { fn f(&a: u8); // <- This compiles } ``` , mostly rejected by the parser already and generally don't make much sense. This PR is kind of a missing part of rust-lang#35015. Needs crater run. cc rust-lang#35078 (comment) rust-lang#35015 rust-lang/rfcs#1685 rust-lang#35203 r? @eddyb

@eddyb

Prohibit patterns in trait methods without bodies They are not properly type checked ```rust trait Tr { fn f(&a: u8); // <- This compiles } ``` , mostly rejected by the parser already and generally don't make much sense. This PR is kind of a missing part of #35015. Given the [statistics from crater](#37378 (comment)), the effect of this PR is mostly equivalent to improving `unused_mut` lint. cc #35078 (comment) #35015 rust-lang/rfcs#1685 #35203 r? @eddyb

flodiebold · 2016-10-29T19:04:44Z

Ok, I'm making some progress. Two snags I ran into:

I couldn't make visitors visit bodies by default, because in the visitor trait I don't have access to the crate or HIR map. So I left out the default completely, meaning I had to add a function to every single Visitor impl. Maybe I could make skipping the block the default once I've got it working.
Since closures are often treated the same way as functions (e.g. with visit_fn, or FnLike), I had to separate their body as well, which has caused some additional complications (I think there are quite a few visitors which wouldn't need to go into nested items, but do need to go into nested closures). I'm not sure that was the best idea, but I couldn't think of a better solution.

Currently, my main problem is the following: Compilation of librand fails because I think the compiler is trying to const-eval a function from libcore, for which it needs the HIR of the body, and that's not saved in the crate metadata anymore. So I'll probably need to add the function bodies somewhere in the crate metadata, but I'm not yet sure how to approach that.

My current state is here.

eddyb · 2016-10-29T19:15:41Z

@flodiebold The only thing that's needed is the body, you might be able to change astencode to only work on bodies.

nikomatsakis · 2016-11-01T19:43:30Z

@flodiebold

Ok, I'm making some progress.

Great to hear! In the meantime, I've been working on #36349, and I'm making good progress there I think. Some of the things you are describing overlap.

I couldn't make visitors visit bodies by default, because in the visitor trait I don't have access to the crate or HIR map.

Yeah, this is sort of annoying. =) In principle we should be able to use specialization here to make this a touch more ergonomic, though some key features are missing.

So I did something to these traits too =). What I did is to separate out items and impl-items into the category of "Item-like" (I should probably factor our trait-items too). Then the visit_all_items method on Crate becomes visit_all_item_likes. It no longer takes a normal intravisit::Visitor but instead takes an ItemLikeVisitor, which has just two methods (visit_item, visit_impl_item). The intention is to group visitors into three categories:

Shallow visitors: these just want to look at each item (and sometimes impl-items).
- They should implement the ItemLikeVisitor trait directly. This trait has no defaults, so as we add new methods in the future, we can revisit those cases and see if changes are needed.
Deep visitors: these want to walk over the entire tree, but they are not particularly sensitive to the order that they do it in.
- They can implement intravisit::Visitor as today and invoke as_deep_visitor(), which is a new method that converts a Visitor into an ItemLikeVisitor.
- the important point is that in the future we may break new things into "item-like", and that will change the visit order for these deep-nested walkers, but that should not be a problem.
Nested visitors: these want to walk the tree, and they want to walk items in lexical order (e.g., they want to walk nested items in the context of their parents, impl items in the context of the impl etc).
- These should use intravisit::Visitor as today, but should not use visit_all_item_likes; instead they should use visit::walk_krate (basically as today).
- One annoying thing is that, as you said, there is no "simple" way to implement all the "visit nested" routines.

To address this very last bullet point, and this seems relevant to what you were writing, I was contemplating having a fn hir_map(&self) -> &'v HirMap sort of thing in the trait. I originally thought it would return an Option and default to None, and then you would override if you wanted to enable visiting nested things. But now I am thinking maybe it should be orthogonal, so that the default for visiting nested function bodies can be different (if we want it to be).

Then again, it'd be better to avoid walking nested things like that by default, just to make the code more intentional.

My preference in this refactoring, btw, would be to have the nested case implement some other trait (NestedVisitor) that is required to provide a HirMap but otherwise makes nesting invisible. Or something.

michaelwoerister · 2016-11-01T21:55:56Z

cc me

flodiebold · 2016-11-06T17:15:54Z

Still working on this. I've got something which passes check-stage1 and the incremental tests (with the FIXMEs in change_private_impl_method fixed 😄 ). There's still quite a few things to clean up and refactor now before it's ready though, in particular the visitors (I think the NestedVisitor thing is a good idea) and the handling of inlined items (I solved that in a very hacky way).

nikomatsakis · 2016-11-08T20:36:40Z

@flodiebold nice! I hope to have my PR up today -- but I'm cleaning up the last bit, which is taking longer than I wanted. If you want to check out my current branch, it's here

@Manishearth

[6/n] rustc: transition HIR function bodies from Block to Expr. _This is part of a series ([prev](rust-lang#37408) | [next](rust-lang#37676)) of patches designed to rework rustc into an out-of-order on-demand pipeline model for both better feature support (e.g. [MIR-based](https://github.com/solson/miri) early constant evaluation) and incremental execution of compiler passes (e.g. type-checking), with beneficial consequences to IDE support as well. If any motivation is unclear, please ask for additional PR description clarifications or code comments._ <hr> The main change here is that functions and closures both use `Expr` instead of `Block` for their bodies. For closures this actually allows a honest representation of brace-less closure bodies, e.g. `|x| x + 1` is now distinguishable from `|x| { x + 1 }`, therefore this PR is `[syntax-breaking]` (cc @Manishearth). Using `Expr` allows more logic to be shared between constant bodies and function bodies, with some small such changes already part of this PR, and eventually easing rust-lang#35078 and per-body type tables. Incidentally, there used to be some corners cut here and there and as such I had to (re)write divergence tracking for type-checking so that it is capable of understanding basic structured control-flow: ``` rust fn a(x: bool) -> i32 { // match also works (as long as all arms diverge) if x { panic!("true") } else { return 1; } 0 // "unreachable expression" after this PR } ``` And since liveness' "not all control paths return a value" moved to type-checking we can have nice things: ``` rust // before & after: fn b() -> i32 { 0; } // help: consider removing this semicolon // only after this PR fn c() -> i32 { { 0; } } // help: consider removing this semicolon fn d() { let x: i32 = { 0; }; } // help: consider removing this semicolon fn e() { f({ 0; }); } // help: consider removing this semicolon ```

nikomatsakis · 2016-11-17T17:43:32Z

@flodiebold hey, how's it going? I think this is one of the major blockers.

flodiebold · 2016-11-17T18:14:45Z

I've had very little time over the last week, but this weekend I have time and hope to make good progress, maybe even get a pull request ready :) I assume your changes will also be merged until then, so I can rebase on top of that.

nikomatsakis · 2016-11-17T18:17:27Z

@flodiebold yes, hopefully! I'm tracking down an annoying windows problem, but now that I know what it is (paths too long) we can hopefully resolve it and get the PR landed.

nikomatsakis · 2016-11-17T18:17:35Z

ok, in that case I'll leave off.

@flodiebold

Separate impl items from the parent impl This change separates impl item bodies out of the impl itself. This gives incremental more resolution. In so doing, it refactors how the visitors work, and cleans up a bit of the collect/check logic (mostly by moving things out of collect that didn't really belong there, because they were just checking conditions). However, this is not as effective as I expected, for a kind of frustrating reason. In particular, when invoking `foo.bar()` you still wind up with dependencies on private items. The problem is that the method resolution code scans that list for methods with the name `bar` -- and this winds up touching *all* the methods, even private ones. I can imagine two obvious ways to fix this: - separating fn bodies from fn sigs (#35078, currently being pursued by @flodiebold) - a more aggressive model of incremental that @michaelwoerister has been advocating, in which we hash the intermediate results (e.g., the outputs of collect) so that we can see that the intermediate result hasn't changed, even if a particular impl item has changed. So all in all I'm not quite sure whether to land this or not. =) It still seems like it has to be a win in some cases, but not with the test cases we have just now. I can try to gin up some test cases, but I'm not sure if they will be totally realistic. On the other hand, some of the early refactorings to the visitor trait seem worthwhile to me regardless. cc #36349 -- well, this is basically a fix for that issue, I guess r? @michaelwoerister NB: Based atop of @eddyb's PR #37402; don't land until that lands.

flodiebold · 2016-11-19T13:22:45Z

@nikomatsakis So, I'm rebasing and integrating my visitor changes with yours. Obviously, I also want to use nested_visit_map; but there are a lot of visitors which want to visit bodies, but not nested items, so just implementing nested_visit_map would change their behavior. The solution I'm trying out currently is changing nested_visit_map to return an Option<(&Map, NestedVisitMode)>, where NestedVisitMode can currently be either OnlyBodies or All. I think that will make it easy to extend if there are more nested things later. What do you think?

nikomatsakis · 2016-11-21T17:35:18Z

@flodiebold so I think it is plausible that visitors should visit the item body by default. At least i'd be fine with landing a change that did this by default. After all, if the visitor does not try to recurse, it won't register as a read, and I think it will be the common case -- most checkers want to check the signature+body together.

nikomatsakis · 2016-11-21T17:36:57Z

@flodiebold well, in any case, I'd be happy to land a PR where existing behavior is preserved and then we can tinker later, if we see a need.

flodiebold · 2016-11-21T18:15:35Z

Ok, I guess I'll just make the PR.

@nikomatsakis

Separate function bodies from their signatures in HIR Also give them their own dep map node. I'm still unhappy with the handling of inlined items (1452edc1), but maybe you have a suggestion how to improve it. Fixes #35078. r? @nikomatsakis

nikomatsakis added the A-incr-comp Area: Incremental compilation label Jul 27, 2016

nikomatsakis mentioned this issue Oct 12, 2016

[goal] modifying private method in impl should not require users of public methods to recompile #37121

Closed

petrochenkov mentioned this issue Oct 24, 2016

Prohibit patterns in trait methods without bodies #37378

Merged

eddyb mentioned this issue Oct 26, 2016

[6/n] rustc: transition HIR function bodies from Block to Expr. #37412

Merged

nikomatsakis mentioned this issue Nov 9, 2016

Separate impl items from the parent impl #37660

Merged

nikomatsakis added this to the Incremental compilation beta milestone Nov 17, 2016

flodiebold mentioned this issue Nov 21, 2016

Separate function bodies from their signatures in HIR #37918

Merged

bors closed this as completed in #37918 Nov 29, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Separate fn signature from fn body in HIR #35078

Separate fn signature from fn body in HIR #35078

nikomatsakis commented Jul 27, 2016 •

edited by michaelwoerister

Loading

eddyb commented Jul 31, 2016 •

edited

Loading

flodiebold commented Oct 16, 2016

nikomatsakis commented Oct 17, 2016

eddyb commented Oct 17, 2016

nikomatsakis commented Oct 18, 2016

eddyb commented Oct 20, 2016

nikomatsakis commented Oct 20, 2016

eddyb commented Oct 20, 2016

nikomatsakis commented Oct 21, 2016

eddyb commented Oct 21, 2016

eddyb commented Oct 23, 2016

flodiebold commented Oct 23, 2016

flodiebold commented Oct 23, 2016

eddyb commented Oct 23, 2016 •

edited

Loading

eddyb commented Oct 23, 2016 •

edited

Loading

petrochenkov commented Oct 24, 2016

eddyb commented Oct 24, 2016

eddyb commented Oct 26, 2016

flodiebold commented Oct 29, 2016

eddyb commented Oct 29, 2016

nikomatsakis commented Nov 1, 2016

michaelwoerister commented Nov 1, 2016

flodiebold commented Nov 6, 2016

nikomatsakis commented Nov 8, 2016

nikomatsakis commented Nov 17, 2016

flodiebold commented Nov 17, 2016

nikomatsakis commented Nov 17, 2016

nikomatsakis commented Nov 17, 2016

flodiebold commented Nov 19, 2016 •

edited

Loading

nikomatsakis commented Nov 21, 2016

nikomatsakis commented Nov 21, 2016

flodiebold commented Nov 21, 2016

Separate fn signature from fn body in HIR #35078

Separate fn signature from fn body in HIR #35078

Comments

nikomatsakis commented Jul 27, 2016 • edited by michaelwoerister Loading

eddyb commented Jul 31, 2016 • edited Loading

flodiebold commented Oct 16, 2016

nikomatsakis commented Oct 17, 2016

eddyb commented Oct 17, 2016

nikomatsakis commented Oct 18, 2016

eddyb commented Oct 20, 2016

nikomatsakis commented Oct 20, 2016

eddyb commented Oct 20, 2016

nikomatsakis commented Oct 21, 2016

eddyb commented Oct 21, 2016

eddyb commented Oct 23, 2016

flodiebold commented Oct 23, 2016

flodiebold commented Oct 23, 2016

eddyb commented Oct 23, 2016 • edited Loading

eddyb commented Oct 23, 2016 • edited Loading

petrochenkov commented Oct 24, 2016

eddyb commented Oct 24, 2016

eddyb commented Oct 26, 2016

flodiebold commented Oct 29, 2016

eddyb commented Oct 29, 2016

nikomatsakis commented Nov 1, 2016

michaelwoerister commented Nov 1, 2016

flodiebold commented Nov 6, 2016

nikomatsakis commented Nov 8, 2016

nikomatsakis commented Nov 17, 2016

flodiebold commented Nov 17, 2016

nikomatsakis commented Nov 17, 2016

nikomatsakis commented Nov 17, 2016

flodiebold commented Nov 19, 2016 • edited Loading

nikomatsakis commented Nov 21, 2016

nikomatsakis commented Nov 21, 2016

flodiebold commented Nov 21, 2016

nikomatsakis commented Jul 27, 2016 •

edited by michaelwoerister

Loading

eddyb commented Jul 31, 2016 •

edited

Loading

eddyb commented Oct 23, 2016 •

edited

Loading

eddyb commented Oct 23, 2016 •

edited

Loading

flodiebold commented Nov 19, 2016 •

edited

Loading