Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid redundant translation of items during monomorphization #16059

Merged
merged 1 commit into from
Jul 31, 2014
Merged

avoid redundant translation of items during monomorphization #16059

merged 1 commit into from
Jul 31, 2014

Conversation

spernsteiner
Copy link
Contributor

Currently, each time a function is monomorphized, all items within that function are translated. This is unnecessary work because the inner items already get translated when the function declaration is visited by trans_item. This patch adds a flag to the FunctionContext to prevent translation of items during monomorphization.

@brson
Copy link
Contributor

brson commented Jul 28, 2014

Neat find! Are you aware of any instances where this was causing problems?

@spernsteiner
Copy link
Contributor Author

It doesn't cause any visible errors, but it was making things difficult for my separate compilation branch. I thought that trans_item would be called only once on each item and relied on that property in order to get symbol linkage right, then it turned out not to be true due to this bug.

It also has the strange (but not particularly harmful) side effect of duplicating the bodies of inner functions on each monomorphization. So if f<T>() contains g(), and f is monomorphized three times, then the IR for g will contain three additional copies of each basic block (all dead, because only one of the entry_block copies can be the actual entry block of the function). These extra copies get deleted by the optimizer (even on -O0), so at worst it might increase memory usage a bit.

@huonw
Copy link
Member

huonw commented Jul 29, 2014

I wonder if this is noticable on a memory graph of the compiler, because we certainly use a lot of Vecs and HashMaps there? (cc @cmr)

@dotdash
Copy link
Contributor

dotdash commented Jul 29, 2014

Does this fix #7349?

@spernsteiner
Copy link
Contributor Author

I wonder if this is noticable on a memory graph of the compiler, because we certainly use a lot of Vecs and HashMaps there?

To be clear, I don't think this has much effect on the run-time performance of the compiled code. The compiler wastes some time and memory generating duplicate copies of various things, but the copies all get thrown away at some point. (Except maybe on Windows, where we don't use -ffunction-sections/-fdata-sections for some reason.)

Does this fix #7349?

Yeah, that looks like the same issue. With this change, the test case from your last comment there generates only one copy of the function body.

@@ -1235,7 +1235,8 @@ pub fn new_fn_ctxt<'a>(ccx: &'a CrateContext,
output_type: ty::t,
param_substs: &'a param_substs,
sp: Option<Span>,
block_arena: &'a TypedArena<Block<'a>>)
block_arena: &'a TypedArena<Block<'a>>,
ignore_items: bool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use an enum here than a bool

@nrc
Copy link
Member

nrc commented Jul 30, 2014

Is it possible to add a test to make sure this doesn't regress in the future?

@spernsteiner
Copy link
Contributor Author

I changed the boolean ignore_items flag to an enum. I also added a regression test for #7349, which this PR fixes.

fn main() {
outer::<int>();
outer::<uint>();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably need a license to get past make tidy

@alexcrichton
Copy link
Member

r=me, just some small things here and there.

@spernsteiner
Copy link
Contributor Author

OK, I've fixed the issues with the test.

bors added a commit that referenced this pull request Jul 30, 2014
Currently, each time a function is monomorphized, all items within that function are translated.  This is unnecessary work because the inner items already get translated when the function declaration is visited by `trans_item`.  This patch adds a flag to the `FunctionContext` to prevent translation of items during monomorphization.
@bors bors closed this Jul 31, 2014
@bors bors merged commit f97f65f into rust-lang:master Jul 31, 2014
@vadimcn
Copy link
Contributor

vadimcn commented Jul 31, 2014

@epdtry, you've mentioned "separate compilation" above. Is this about compiling each source file into a separate object file, or about something else?

@spernsteiner
Copy link
Contributor Author

@vadimcn, I'm working on allowing rustc to split a crate into several compilation units, so that it can run LLVM optimization and codegen passes on several units in parallel. The goal is to speed up compilation at the cost of a small performance hit for the generated code. It's intended for development builds, where edit-compile-test speed is particularly important.

@vadimcn
Copy link
Contributor

vadimcn commented Aug 1, 2014

@epdtry, Cool! I'd love to see that happen!
I am curious - how were you going to handle monomorphized functions? Emit them as comdats?
Would you consider writing an RFC? I imagine this would be a popular feature among Rust contributors!

@spernsteiner
Copy link
Contributor Author

@vadimcn,

I am curious - how were you going to handle monomorphized functions? Emit them as comdats?

I initially had it using LLVM's linkonce_odr linkage type for monomorphizations, which I believe uses comdats internally. It didn't work out very well - it hurt compile times somewhat (LLVM has to optimize the same code in several compilation units) and link times tremendously (ld doesn't seem to appreciate being given a file with 20,000 comdats).

Nowadays, the first monomorphization (for a particular function + actual type parameters) gets defined with external linkage, and subsequent monomorphizations in other compilation units simply produce external declarations (no code, just an undefined symbol). The exception is for #[inline] functions, where the subsequent monomorphizations produce a copy of the function code with available_externally linkage. This lets LLVM do inlining using the provided definition, but produces no code in the object file (only an undefined symbol, like for external declarations). (Other #[inline] items are handled similarly - they produce an external definition in the defining compilation unit, and available_externally definitions in others.)

Would you consider writing an RFC?

I don't think it would be worth it at this point, since the project is nearly done. (The code review(s) should be up some time next week.) And anyway, I'm not sure an RFC would be useful in this case, since the only user-visible change is a pair of new codegen options in rustc.

@huonw
Copy link
Member

huonw commented Aug 2, 2014

Nowadays, the first monomorphization (for a particular function + actual type parameters) gets defined with external linkage, and subsequent monomorphizations in other compilation units simply produce external declarations (no code, just an undefined symbol).

Isn't this a backwards-compatibility concern for dynamic linkage, e.g. it is not opt-in and a crate removes its use of a certain monomorphisation that downstream crates were using? (I know we don't have a good story about dynamic linkage anyway.)

Also, #14527 seems related?

bors added a commit that referenced this pull request Aug 9, 2014
…chton

Extend the changes from #16059 to the new generic foreign functions introduced by #15831.
bors added a commit that referenced this pull request Aug 13, 2014
This code produces an ICE:

```rust
#![crate_type = "rlib"]
fn main() {
    if true { return }
    // remaining code is unreachable
    match () {
        () => { static MAGIC: uint = 0; }
    }
}
```
([playpen](http://is.gd/iwOISB))

The error is "encode_symbol: id not found 18", where 18 is the `NodeId` of the declaration of `MAGIC`.  The problem is that `rustc` tries to emit metadata for `MAGIC`, but some of the information is missing because `MAGIC` never gets translated by `trans_item` - the entire body of the `match` gets skipped because the `match` itself is unreachable.

This branch simplifies the handling of inner items by always processing them using the `trans_item` visitor, instead of sometimes using the visitor and sometimes waiting until `trans_stmt` encounters the item.  This fixes the ICE by making the translation of the item no longer depend on the declaration being reachable code.  This branch also reverts #16059 and #16359, since the new change to item translation fixes the same problems as those but is simpler.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants