Only break critical edges where actually needed #33544

dotdash · 2016-05-10T19:15:07Z

Currently, to prepare for MIR trans, we break all critical edges,
although we only actually need to do this for edges originating from a
call that gets translated to an invoke instruction in LLVM.
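
To make the distinction concrete, here is a minimal sketch on a toy CFG. It is not the rustc pass: `Block`, `is_invoke`, and `edges_to_break` are hypothetical names, and it assumes the textbook definition of a critical edge (the source block has more than one successor and the target block has more than one predecessor).

```rust
/// Toy basic block: successor indices plus a flag saying whether the
/// terminator is a call that would become an LLVM `invoke`.
/// (Hypothetical type for illustration, not rustc's MIR.)
struct Block {
    successors: Vec<usize>,
    is_invoke: bool,
}

/// Counts how many edges enter each block.
fn predecessor_counts(blocks: &[Block]) -> Vec<usize> {
    let mut counts = vec![0; blocks.len()];
    for block in blocks {
        for &succ in &block.successors {
            counts[succ] += 1;
        }
    }
    counts
}

/// Lists the (source, target) edges that would be split.
/// With `only_invokes == false`, every critical edge is returned;
/// with `only_invokes == true`, only those leaving an invoke-like block.
fn edges_to_break(blocks: &[Block], only_invokes: bool) -> Vec<(usize, usize)> {
    let preds = predecessor_counts(blocks);
    let mut edges = Vec::new();
    for (index, block) in blocks.iter().enumerate() {
        if block.successors.len() <= 1 {
            continue; // a single outgoing edge is never critical
        }
        if only_invokes && !block.is_invoke {
            continue; // plain switches and branches are left alone
        }
        for &succ in &block.successors {
            if preds[succ] > 1 {
                edges.push((index, succ));
            }
        }
    }
    edges
}

/// Small example: block 0 is a switch, block 1 an invoke-like call
/// whose cleanup target is block 2, block 3 is the shared join block.
fn example_cfg() -> Vec<Block> {
    vec![
        Block { successors: vec![1, 3], is_invoke: false },
        Block { successors: vec![3, 2], is_invoke: true },
        Block { successors: vec![], is_invoke: false },
        Block { successors: vec![], is_invoke: false },
    ]
}
```

On `example_cfg()`, breaking all critical edges touches both the switch edge and the call edge, while the invoke-only policy touches just the call edge.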

This has the unfortunate effect of undoing a bunch of the things that
SimplifyCfg has done. A particularly bad case arises when you have a
C-like enum with N variants and a derived PartialEq implementation.

In that case, the match on the (&lhs, &rhs) tuple gets translated into
nested matches with N arms each and a basic block each, resulting in N²
basic blocks. SimplifyCfg reduces that to roughly 2*N basic blocks, but
breaking the critical edges means that we go back to N².
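
As an illustration of the shape being described, the sketch below writes such a comparison out by hand for a three-variant enum. The enum `Status` and the helper `discriminant_eq` are made up for this example, and the hand-written `eq` only approximates what the derive of that era produced; the point is that a match on the `(&lhs, &rhs)` pair does the same work as a single discriminant comparison.

```rust
/// Hypothetical C-like enum standing in for the large one in the report.
#[derive(Clone, Copy, Debug)]
enum Status {
    Ok,
    NotFound,
    ServerError,
}

/// Hand-written stand-in for a derived implementation: a match on the
/// pair of references, with one interesting arm per variant.
impl PartialEq for Status {
    fn eq(&self, other: &Status) -> bool {
        match (self, other) {
            (&Status::Ok, &Status::Ok) => true,
            (&Status::NotFound, &Status::NotFound) => true,
            (&Status::ServerError, &Status::ServerError) => true,
            _ => false,
        }
    }
}

/// What the optimizer is expected to reduce the match to in the end:
/// a plain comparison of the two discriminants.
fn discriminant_eq(a: Status, b: Status) -> bool {
    a as u32 == b as u32
}
```

With N variants, the nested form has N outer arms that each dispatch over N inner arms, which is where the quadratic block count comes from.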

In nickel.rs, there is such an enum with roughly N=800. So we get about
640K basic blocks or 2.5M lines of LLVM IR. LLVM takes a while to
reduce that to the final "disr_a == disr_b".

So before this patch, we had 2.5M lines of IR with 640K basic blocks,
which took about 3.6s in LLVM to get optimized and translated.
After this patch, we get about 650K lines with about 1.6K basic blocks
and spent a little less than 0.2s in LLVM.
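
The block counts quoted above follow directly from N=800. A quick check, using two hypothetical helper functions that simply encode the N² and 2*N estimates from the description:

```rust
/// Estimated basic blocks for the nested match: N arms times N arms.
fn blocks_quadratic(n: u64) -> u64 {
    n * n
}

/// Estimated basic blocks after SimplifyCfg: roughly 2*N.
fn blocks_simplified(n: u64) -> u64 {
    2 * n
}
```

For N=800 this gives 640,000 and 1,600 blocks, matching the "640K" and "1.6K" figures, and the reported LLVM times (about 3.6s versus a little less than 0.2s) work out to a speedup of roughly 18x or slightly more.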

cc #33111

r? @Aatch

Aatch · 2016-05-10T19:18:08Z

src/librustc_mir/transform/break_cleanup_edges.rs

+                let term_span = term.span;
+                let term_scope = term.scope;
+                let succs = term.successors_mut();
+                // if succs.len() > 1 || (succs.len() > 0 && is_invoke) {


Why not just delete this line?

Because I have feelings for this line... Nah, actually I'm just bad at double checking my stuff.

arielb1 · 2016-05-10T19:49:01Z

src/librustc_mir/transform/break_cleanup_edges.rs

+// Returns true if the terminator is a call that would use an invoke in LLVM.
+fn term_is_invoke(term: &Terminator) -> bool {
+    match term.kind {
+        TerminatorKind::Call { cleanup: Some(_), .. } |


why does this always return false?
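
For readers following the review, here is a self-contained sketch of how a predicate like this could behave once it returns `true` for unwinding calls. The types are toy stand-ins, not rustc's MIR definitions, and the `Drop` arm is an assumption: the diff excerpt above is cut off after the `Call` pattern, so the remaining arms are not known from this thread.

```rust
/// Toy basic-block handle (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
struct BasicBlock(usize);

/// Toy terminator kinds, loosely modelled on the names in the diff.
enum TerminatorKind {
    Goto { target: BasicBlock },
    Call { destination: Option<BasicBlock>, cleanup: Option<BasicBlock> },
    Drop { target: BasicBlock, unwind: Option<BasicBlock> },
    Return,
}

struct Terminator {
    kind: TerminatorKind,
}

/// Returns true if the terminator has an unwind path, i.e. it is the
/// kind of terminator that would need an `invoke` rather than a `call`.
fn term_is_invoke(term: &Terminator) -> bool {
    match term.kind {
        // A call with a cleanup block can unwind into that block.
        TerminatorKind::Call { cleanup: Some(_), .. } => true,
        // Assumption: a drop with an unwind target is treated the same way.
        TerminatorKind::Drop { unwind: Some(_), .. } => true,
        // Everything else has no unwind edge.
        _ => false,
    }
}
```

A predicate whose matching arms fall through to `false` would make the pass break no edges at all, which is presumably what the question above was pointing at.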

Aatch · 2016-05-10T20:35:33Z

r=me if travis passes.

dotdash · 2016-05-11T00:41:40Z

@bors r=Aatch

bors · 2016-05-11T00:41:48Z

📌 Commit 6e04944 has been approved by Aatch

bors · 2016-05-11T10:05:41Z

☔ The latest upstream changes (presumably #33425) made this pull request unmergeable. Please resolve the merge conflicts.

dotdash · 2016-05-11T16:41:08Z

@bors r=Aatch

bors · 2016-05-11T16:41:09Z

📌 Commit 00f6513 has been approved by Aatch

Currently, all switches in MIR are exhaustive, meaning that we can have a lot of arms that all go to the same basic block. The extreme case is an if-let expression, which results in just 2 possible cases but might end up with hundreds of arms for large enums. To improve this situation and give LLVM less code to chew on, we can detect whether there's a predominant target basic block in a switch and then promote this to be the default target, not translating the corresponding arms at all. In combination with rust-lang#33544 this makes unoptimized MIR trans of nickel.rs as fast as using old trans and greatly improves the times for optimized builds, which are only 30-40% slower instead of ~300%. cc rust-lang#33111
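
The default-target idea in that description can be sketched as follows. This is not the code from rust-lang#33566: the function names are hypothetical, targets are plain indices, and the "more than half of the arms" threshold is an assumption chosen to keep the example unambiguous.

```rust
use std::collections::HashMap;

/// Returns the target block that more than half of the switch arms jump
/// to, if there is one. `targets[i]` is the target of variant `i`.
fn predominant_target(targets: &[usize]) -> Option<usize> {
    let mut counts: HashMap<usize, usize> = HashMap::new();
    for &target in targets {
        *counts.entry(target).or_insert(0) += 1;
    }
    // At most one target can hold a strict majority, so the result does
    // not depend on the map's iteration order.
    counts
        .into_iter()
        .filter(|&(_, count)| count * 2 > targets.len())
        .map(|(target, _)| target)
        .next()
}

/// Splits a switch into explicit `(variant, target)` arms plus an
/// optional default target that absorbs all arms pointing at it.
fn lower_switch(targets: &[usize]) -> (Vec<(usize, usize)>, Option<usize>) {
    let default = predominant_target(targets);
    let arms = targets
        .iter()
        .cloned()
        .enumerate()
        .filter(|&(_, target)| Some(target) != default)
        .collect();
    (arms, default)
}
```

For an if-let over a large enum, nearly every variant shares the "no match" block, so only the one matching variant would remain as an explicit arm.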

Rollup of 23 pull requests - Successful merges: #33282, #33342, #33393, #33450, #33513, #33517, #33531, #33532, #33533, #33534, #33538, #33541, #33544, #33552, #33555, #33560, #33563, #33565, #33566, #33572, #33580, #33590, #33596 - Failed merges: #33578

bors · 2016-05-14T09:57:40Z

⌛ Testing commit 00f6513 with merge ab08af1...

bors · 2016-05-14T11:24:33Z

💔 Test failed - auto-mac-64-opt-rustbuild

Rollup of 9 pull requests - Successful merges: #33544, #33552, #33554, #33555, #33560, #33566, #33572, #33574, #33576 - Failed merges:

rust-highfive assigned Aatch May 10, 2016

Aatch reviewed May 10, 2016

dotdash force-pushed the baby_dont_break_me_no_more branch from b16904b to 3a475e4 Compare May 10, 2016 19:41

arielb1 reviewed May 10, 2016

dotdash force-pushed the baby_dont_break_me_no_more branch 2 times, most recently from 2cd4ca9 to 6e04944 Compare May 10, 2016 20:34

dotdash force-pushed the baby_dont_break_me_no_more branch from 6e04944 to 00f6513 Compare May 11, 2016 16:40

dotdash mentioned this pull request May 11, 2016

[MIR trans] Optimize trans for biased switches #33566

Merged

Manishearth mentioned this pull request May 12, 2016

Rollup of 11 pull requests #33582

Closed

eddyb mentioned this pull request May 12, 2016

Rollup of 15 pull requests #33589

Closed

eddyb mentioned this pull request May 12, 2016

Rollup of 21 pull requests #33595

Closed

eddyb mentioned this pull request May 12, 2016

Rollup of 23 pull requests #33597

Closed

eddyb mentioned this pull request May 13, 2016

Rollup of 29 pull requests #33610

Closed

Manishearth mentioned this pull request May 14, 2016

Rollup of 9 pull requests #33632

Merged

bors added a commit that referenced this pull request May 14, 2016

Auto merge of #33632 - Manishearth:rollup, r=Manishearth

6ba8a1a

bors merged commit 00f6513 into rust-lang:master May 14, 2016

dotdash deleted the baby_dont_break_me_no_more branch May 17, 2016 03:43
