gvn: Promote/propagate const local array #126444

tesuji · 2024-06-13T22:01:58Z

A rewrite of #125916, which used the `PromoteTemps` pass.

This allows promoting constant local arrays to anonymous constants, so that in codegen for
a local array, rustc emits `llvm.memcpy` (which is easy for LLVM to optimize) instead of a series
of `store`s to the stack (a.k.a. in-place initialization). This puts rustc on par with clang in this specific case.
See #73825 or Zulip for more info.
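As a rough illustration of the kind of code this targets, here is a minimal sketch. The function names and table contents are made up for the example, and whether a given rustc version emits a `memcpy` or a series of stores for the first function depends on the compiler version and optimization level. The second function shows the hand-written equivalent of what promotion would do.

```rust
/// A local array whose elements are all constants. Without promotion,
/// codegen may initialize it on the stack element by element.
pub fn lookup_local(i: usize) -> u32 {
    let table: [u32; 8] = [2, 3, 5, 7, 11, 13, 17, 19];
    table[i % table.len()]
}

/// Hand-promoted equivalent: the array is a constant, so it can be
/// placed in read-only data and read (or copied) from there.
pub fn lookup_const(i: usize) -> u32 {
    const TABLE: [u32; 8] = [2, 3, 5, 7, 11, 13, 17, 19];
    TABLE[i % TABLE.len()]
}
```

Both functions return the same value for every index; only the way the array is materialized may differ.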

Here is a simple microbenchmark that shows the performance difference between promoting arrays and not promoting them.

Prior discussions on zulip.

This patch shrinks `librustc_driver.so` by about 50.36 KiB (0.038%).

Fix #73825

rustbot · 2024-06-13T22:02:01Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

compiler/rustc_const_eval/src/interpret/place.rs

compiler/rustc_mir_transform/src/gvn.rs

jieyouxu · 2024-06-13T22:52:18Z

@bors try @rust-timer queue

[WIP] gvn: Promote/propagate const local array

Rewriting of rust-lang#125916 which used PromoteTemps pass.

Fix rust-lang#73825

### Current status

- [ ] Waiting for [consensus](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Could.20const.20read-only.20arrays.20be.20const.20promoted.3F).

r? ghost

bors · 2024-06-13T22:53:30Z

⌛ Trying commit 550fb81 with merge e26c0b3...

bors · 2024-06-14T00:31:34Z

☀️ Try build successful - checks-actions
Build commit: e26c0b3 (e26c0b3f8c9af007281a11df56a0bf825d8b4cb0)

rust-timer · 2024-06-14T01:45:17Z

Finished benchmarking commit (e26c0b3): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 3.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	9.2%	[4.2%, 18.7%]	4
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-9.1%	[-9.9%, -8.3%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	3.1%	[-9.9%, 18.7%]	6
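The "All" row can be cross-checked against the two rows above it: it appears to be the count-weighted mean of the regression and improvement groups. A minimal sketch of that arithmetic follows. The helper name is hypothetical, and because the reported means are rounded the result only matches to about one decimal place.

```rust
/// Combines per-group means into one overall mean, weighting each
/// group by its benchmark count. Returns NaN if all counts are zero.
pub fn combined_mean(groups: &[(f64, u32)]) -> f64 {
    let total: u32 = groups.iter().map(|&(_, n)| n).sum();
    let sum: f64 = groups.iter().map(|&(mean, n)| mean * f64::from(n)).sum();
    sum / f64::from(total)
}
```

With the Max RSS figures above, `combined_mean(&[(9.2, 4), (-9.1, 2)])` gives (36.8 - 18.2) / 6, which is approximately 3.1 and agrees with the reported value.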

Cycles

Results (primary -4.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-4.0%	[-5.3%, -1.7%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-4.0%	[-5.3%, -1.7%]	5

Binary size

Results (primary -0.1%, secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.0%]	1
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	1
Improvements ✅ (primary)	-0.1%	[-0.2%, -0.0%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.1%	[-0.2%, 0.0%]	6

Bootstrap: 671.518s -> 674.73s (0.48%)
Artifact size: 320.38 MiB -> 320.39 MiB (0.00%)

Kobzol · 2024-06-14T13:51:30Z

@bors try @rust-timer queue

bors · 2024-06-14T13:52:42Z

⌛ Trying commit 6c6de58 with merge 7e160d4...

bors · 2024-06-14T15:25:23Z

☀️ Try build successful - checks-actions
Build commit: 7e160d4 (7e160d4b55bb5a27be0696f45db247ccc2e166d9)

rust-timer · 2024-06-14T16:42:14Z

Finished benchmarking commit (7e160d4): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 1.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	11.0%	[3.9%, 21.3%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-7.6%	[-10.4%, -4.8%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.7%	[-10.4%, 21.3%]	6

Cycles

Results (secondary 9.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	9.0%	[9.0%, 9.0%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

Results (primary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.0%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.1%	[-0.2%, -0.0%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.1%	[-0.2%, 0.0%]	6

Bootstrap: 670.938s -> 673.147s (0.33%)
Artifact size: 320.39 MiB -> 319.79 MiB (-0.19%)

BoxyUwU · 2024-07-14T06:10:29Z

@bors try @rust-timer queue

bors · 2024-07-14T06:11:40Z

⌛ Trying commit 8672700 with merge ac2e9cd...

gvn: Promote/propagate const local array

Rewriting of rust-lang#125916 which used `PromoteTemps` pass.

This allows promoting constant local arrays as anonymous constants. So that's in codegen for a local array, rustc outputs `llvm.memcpy` (which is easy for LLVM to optimize) instead of a series of `store` on stack (a.k.a in-place initialization). This makes rustc on par with clang on this specific case. See more in rust-lang#73825 or [zulip][opsem] for more info.

[Here is a simple micro benchmark][bench] that shows the performance differences between promoting arrays or not.

[Prior discussions on zulip][opsem].

This patch [saves about 600 KB][perf] (~0.5%) of `librustc_driver.so`.

![image](https://github.com/rust-lang/rust/assets/15225902/0e37559c-f5d9-4cdf-b7e3-a2956fd17bc1)

Fix rust-lang#73825

r? cjgillot

### Unresolved questions

- [ ] Should we ignore nested arrays? I think that promoting nested arrays is bloating codegen.
- [ ] Should stack_threshold be at least 32 bytes? Like the benchmark showed. If yes, the test should be updated to make arrays larger than 32 bytes.
- [x] ~Is this concerning that `call(move _1)` is now `call(const [array])`?~ It reverted back to `call(move _1)`

[opsem]: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Could.20const.20read-only.20arrays.20be.20const.20promoted.3F
[bench]: rust-lang/rust-clippy#12854 (comment)
[perf]: https://perf.rust-lang.org/compare.html?start=f9515fdd5aa132e27d9b580a35b27f4b453251c1&end=7e160d4b55bb5a27be0696f45db247ccc2e166d9&stat=size%3Alinked_artifact&tab=artifact-size
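To make the open question about the size cutoff concrete, here is a minimal sketch of a byte-size check. It is not the code from this PR: the constant and function names are hypothetical, the value 32 is only the figure floated in the question, and the strict "larger than" comparison is an assumption taken from the wording "arrays larger than 32 bytes".

```rust
use std::mem::size_of;

/// Hypothetical cutoff in bytes (assumption: the value under discussion).
pub const STACK_THRESHOLD: usize = 32;

/// Returns true when an array of `N` elements of type `T` occupies
/// strictly more bytes than the cutoff.
pub fn exceeds_threshold<T, const N: usize>() -> bool {
    size_of::<[T; N]>() > STACK_THRESHOLD
}
```

Under this sketch `[u32; 8]` (exactly 32 bytes) would stay below the cutoff, while `[u32; 9]` (36 bytes) would exceed it.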

bors · 2024-07-14T08:03:24Z

☀️ Try build successful - checks-actions
Build commit: ac2e9cd (ac2e9cd42525cb1be45517156e5c5dbd10dc5a0e)

rust-timer · 2024-07-14T09:25:47Z

Finished benchmarking commit (ac2e9cd): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.4%	[-0.4%, -0.4%]	1
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.2%]	1
All ❌✅ (primary)	-0.4%	[-0.4%, -0.4%]	1

Max RSS (memory usage)

Results (primary -1.2%, secondary -0.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.8%	[2.8%, 2.8%]	1
Regressions ❌ (secondary)	5.0%	[4.4%, 5.6%]	2
Improvements ✅ (primary)	-5.2%	[-5.2%, -5.2%]	1
Improvements ✅ (secondary)	-4.5%	[-5.3%, -3.2%]	3
All ❌✅ (primary)	-1.2%	[-5.2%, 2.8%]	2

Cycles

Results (secondary -2.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.3%	[-2.3%, -2.3%]	1
All ❌✅ (primary)	-	-	0

Binary size

Results (primary -0.1%, secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.0%	[0.0%, 0.1%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.2%	[-0.4%, -0.0%]	3
Improvements ✅ (secondary)	-0.0%	[-0.0%, -0.0%]	12
All ❌✅ (primary)	-0.1%	[-0.4%, 0.1%]	5

Bootstrap: 705.571s -> 705.908s (0.05%)
Artifact size: 328.69 MiB -> 328.62 MiB (-0.02%)

Kobzol · 2024-07-14T15:42:40Z

@bors try @rust-timer queue

bors · 2024-07-14T15:43:51Z

⌛ Trying commit c15eb60 with merge b6d6d25...

bors · 2024-07-14T17:33:42Z

☀️ Try build successful - checks-actions
Build commit: b6d6d25 (b6d6d25a7e03cda5b6e133fd6541106859d1489d)

rust-timer · 2024-07-14T19:33:58Z

Finished benchmarking commit (b6d6d25): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.5%	[-0.5%, -0.5%]	1
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary -1.5%, secondary 2.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.7%	[3.8%, 5.9%]	3
Improvements ✅ (primary)	-1.5%	[-1.7%, -1.3%]	2
Improvements ✅ (secondary)	-5.6%	[-5.6%, -5.6%]	1
All ❌✅ (primary)	-1.5%	[-1.7%, -1.3%]	2

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.0%	[-0.0%, -0.0%]	5
All ❌✅ (primary)	-	-	0

Bootstrap: 699.561s -> 699.867s (0.04%)
Artifact size: 328.65 MiB -> 328.59 MiB (-0.02%)

tesuji · 2024-07-15T06:03:01Z

Judging from the last perf run, there seems to be no performance advantage in preventing LLVM from de-duplicating arrays.
I reverted that change and squashed all commits for the final review.

cjgillot · 2024-08-03T16:42:14Z

compiler/rustc_mir_transform/src/gvn.rs

@@ -418,9 +421,7 @@ impl<'body, 'tcx> VnState<'body, 'tcx> {
                        self.ecx.copy_op(op, &field_dest).ok()?;
                    }
                    self.ecx.write_discriminant(variant.unwrap_or(FIRST_VARIANT), &dest).ok()?;
-                    self.ecx
-                        .alloc_mark_immutable(dest.ptr().provenance.unwrap().alloc_id())
-                        .ok()?;


Why do we stop marking as immutable?

tesuji · 2024-08-03

Correct me if I'm wrong, but I think `let dest = dest.map_provenance(|prov| prov.as_immutable());` on the line below could serve the same purpose.

tesuji · 2024-08-22T14:50:03Z

Well! Nothing to do here!

rustbot added the S-waiting-on-review and T-compiler labels Jun 13, 2024

scottmcm reviewed Jun 13, 2024

compiler/rustc_const_eval/src/interpret/place.rs (outdated)

scottmcm reviewed Jun 13, 2024

compiler/rustc_mir_transform/src/gvn.rs (outdated)

tesuji changed the title ~~WIP: Trying to promote const local array using GVN pass~~ WIP: Promote/propagate const local array using GVN pass Jun 13, 2024

tesuji force-pushed the gvn-const-arrays branch from e8f832f to 5944765 Compare June 13, 2024 22:32

tesuji changed the title ~~WIP: Promote/propagate const local array using GVN pass~~ [WIP] gvn: Promote/propagate const local array Jun 13, 2024

tesuji force-pushed the gvn-const-arrays branch from 5944765 to 550fb81 Compare June 13, 2024 22:43

rustbot added the S-waiting-on-perf label Jun 13, 2024

rustbot removed the S-waiting-on-perf label Jun 14, 2024

tesuji mentioned this pull request Jun 14, 2024

[WIP] mir-opt: promoting const read-only arrays #125916

Closed

4 tasks

rustbot added the S-waiting-on-perf label Jun 14, 2024

rustbot added the S-waiting-on-perf label Jul 14, 2024

rustbot removed the S-waiting-on-perf label Jul 14, 2024

rustbot added the S-waiting-on-perf label Jul 14, 2024

rustbot removed the S-waiting-on-perf label Jul 14, 2024

tesuji added 3 commits July 15, 2024 05:56

add regression tests for gvn of local arrays

023838f

add codegen test for issue 73825

5203642

gvn: promote propagatable const local arrays

7b96b9c

tesuji force-pushed the gvn-const-arrays branch from c15eb60 to 7b96b9c Compare July 15, 2024 05:56

cjgillot reviewed Aug 3, 2024

Dylan-DPC added the S-waiting-on-author label and removed the S-waiting-on-review label Aug 20, 2024

tesuji closed this Aug 22, 2024

tesuji deleted the gvn-const-arrays branch August 22, 2024 14:50
