coverage: Memoize and simplify counter expressions #125106

Zalathar · 2024-05-14T04:15:45Z

When creating coverage counter expressions as part of coverage instrumentation, we often end up creating obviously-redundant expressions like c1 + (c0 - c1), which is equivalent to just c0.

To avoid doing so, this PR checks when we would create an expression matching one of 5 patterns, and uses the simplified form instead:

(a - b) + b → a.
(a + b) - b → a.
(a + b) - a → b.
a + (b - a) → b.
a - (a - b) → b.

Of all the different ways to combine 3 operands and 2 operators, these are the patterns that allow simplification.

(Some of those patterns currently don't occur in practice, but are included anyway for completeness, to avoid having to add them later as branch coverage and MC/DC coverage support expands.)
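
As a rough illustration of the five rewrites listed above, here is a minimal Rust sketch. The `Term` and `Op` types and the `make_expression` function are hypothetical names invented for this example. They are not the compiler's actual data structures; the real change operates on `BcbCounter` values and may be organized quite differently.

```rust
/// A coverage term: a physical counter or a derived expression.
/// (Hypothetical stand-in for illustration only.)
#[derive(Clone, Debug, PartialEq, Eq)]
enum Term {
    Counter(u32),
    Expr(Box<Term>, Op, Box<Term>),
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op {
    Add,
    Sub,
}

/// Builds `lhs op rhs`, returning a simpler term when one of the
/// five patterns from the PR description applies.
fn make_expression(lhs: Term, op: Op, rhs: Term) -> Term {
    use Op::{Add, Sub};

    // Patterns where the nested expression is the left operand: (x ? y) op rhs.
    if let Term::Expr(x, inner_op, y) = &lhs {
        match (*inner_op, op) {
            // (a - b) + b → a
            (Sub, Add) if **y == rhs => return (**x).clone(),
            // (a + b) - b → a
            (Add, Sub) if **y == rhs => return (**x).clone(),
            // (a + b) - a → b
            (Add, Sub) if **x == rhs => return (**y).clone(),
            _ => {}
        }
    }

    // Patterns where the nested expression is the right operand: lhs op (x ? y).
    if let Term::Expr(x, inner_op, y) = &rhs {
        match (op, *inner_op) {
            // a + (b - a) → b
            (Add, Sub) if **y == lhs => return (**x).clone(),
            // a - (a - b) → b
            (Sub, Sub) if **x == lhs => return (**y).clone(),
            _ => {}
        }
    }

    // No pattern matched: build the expression as requested.
    Term::Expr(Box::new(lhs), op, Box::new(rhs))
}
```

With this sketch, building `c1 + (c0 - c1)` from `Term::Counter(0)` and `Term::Counter(1)` yields `Term::Counter(0)` directly, without creating a second expression node.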

This PR also adds memoization for newly-created (or newly-simplified) counter expressions, to avoid creating duplicates.

This currently makes no difference to the final mappings, but is expected to be useful for MC/DC coverage of match expressions, as proposed by #124278 (comment).
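
The memoization idea can be sketched as a table keyed by the expression's operands and operator. Everything below (`Operand`, `Expression`, `ExpressionTable`, and the use of a `HashMap`) is an assumption made for illustration; the field names merely mirror how expressions are printed in MIR dumps, and the PR's actual storage may differ.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Operand {
    Counter(u32),
    Expression(u32),
}

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Op {
    Add,
    Subtract,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Expression {
    lhs: Operand,
    op: Op,
    rhs: Operand,
}

/// Hypothetical expression store that never creates the same expression twice.
#[derive(Default)]
struct ExpressionTable {
    expressions: Vec<Expression>,
    memo: HashMap<Expression, u32>,
}

impl ExpressionTable {
    /// Returns the existing expression ID if an identical expression was
    /// already created; otherwise allocates a new one and remembers it.
    fn make_expression(&mut self, lhs: Operand, op: Op, rhs: Operand) -> Operand {
        let key = Expression { lhs, op, rhs };
        if let Some(&id) = self.memo.get(&key) {
            return Operand::Expression(id);
        }
        let id = self.expressions.len() as u32;
        self.expressions.push(key);
        self.memo.insert(key, id);
        Operand::Expression(id)
    }
}
```

Requesting `Counter(0) - Counter(1)` twice from the same table returns the same `Operand::Expression` ID both times, and the table still holds a single entry.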

rustbot · 2024-05-14T04:15:52Z

r? @davidtwco

rustbot has assigned @davidtwco.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2024-05-14T04:15:54Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Zalathar · 2024-05-14T04:19:58Z

@ZhuUx This is my proposal for how to add counter expression simplification.

It is inspired by #124154 and by your proposed patch from #124278 (comment), but is implemented a bit differently.

I decided to include all possible 3-operand simplifications, even though some of them currently don't occur in practice, to avoid having to keep adding more as branch coverage and MC/DC start introducing more complex expressions. We can always get rid of the useless ones later, after that work is more mature.

Zalathar · 2024-05-14T04:24:55Z

tests/coverage/lazy_boolean.cov-map

 Number of files: 1
 - file 0 => global file 1
-Number of expressions: 164
+Number of expressions: 7


Most of these improvements are pretty modest, but this particular test sees 164 counter expressions reduced to just 7!

ZhuUx · 2024-05-14T06:16:42Z

Cool, this is more powerful than I expected. Since expressions generally do not cause runtime overhead, we should not spend much effort on simplifying them. This patch is probably the best balance between cost and effectiveness.

I have tried other, more aggressive simplification methods, such as a trie and an NP algorithm. They do not give drastically better results than this PR, but they cost much more. Because most counter expressions involve blocks that are relatively close together in the control-flow graph, I believe 3-operand simplification is enough in most cases.

Swatinem · 2024-05-14T07:21:19Z

tests/coverage/drop_trait.cov-map

- Code(Expression(0, Add)) at (prev + 5, 1) to (start + 0, 2)
-    = (c1 + Zero)


Any explanation of why this (essentially Counter(1)) is turned into Counter(0) instead?
This same pattern turns up in a bunch of files.

ZhuUx · 2024-05-14

In general, an expression can be lowered to Zero because of constants in the code. For example, part of the MIR for this test function is:

    coverage ExpressionId(0) => Expression { lhs: Counter(0), op: Subtract, rhs: Counter(1) };
    coverage ExpressionId(1) => Expression { lhs: Counter(1), op: Add, rhs: Expression(0) };

    bb0: {
        Coverage::CounterIncrement(0);
        // ...
        _3 = const true;
        switchInt(move _3) -> [0: bb4, otherwise: bb1];
    }

    bb1: {
        Coverage::CounterIncrement(1);
        // ...
    }

    bb4: {
        Coverage::ExpressionUsed(0);
        // ...
    }

Since _3 is always true, Counter 1 == Counter 0, and Expression 0 can be lowered to Zero.
Expression 1 is defined as Counter 1 + Expression 0, which is why the previous output showed (c1 + Zero).

This patch simplifies expressions before any expression is lowered to Zero. So we first get Expression 1 = Counter 1 + Expression 0 = Counter 1 + (Counter 0 - Counter 1) = Counter 0. The result no longer mentions c1 or Zero, but it has the same value.

Zalathar · 2024-05-14

Ah, I see. So in other words:

Without expression simplification, this would have been c1 + (c0 - c1).

Then the block containing (c0 - c1) is removed by MIR opts, so coverage codegen replaces it with Zero, and we get c1 + Zero.

With simplification, we just immediately get c0, since c1 cancels itself out.
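
To see why the old mapping `c1 + Zero` and the simplified mapping `c0` report the same count in this case, here is a small sketch that simulates the two counter increments from the MIR excerpt in this thread. The function names are hypothetical and the simulation is a toy model, not how coverage counters are actually collected at runtime.

```rust
/// Simulates `n` executions of the function from the MIR excerpt:
/// bb0 always runs, and because `_3` is `const true`, bb1 always follows.
fn simulate(n: u64) -> (u64, u64) {
    let (mut c0, mut c1) = (0u64, 0u64);
    for _ in 0..n {
        c0 += 1; // Coverage::CounterIncrement(0) in bb0
        let cond = true; // `_3 = const true`
        if cond {
            c1 += 1; // Coverage::CounterIncrement(1) in bb1
        }
    }
    (c0, c1)
}

/// Mapping without simplification: `c1 + (c0 - c1)`, where the inner
/// expression was replaced by Zero after its block was removed.
fn old_mapping(c1: u64) -> u64 {
    c1 + 0
}

/// Mapping with simplification: `c1 + (c0 - c1)` is rewritten to `c0` up front.
fn new_mapping(c0: u64) -> u64 {
    c0
}
```

For any number of runs, `simulate` returns equal values for `c0` and `c1`, so `old_mapping(c1)` and `new_mapping(c0)` agree. The two mappings differ only in which counter they name.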

RenjiSann · 2024-05-14T15:00:28Z

I am not 100% sure, but I remember seeing a simplification step performed on the LLVM side.
If that simplification does the same thing as ours, do we still need to do it on the Rust side, or can we let LLVM handle it?

See this function that is used during codegen: https://github.com/llvm/llvm-project/blob/d9db2664994ff672f50d7fd0117477935dac04f1/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp#L77

Zalathar · 2024-05-15T01:21:20Z

> I am not 100% sure, but I remember seeing a simplification step performed on the LLVM side. If that simplification does the same thing as ours, do we still need to do it on the Rust side, or can we let LLVM handle it?
>
> See this function that is used during codegen: https://github.com/llvm/llvm-project/blob/d9db2664994ff672f50d7fd0117477935dac04f1/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp#L77

Ah, I remember having seen this code in the past, but I had forgotten about it until now.

I believe the reason we don't benefit from this simplification step is that (unlike clang) we never actually use CounterExpressionBuilder; instead we build a list of expressions and pass it directly to CoverageMappingWriter.

(So LLVM still removes unused expressions for us, but it won't simplify the ones that are used.)

Changing our FFI code to use CounterExpressionBuilder seems like more trouble than it's worth. If we wanted to perform the same simplification, it would probably be easier to reimplement it on the Rust side.

But doing so would require deeper changes to how we store and manipulate expressions in the instrumentor (or more complexity in codegen). The advantage of this PR is that it's a very “drop-in” solution, easy to add now, and (hopefully) easy to remove later if we switch over to a more thorough approach to simplifying expressions.

davidtwco · 2024-05-20T14:30:51Z

@bors r+

bors · 2024-05-20T14:30:54Z

📌 Commit d01df6f has been approved by davidtwco

It is now in the queue for this repository.

…iaskrgr

Rollup of 7 pull requests

Successful merges:

- rust-lang#124682 (Suggest setting lifetime in borrowck error involving types with elided lifetimes)
- rust-lang#124917 (Check whether the next_node is else-less if in get_return_block)
- rust-lang#125106 (coverage: Memoize and simplify counter expressions)
- rust-lang#125173 (Remove `Rvalue::CheckedBinaryOp`)
- rust-lang#125305 (add some codegen tests for issue 120493)
- rust-lang#125314 (Add an experimental feature gate for global registration)
- rust-lang#125318 (Migrate `run-make/rustdoc-scrape-examples-whitespace` to `rmake.rs`)

r? `@ghost`
`@rustbot` modify labels: rollup

Rollup merge of rust-lang#125106 - Zalathar:expressions, r=davidtwco

coverage: Memoize and simplify counter expressions

Zalathar added 3 commits May 14, 2024 13:57

coverage: Store expression operands as BcbCounter

1a3a54c

coverage: Memoize newly-created counter expressions

a68bb5e

This currently has no effect, but is expected to be useful when expanding support for branch coverage and MC/DC coverage.

coverage: Simplify counter expressions using simple algebra

d01df6f

Some of these cases currently don't occur in practice, but are included for completeness, and to avoid having to add them later as branch coverage and MC/DC coverage start building more complex expressions.

rustbot assigned davidtwco May 14, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 14, 2024

rustbot added the A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) label May 14, 2024

Zalathar mentioned this pull request May 14, 2024

Support mcdc analysis for pattern matching #124278

Open

Swatinem reviewed May 14, 2024

davidtwco approved these changes May 20, 2024

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 20, 2024

matthiaskrgr mentioned this pull request May 20, 2024

Rollup of 7 pull requests #125331

Merged

bors merged commit e0d9228 into rust-lang:master May 20, 2024
6 checks passed

rustbot added this to the 1.80.0 milestone May 20, 2024

Zalathar deleted the expressions branch May 20, 2024 23:31
