-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert redundant_clone
to an analysis pass
#11364
Conversation
r? @Alexendoo (rustbot has picked a reviewer for you, use r? to override) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few comments and questions, to gain some perspective on the use case of rust-lang/rust#114900.
My impression is that you are trying to bend the dataflow API to do too much at once.
This analysis could be much simpler, and not require any change to the dataflow API.
clippy_lints/src/redundant_clone.rs
Outdated
let mut ref_target_results = RefTargetAnalysis::new(cx.tcx, cx.param_env, body, &raw_ptr_taken) | ||
.into_engine(cx.tcx, body) | ||
.iterate_to_fixpoint() | ||
.into_results_cursor(body); | ||
let clone_sources = clone_calls | ||
.iter() | ||
.filter_map(|block| { | ||
let data = &body.basic_blocks[block]; | ||
ref_target_results.seek_before_primary_effect(Location { | ||
block, | ||
statement_index: data.statements.len(), | ||
}); | ||
if let TerminatorKind::Call { args, .. } = &data.terminator().kind | ||
&& let [Operand::Copy(arg) | Operand::Move(arg), ..] = &**args | ||
&& let Some(Some(targets)) = ref_target_results.get().targets.get(&arg.local) | ||
{ | ||
Some((block, (arg.local, targets.clone()))) | ||
} else { | ||
None | ||
} | ||
}) | ||
.collect::<FxHashMap<_, _>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend using the SSA analysis to do this. The self-argument to the clone call will typically be a temporary created by autoref desugaring, so will be SSA. It will be much faster than a dataflow analysis for typical MIR bodies.
See rustc_mir_transform::ssa::SsaLocals
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically yes. If there ends up being perf problem this would be worth not linting the few extra cases this could catch. Limiting the analysis to just types which are cloned could also get a decent speed up
clippy_lints/src/redundant_clone.rs
Outdated
struct CloneDomain { | ||
links: FxHashSet<(Local, Local, BasicBlock)>, | ||
values: FxHashMap<Local, Value>, | ||
candidates: Candidates, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field should not be part of the analysis domain, but computed in post-processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would work.
clippy_lints/src/redundant_clone.rs
Outdated
#[derive(Debug, Clone)] | ||
struct Value { | ||
last_modified: Option<BasicBlock>, | ||
lint_unread: HybridBitSet<BasicBlock>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field looks like it attempts to model a liveness analysis. Can it be replaced by MayleLiveLocals
that we already have in store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tracking if a read has occurred since the paired value was first modified.
clippy_lints/src/redundant_clone.rs
Outdated
links: FxHashSet<(Local, Local, BasicBlock)>, | ||
values: FxHashMap<Local, Value>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, you are trying to compute which locals store the same value.
Can this be achieved by using an IndexVec<Local, FlatSet<Location>>
?
FlatSet::Bottom
means uninitialized / moved-out.
FlatSet::Elem(location)
means "assigned some unique value created at this location".
FlatSet::Top
means "could store anything".
And in case of an assignment, propagate the FlatSet
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just which locals are duplicates of each other. This also needs to track how the duplicate values are used in relation to each other. We also need to know where the duplicate was made in order to lint that span.
Thanks for taking a look. Assuming pulling It would still be nice to be able to change the links stored in the domain to a bitset and have the analysis convert between the Just as further elaboration the basic idea for the lint it starts from the clone call and tracks both values. When one of the two values is modified, it then switches to checking if the non-modified version is read. If it is, then the clone is considered necessary. e.g. // `x` is assigned a duplicate value of `y`
// a new value in `links` keeps track of the relation and the clone location
let x = y.clone();
// all values linked to `y` have `lint_unread` modified here
y.modify();
// since `y` doesn't have the original value, the clone is necessary
// any clone call in `x`'s `lint_unread` field is a necessary clone
x.read(); If the order of the read and write are swapped the clone call isn't necessary since the read could have occurred on the original value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cjgillot I've made some changes to the analysis. Though not strictly required, the changes to the dataflow api would still be useful.
#[derive(Default, Debug, Clone, PartialEq, Eq)] | ||
struct CloneDomain { | ||
// Values which are clones of each other and may still contain the same value. | ||
links: FxHashSet<(Local, Local, BasicBlock)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The links would be better as a BitSet
(one without a fixed domain size) by storing an IndexSet
in the analysis and using each value's index as the BitSet index. This causes a problem with the join as it needs access to the value.
for block in (0..body.basic_blocks.len()).map(BasicBlock::from_usize) { | ||
let state = results.entry_set_for_block(block); | ||
let loc = block.start_location(); | ||
for (&local, _) in state.values.iter().filter(|(_, v)| v.last_modified.is_none()) { | ||
for &(l1, l2, block) in &state.links { | ||
let local = if l1 == local { | ||
l2 | ||
} else if l2 == local { | ||
l1 | ||
} else { | ||
continue; | ||
}; | ||
|
||
if terminator.source_info.span.from_expansion() { | ||
continue; | ||
if !borrowers.bounded_borrowers(&[], &[], local, loc) { | ||
required_clones.insert(block); | ||
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The required_clones.insert(..)
part should occur inside the join, but the PossibleBorrowerMap
can't be accessed from there. This would allow the whole loop to be removed.
clone_sources: &'a FxHashMap<BasicBlock, (Local, HybridBitSet<Local>)>, | ||
link_buf: Vec<(Local, Local, BasicBlock)>, | ||
borrowers: PossibleBorrowerMap<'mir, 'tcx>, | ||
required_clones: FxHashSet<BasicBlock>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be calculated in a ResultsVisitor
thereby resolving the earlier issue with PossibleBorrowerMap
, but that requires the visitor to detect reads and moves the same way the analysis does.
Given that handling each read is a small amount of work, and using a visitor requires a whole extra pass over every block it's probably better to do it this way. It's also just less work to implement/maintain.
struct Value { | ||
// Two values which were last modified at different locations may contain different values. | ||
// Joining two such values needs to be treated as a modification. | ||
last_modified: Option<Location>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really should be something like:
enum Index {
Statement(usize),
Merge(Local),
}
struct Location {
block: BasicBlock,
index: Index,
}
This would allow distinguishing different merged values from each other, but would require join
to know which block it's working on.
To illustrate the problem this causes:
// `x` is merged value (i.e. location = None)
let mut x = if _ { foo() } else { bar() };
// `y` is merged value (i.e. location = None)
let y = if _ { foo2() } else { bar2() };
let z = x.clone();
// assuming the match doesn't modify `x`, `y`, and `z`, then the join when merging
// the arms shouldn't detect a modification
match _ {
_ => { ... },
_ => { ... },
_ => { ... },
};
// here the clone isn't needed.
do_something(&z);
// but this needs to be detected as a modification of `x` when joining the states
if _ {
// note `x` is not considered modified here as the value is overwritten.
// it's only considered modified when joining the state after the `if`.
x = y;
}
// here the clone is needed.
do_something(&z);
We're currently in a state where merging None
with None
needs to be considered both a modification and not a modification. The only way out of this is to know both where the join occurred, and which value was joined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would represent a value with a simple index. A first pass collects value roots. Those would be the value of a variable where it is borrowed before cloning.
The state would be something like IndexVec<Local, FlatSet<Value>>
.
let mut x = if _ { foo() } else { bar() };
let y = if _ { foo2() } else { bar2() };
// `x` is borrowed here. `x` is `FlatSet::Elem(Value(0))` here.
let z = x.clone();
// `z` is `FlatSet::Elem(Value(0))`.
// assuming the match doesn't modify `x`, `y`, and `z`, then the join when merging
// the arms shouldn't detect a modification
match _ {
_ => { ... },
_ => { ... },
_ => { ... },
};
// `x` and `z` are still both `FlatSet::Elem(Value(0))`
// here the clone isn't needed: `x` and `y` are both `FlatSet::Elem(Value(0))`.
do_something(&z);
if _ {
x = y;
// `x` would be `FlatSet::Top` if `y` is not tracked.
}
// By joining, `x` is `FlatSet::Top`.
// here the clone is needed: `x` is `FlatSet::Top`, while `z` is `FlatSet::Elem(Value(0))`
do_something(&z);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That wouldn't handle cloning a value twice, nor would it handle a chain of clones. e.g.
let x = foo();
let y = if bar {
x.clone()
} else {
x.clone()
};
let z = y.clone();
// `x`, `y` and `z` should all be seen as the same value here.
Edit: The clone chain isn't a problem. If the source is already assigned a value, then that one can be used instead of giving a new one. This doesn't solve the problem with clones on different branches though.
None | ||
#[derive(Default, Debug, Clone)] | ||
struct RefTargetDomain { | ||
targets: FxHashMap<Local, Option<HybridBitSet<Local>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that this represents, for each local, the locals it may point to. Is that so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct.
}; | ||
state.targets.insert(dst, targets); | ||
} else if let Rvalue::Use(Operand::Move(src)) | ||
| Rvalue::Ref(_, BorrowKind::Mut { .. }, src) = a.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why you treat &mut
as a move.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an error. It should be setting the value to None
, not removing it.
Some(targets) | ||
} else { | ||
None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about reborrows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not currently handled.
None | ||
} | ||
}) | ||
.collect::<FxHashMap<_, _>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this represents, for each block, "this local is a clone one of those locals".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mapping is block -> possible clone sources
. The argument local doesn't strictly need to be there.
targets: FxHashMap<Local, Option<HybridBitSet<Local>>>, | ||
} | ||
impl JoinSemiLattice for RefTargetDomain { | ||
fn join(&mut self, other: &Self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what is the structure of this lattice. Each Local
is independent, so we want a lattice structure for Option<Option<HybridBitSet>>
.
Is it None <= Some(Some(lattice)) <= Some(None)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct. Basically uninit < local < maybe non-local
struct Value { | ||
// Two values which were last modified at different locations may contain different values. | ||
// Joining two such values needs to be treated as a modification. | ||
last_modified: Option<Location>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would represent a value with a simple index. A first pass collects value roots. Those would be the value of a variable where it is borrowed before cloning.
The state would be something like IndexVec<Local, FlatSet<Value>>
.
let mut x = if _ { foo() } else { bar() };
let y = if _ { foo2() } else { bar2() };
// `x` is borrowed here. `x` is `FlatSet::Elem(Value(0))` here.
let z = x.clone();
// `z` is `FlatSet::Elem(Value(0))`.
// assuming the match doesn't modify `x`, `y`, and `z`, then the join when merging
// the arms shouldn't detect a modification
match _ {
_ => { ... },
_ => { ... },
_ => { ... },
};
// `x` and `z` are still both `FlatSet::Elem(Value(0))`
// here the clone isn't needed: `x` and `y` are both `FlatSet::Elem(Value(0))`.
do_something(&z);
if _ {
x = y;
// `x` would be `FlatSet::Top` if `y` is not tracked.
}
// By joining, `x` is `FlatSet::Top`.
// here the clone is needed: `x` is `FlatSet::Top`, while `z` is `FlatSet::Elem(Value(0))`
do_something(&z);
☔ The latest upstream changes (presumably #11694) made this pull request unmergeable. Please resolve the merge conflicts. |
Hey this is triage, I'm closing this due to inactivity. If you want to continue this implementation, you're welcome to create a new PR. Thank you for the time, you already put into this! Interested parties are welcome to pick this implementation up as well :) @rustbot label +S-inactive-closed -S-waiting-on-author -S-waiting-on-review |
fixes #10940
fixes #10893
fixes #10870
fixes #10577
fixes #10545
fixes #10517
This is mostly finished, but blocked until rust-lang/rust#114900 is merged (or something similar). This isn't worth reviewing until then.
changelog: