-
-
Notifications
You must be signed in to change notification settings - Fork 291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clone states only when necessary #4279
Conversation
@@ -27,6 +27,7 @@ export function stateTransition( | |||
const block = signedBlock.message; | |||
const blockSlot = block.slot; | |||
|
|||
// .clone() before mutating state in state transition | |||
let postState = state.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and line 85 are the only two instances where state is cloned
Performance Report✔️ no performance regression detected Full benchmark results
|
export function initiateValidatorExit(state: CachedBeaconStateAllForks, validator: phase0.Validator): void { | ||
export function initiateValidatorExit( | ||
state: CachedBeaconStateAllForks, | ||
validator: CompositeViewDU<typeof ssz.phase0.Validator> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using CompositeViewDU
to express the need for mutability here
postState = processSlots(postState, nextEpochSlot, metrics); | ||
|
||
// Cache state to preserve epoch transition work | ||
const checkpointState = postState.clone(); | ||
const checkpointState = postState; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dapplion I think it's safer to do postState.clone(false)
?
from your implementation
clone(dontTransferCache?: boolean): this {
if (dontTransferCache) {
return this.type.getViewDU(this.node) as this;
} else {
const cache = this.cache;
this.clearCache();
return this.type.getViewDU(this.node, cache) as this;
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I don't follow, what would it be safer to do a clone here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tuyennhv The point of this PR is to think critically about when our state can actually be mutated. My opinion is that we clone is too many places that a state can't be mutated so cloning is an unnecessary expense that slows than Lodestar for no gain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dapplion in the past we used to have an issue of missing a clone()
call, it really took time to debug in that case, @wemeetagain may experience it
so what I mean is instead of dropping the clone()
call, can we do clone(false)
with less performance impact?
if both clone()
and clone(false)
cause performance issue then we need to benchmark to improve performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My confusion: I thought the cache is merely used for the mutation, it also contains the index of nodes so that we only have to traverse to the node once.
if we use clone(false)
, the consumer does not have a chance to use the cached nodes
of the tree so it's all bad going with either clone() or clone(false)
with ssz v2, we should not ever have a state mutation unless in state transition, thanks @dapplion for your explanation 👍
|
||
// Clone first to account for metrics below | ||
const itemCloned = item.clone(); | ||
|
||
this.metrics?.stateClonedCount.observe(item.clonedCount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we don't call clone()
anymore, should metric observe clonedCount
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inspected all the state.clone()
call, I think the change is safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but might be worth deploying to a node to test first
Oh, just noticed the metrics above |
Motivation
SSZ v2 introduced a new caching strategy:
Description
To prevent losing the cache to only-read clones, clone the state only when it's mutated via a state transition run
@wemeetagain @tuyennhv This PR can have serious implications, please review in depth and think of corner cases. This should be tested on nodes too