Introduce NW BatchV2 in Protocol Version 12 & Pipe ProtocolConfig into NW #12178
Still looking at batch_fetcher and batch_maker
LGTM overall.
// TODO: Remove once we have upgraded to protocol version 12.
if self.protocol_config.narwhal_versioned_metadata() {
    // Set received_at timestamp for remote batches.
    let mut updated_new_batches = HashMap::new();
Is it possible to make `new_batches` `mut` and update them in place? It would avoid the copy and maybe a bit of duplicated logic.
I had issues trying to do this before, which is why I had to do it this way, but I will follow up in a separate PR if I can get it to work in place.
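For reference, a minimal sketch of the in-place variant the reviewer is asking about. The `Batch`, `Metadata`, and `BatchDigest` types and the `set_received_at` helper are hypothetical stand-ins, not the types in this PR, and the sketch assumes the map is owned or mutably borrowed at the call site, which may be exactly what was hard to arrange in the real code.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the Narwhal types; only the shape matters here.
type BatchDigest = u64;

#[derive(Debug, Default)]
struct Metadata {
    created_at: u64,
    received_at: Option<u64>,
}

#[derive(Debug, Default)]
struct Batch {
    transactions: Vec<Vec<u8>>,
    metadata: Metadata,
}

/// Stamp every fetched batch with the local receive time, mutating the map
/// in place instead of copying entries into a second HashMap.
fn set_received_at(new_batches: &mut HashMap<BatchDigest, Batch>, now_ms: u64) {
    for batch in new_batches.values_mut() {
        batch.metadata.received_at = Some(now_ms);
    }
}
```

`values_mut()` hands out mutable references to the stored values, so no entries are moved or cloned; the keys (digests) are left untouched.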
LGTM. As we discussed offline, let's also confirm that the current cross-epoch network protection mechanism works, which should have let us avoid the additional epoch changes.
4c442e1 to f201df1
dfc871c to f8d79f6
f8d79f6 to 10037d3
…NW VersionedMetadata field
and add epoch to BatchV2 to be included in digest
Fix tests
Remove epoch from BatchV2
Fix rustfmt errors
49696b7 to 3829c6a
Description
This is attempt 2 of getting protocol config into Narwhal. The previous attempt (PR#11519) had to be reverted because mismatched batch versions were causing the validators to panic. The issue was that we were not handling the protocol upgrade correctly.
`ProtocolConfig` was passed in when `NarwhalManager` was created, but on epoch change `monitor_reconfiguration` does not recreate `NarwhalManager` if it still has a handle on the `ValidatorComponents`; rather, it just calls `start` from the existing `NarwhalManager`. This is not a problem for `NarwhalConfiguration` parameters, which are also only passed in on `NarwhalManager` creation, because the moment the node is restarted for a binary update the parameters take effect. However, in the case of protocol upgrades, `ProtocolConfig` is only updated on the following epoch.
For example, if the validator restarts to update its binary from version `N` to version `N+1`, `NarwhalManager` would be constructed with `ProtocolConfig` at version `N` (not `N+1`) because we need to ensure we have a majority quorum before actually going to version `N+1`, which happens at epoch change. On epoch change, because the node still has a handle on `ValidatorComponents`, it just starts Narwhal with the existing `NarwhalManager` and ends up using `ProtocolConfig` at version `N`, which is the root cause of the issues.
To resolve this, the following changes were added to the last PR to fix the issue and make it more robust:
- Pass `ProtocolConfig` to `NarwhalManager` on start and not on creation
Test Plan
Added unit tests & tested protocol upgrade in labnet from mainnet release branch to v12
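The upgrade scenario this plan exercises, and the fix from the Description, can be sketched roughly as follows. Everything here is a simplified, hypothetical model: `ManagerConfigAtCreation` and `ManagerConfigAtStart` are illustrative names, not types in the codebase, and `start` returns a version number only so the behaviour is observable.

```rust
// Hypothetical, heavily simplified stand-in for the real ProtocolConfig.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ProtocolConfig {
    version: u64,
}

/// Old shape: the config is captured once, when the manager is constructed.
struct ManagerConfigAtCreation {
    protocol_config: ProtocolConfig,
}

impl ManagerConfigAtCreation {
    fn new(protocol_config: ProtocolConfig) -> Self {
        Self { protocol_config }
    }

    /// Returns the protocol version Narwhal would run with for this epoch.
    /// A manager reused across an epoch change keeps returning the old one.
    fn start(&self) -> u64 {
        self.protocol_config.version
    }
}

/// New shape: the config for the epoch is supplied on every start.
struct ManagerConfigAtStart;

impl ManagerConfigAtStart {
    /// Returns the protocol version Narwhal would run with for this epoch.
    fn start(&self, protocol_config: ProtocolConfig) -> u64 {
        protocol_config.version
    }
}
```

With the old shape, a manager built at version `N` and reused after the epoch change still reports `N`; with the new shape, the caller passes the epoch's config into `start`, so the same manager instance picks up `N+1`.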
Type of Change (Check all that apply)
Release notes
Start using `BatchV2` in Narwhal, which introduces `VersionedMetadata` that allows for more granular tracking of NW batch execution latency.
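As an illustration of what a versioned batch with versioned metadata might look like, here is a minimal sketch. The field names, the `MetadataV1` type, and the helper functions are assumptions made for the example and are not taken from the diff; the boolean flag stands in for a protocol-config check such as `narwhal_versioned_metadata()`.

```rust
// Hypothetical shapes; the real definitions live in the PR diff.
#[derive(Debug, Clone, PartialEq)]
struct MetadataV1 {
    created_at: u64,
    received_at: Option<u64>,
}

#[derive(Debug, Clone, PartialEq)]
enum VersionedMetadata {
    V1(MetadataV1),
}

#[derive(Debug, Clone, PartialEq)]
enum Batch {
    V1 { transactions: Vec<Vec<u8>>, created_at: u64 },
    V2 { transactions: Vec<Vec<u8>>, versioned_metadata: VersionedMetadata },
}

/// Build a V2 batch only when the protocol flag is enabled.
fn new_batch(transactions: Vec<Vec<u8>>, now_ms: u64, use_versioned_metadata: bool) -> Batch {
    if use_versioned_metadata {
        let metadata = MetadataV1 { created_at: now_ms, received_at: None };
        Batch::V2 { transactions, versioned_metadata: VersionedMetadata::V1(metadata) }
    } else {
        Batch::V1 { transactions, created_at: now_ms }
    }
}

/// Record when a remote batch arrived; V1 batches have no field for it.
fn mark_received(batch: &mut Batch, now_ms: u64) {
    if let Batch::V2 { versioned_metadata: VersionedMetadata::V1(m), .. } = batch {
        m.received_at = Some(now_ms);
    }
}

/// Milliseconds elapsed since the batch was received, if that was recorded.
fn received_latency_ms(batch: &Batch, now_ms: u64) -> Option<u64> {
    match batch {
        Batch::V2 { versioned_metadata: VersionedMetadata::V1(m), .. } => {
            m.received_at.map(|t| now_ms.saturating_sub(t))
        }
        Batch::V1 { .. } => None,
    }
}
```

Wrapping the metadata in its own versioned enum means new timing fields can be added later as a new metadata variant without another batch version.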