Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

BABE's revert procedure #11022

Merged
merged 15 commits into from
Mar 24, 2022
Merged

BABE's revert procedure #11022

merged 15 commits into from
Mar 24, 2022

Conversation

davxy
Copy link
Member

@davxy davxy commented Mar 13, 2022

Partially addresses #10962

This PR addresses the rollback of BABE's data structures on "revert" command.

In particular this involves the removal of some nodes from the EpochChanges tree.

This functionality is mostly provided via the ForkTree new filter method.

Given a ForkTree instance this method allows to remove nodes (and their subtrees) from the overall tree via a predicate closure.

The predicate drives the filtering behavior, in particular allows to:

  • P1. remove a node (and its subtree);
  • P2. early stop the overall operation;
  • P3. stop the operation for the node under processing, thus skipping its subtree from the operation.

When used by the BABE's revert procedure, given the hash of THE block we what to revert to, the passed predicate:

  1. searches for the highest (wrt tree height) ForkTree's nodes such that THE block is an ancestor of the block that instanced such nodes (announced the epoch). These nodes (and their subtrees) are removed from the tree (leverages P1);
  2. detects a "backtrack", i.e. detects if some nodes were already removed and the search started looking into some higher (wrt tree height) branches. At this point the search can stop since no other nodes can satisfy 1 (leverages P2);
  3. avoid searching into the tree branches that are not containing the block we are looking for (leverages P3).

Known limitations

  • Assuming a strictly monotonic machine clock, revert across more than one epoch breaks BABE's algorithm.

    • the next imported block will have a slot greater than last epoch end slot, thus it will immediately start a new epoch;
    • this new epoch end-slot is already smaller than the imported block slot. Nevertheless the block is imported in this new epoch;
    • following imported blocks will generate the infamous "Unexpected Epoch Changes" error (How to prevent and deal with unexpected epoch changes #4464). This is due to the fact that the block will come with the announcement of another new epoch BUT the parent_slot is greater than epoch descriptor start slot (see here and here)
  • Revert is not atomic wrt blockchain state revert procedure. Currently the two procedures are independent and the blockchain state revert may fail after we successfully reverted BABE structures.


polkadot companion: paritytech/polkadot#5109
cumulus companion: paritytech/cumulus#1089

@github-actions github-actions bot added the A3-in_progress Pull request is in progress. No review needed at this stage. label Mar 13, 2022
@sergejparity
Copy link
Contributor

Please consider to pull latest master into your branch to fix GHA Check labels failure. Location of CI scripts has changed in #11008

@davxy davxy added B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Mar 14, 2022
@davxy davxy marked this pull request as ready for review March 14, 2022 15:03
@davxy davxy requested a review from andresilva as a code owner March 14, 2022 15:03
@davxy davxy requested a review from bkchr March 14, 2022 15:03
@github-actions github-actions bot added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Mar 14, 2022
@davxy davxy requested a review from a team March 14, 2022 15:03
@@ -43,16 +43,27 @@ pub struct RevertCmd {
pub pruning_params: PruningParams,
}

/// Revert handler for auxiliary data (e.g. consensus).
type AuxRevertHandler<B> = Box<dyn FnOnce(NumberFor<B>) -> error::Result<()>>;
Copy link
Member Author

@davxy davxy Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aux data revert can be made atomic by returning a Vec<(Vec<u8>, Option<Vec<u8>>)>.

This vector has the same purpose of BlockImportParams::auxiliary for block import, i.e. perform a bunch of operations on aux data atomically together with blockchain state operations.

The only drawback is that currently the blockchain state is reverted one block at a time (see here). If we want an "atomic" revert of "aux-data + blockchain-state" then we will have to call AuxRevertHandler multiple times as well (once for each reverted block)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised we do not support begin/commit/rollback transaction for batching multiple mutations across aux and main state storage already.

@skunert
Copy link
Contributor

skunert commented Mar 16, 2022

Looks Good to me!

Revert is not atomic wrt blockchain state revert procedure. Currently the two procedures are independent and the blockchain state revert may fail after we successfully reverted BABE structures.

Regarding this - If for any reason we fail after reverting the BABE structures and end up in an inconsistent node state, we should still be able to re-execute the revert right?

@davxy
Copy link
Member Author

davxy commented Mar 16, 2022

Regarding this - If for any reason we fail after reverting the BABE structures and end up in an inconsistent node state, we should still be able to re-execute the revert right?

YES, you can retry.

The BABE's "revert" procedure fails only if it cannot find the header of the block we are reverting to (and the headers are still there), so the procedure can be re-executed.

The second time the BABE's "revert" procedure is run, no node will be deleted and it will then re-try blockchain state revert.

As said, because blockchain-state and BABE revert is not atomic, the only caveat is if the first pass of BABE's "revert" deleted some epoch nodes then the chain state is inconsistent until the blockchain-state is reverted as well.

bin/node/cli/src/command.rs Outdated Show resolved Hide resolved
@@ -43,16 +43,27 @@ pub struct RevertCmd {
pub pruning_params: PruningParams,
}

/// Revert handler for auxiliary data (e.g. consensus).
type AuxRevertHandler<B> = Box<dyn FnOnce(NumberFor<B>) -> error::Result<()>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised we do not support begin/commit/rollback transaction for batching multiple mutations across aux and main state storage already.

utils/fork-tree/src/lib.rs Outdated Show resolved Hide resolved
utils/fork-tree/src/lib.rs Outdated Show resolved Hide resolved
client/consensus/epochs/src/lib.rs Outdated Show resolved Hide resolved
client/consensus/babe/src/lib.rs Show resolved Hide resolved
@andresilva
Copy link
Contributor

LGTM, just left some minor comments regarding the filter interface. Thanks @davxy!

@davxy davxy added the D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit label Mar 23, 2022
@davxy
Copy link
Member Author

davxy commented Mar 24, 2022

bot merge

@paritytech-processbot paritytech-processbot bot merged commit 56c28e8 into master Mar 24, 2022
@paritytech-processbot paritytech-processbot bot deleted the davxy-revert-command-fix branch March 24, 2022 08:51
acatangiu added a commit to acatangiu/parity-bridges-common that referenced this pull request Mar 28, 2022
cumulus: 4e95228
polkadot: 975e780ae0d988dc033f400ba822d14b326ee5b9
substrate: 89fcb3e4f62d221d4e161a437768e77d6265889e

bump serde_json to 1.0.79

sync changes from paritytech/substrate#11022

Signed-off-by: acatangiu <[email protected]>
acatangiu added a commit to acatangiu/parity-bridges-common that referenced this pull request Mar 28, 2022
cumulus: b468d0c
polkadot: 827792ca833396c82c726eda0bc2ad32ecddba73
substrate: 666f39b8a22108f57732215de006518738034ba2

bump serde_json to 1.0.79

sync changes from paritytech/substrate#11022

Signed-off-by: acatangiu <[email protected]>
acatangiu added a commit to acatangiu/parity-bridges-common that referenced this pull request Mar 28, 2022
cumulus: b468d0c
polkadot: 827792ca833396c82c726eda0bc2ad32ecddba73
substrate: 666f39b8a22108f57732215de006518738034ba2

bump serde_json to 1.0.79

sync changes from paritytech/substrate#11022

fixed clippy warnings

Signed-off-by: acatangiu <[email protected]>
grishasobol pushed a commit to gear-tech/substrate that referenced this pull request Mar 28, 2022
* First rough draft for BABE revert

* Proper babe revert test

* Cleanup

* Test trivial cleanup

* Fix to make clippy happy

* Check polkadot companion

* Check cumulus companion

* Remove babe's blocks weight on revert

* Handle "empty" blockchain edge case

* Run companions

* Simplify the filter predicate

* Saturating sub is not required

* Run pipeline

* Run pipeline again...
grishasobol pushed a commit to gear-tech/substrate that referenced this pull request Mar 28, 2022
* First rough draft for BABE revert

* Proper babe revert test

* Cleanup

* Test trivial cleanup

* Fix to make clippy happy

* Check polkadot companion

* Check cumulus companion

* Remove babe's blocks weight on revert

* Handle "empty" blockchain edge case

* Run companions

* Simplify the filter predicate

* Saturating sub is not required

* Run pipeline

* Run pipeline again...
grishasobol pushed a commit to gear-tech/substrate that referenced this pull request Mar 28, 2022
* First rough draft for BABE revert

* Proper babe revert test

* Cleanup

* Test trivial cleanup

* Fix to make clippy happy

* Check polkadot companion

* Check cumulus companion

* Remove babe's blocks weight on revert

* Handle "empty" blockchain edge case

* Run companions

* Simplify the filter predicate

* Saturating sub is not required

* Run pipeline

* Run pipeline again...
acatangiu added a commit to paritytech/parity-bridges-common that referenced this pull request Mar 28, 2022
cumulus: b468d0c
polkadot: 827792ca833396c82c726eda0bc2ad32ecddba73
substrate: 666f39b8a22108f57732215de006518738034ba2

bump serde_json to 1.0.79

sync changes from paritytech/substrate#11022

fixed clippy warnings

Signed-off-by: acatangiu <[email protected]>
DaviRain-Su pushed a commit to octopus-network/substrate that referenced this pull request Aug 23, 2022
* First rough draft for BABE revert

* Proper babe revert test

* Cleanup

* Test trivial cleanup

* Fix to make clippy happy

* Check polkadot companion

* Check cumulus companion

* Remove babe's blocks weight on revert

* Handle "empty" blockchain edge case

* Run companions

* Simplify the filter predicate

* Saturating sub is not required

* Run pipeline

* Run pipeline again...
ark0f pushed a commit to gear-tech/substrate that referenced this pull request Feb 27, 2023
* First rough draft for BABE revert

* Proper babe revert test

* Cleanup

* Test trivial cleanup

* Fix to make clippy happy

* Check polkadot companion

* Check cumulus companion

* Remove babe's blocks weight on revert

* Handle "empty" blockchain edge case

* Run companions

* Simplify the filter predicate

* Saturating sub is not required

* Run pipeline

* Run pipeline again...
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants