Introduce basic slot-based collator #4097

Merged
merged 85 commits into from
Jul 5, 2024

Conversation

skunert
Contributor

@skunert skunert commented Apr 12, 2024

Part of #3168
On top of #3568

Changes Overview

  • Introduces a new collator variant in cumulus/client/consensus/aura/src/collators/slot_based/mod.rs
  • Two tasks are part of that module, one for block building and one for collation building and submission.
  • Introduces a new variant of cumulus-test-runtime which has 2s slot duration, used for zombienet testing
  • Zombienet tests for the new collator
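
The split into a block-building task and a collation task can be pictured with a small stand-alone sketch. This is not the code from `slot_based/mod.rs`: it uses plain threads and a `std::sync::mpsc` channel instead of the node's async tasks, and `BuiltBlock` and `run_pipeline` are hypothetical names chosen for illustration only.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message passed from the block builder to the collation task.
#[derive(Debug)]
struct BuiltBlock {
    number: u32,
    relay_parent: u32,
}

/// Runs a toy builder/collator pair and returns what the collation side "submitted".
fn run_pipeline(block_count: u32) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<BuiltBlock>();

    // Task 1: block building. Produces blocks and hands them over the channel.
    let builder = thread::spawn(move || {
        for number in 1..=block_count {
            // The relay parent numbers are made up for the example.
            let block = BuiltBlock { number, relay_parent: 100 + number };
            tx.send(block).expect("collation task is still running");
        }
        // `tx` is dropped here, which closes the channel and ends the receiver loop.
    });

    // Task 2: collation building and submission. Consumes blocks as they arrive.
    let collator = thread::spawn(move || {
        rx.iter()
            .map(|b| format!("collation for block {} at relay parent {}", b.number, b.relay_parent))
            .collect::<Vec<_>>()
    });

    builder.join().expect("builder thread panicked");
    collator.join().expect("collator thread panicked")
}
```

Calling `run_pipeline(3)` yields three entries, one per built block, in the order the builder produced them. The point of the sketch is only the one-way hand-off between the two tasks.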

Note: This collator is considered experimental and should only be used for testing and exploration for now.

Comparison with lookahead collator

  • The new variant is slot based, meaning it waits for the next slot of the parachain, then starts authoring
  • The search for potential parents remains mostly unchanged from lookahead
  • As anchor, we use the current best relay parent
  • In general, the new collator tends to be anchored one relay parent earlier. lookahead generally waits for a new relay block to arrive before it attempts to build a block. This means the actual timing of parachain blocks depends on when the relay block has been authored and imported. With the slot-triggered approach we are authoring directly on the slot boundary, where a new relay chain block has probably not yet arrived.
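
To make "waits for the next slot of the parachain" concrete, here is a minimal sketch of the timing arithmetic, assuming slots are counted from the UNIX epoch in fixed-length steps. The function names `slot_at` and `time_until_next_slot` are hypothetical and are not taken from the PR.

```rust
use std::time::Duration;

/// Index of the slot that contains `now` (time since the UNIX epoch).
fn slot_at(now: Duration, slot_duration: Duration) -> u64 {
    assert!(!slot_duration.is_zero(), "slot duration must be non-zero");
    (now.as_millis() / slot_duration.as_millis()) as u64
}

/// How long a slot-triggered collator would sleep before it starts authoring.
/// At an exact boundary this returns one full slot, i.e. it waits for the *next* slot.
fn time_until_next_slot(now: Duration, slot_duration: Duration) -> Duration {
    assert!(!slot_duration.is_zero(), "slot duration must be non-zero");
    let now_ms = now.as_millis();
    let slot_ms = slot_duration.as_millis();
    let next_boundary_ms = (now_ms / slot_ms + 1) * slot_ms;
    Duration::from_millis((next_boundary_ms - now_ms) as u64)
}
```

With the 2s slot duration of the test runtime, a timestamp of 4500 ms falls into slot 2 and the next boundary is 1500 ms away. The wake-up depends only on the clock, not on the arrival of a relay chain block, which is the difference to lookahead described above.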

Limitations

  • Overall, the current implementation focuses on the "happy path"
  • We assume that we want to collate close to the tip of the relay chain. It would be useful, however, to have some kind of configurable drift so that we could lag behind a bit. See slot-based-collator: Allow slot drift #3965.
  • The collation task is pretty dumb currently. It checks whether we have cores scheduled and, if so, submits the messages we have received from the block builder until we have something submitted for every core. Ideally we should do some extra checks, e.g. we do not need to submit if the built block is already too old (built on an out-of-range relay parent) or was authored with a relay parent that is not an ancestor of the relay block we are submitting at. See slot-based-collator: Collation & block-builder communication #3966.
  • There is no throttling; we assume that we can submit velocity blocks every relay chain block. There should be communication between the collator task and the block-builder task.
  • The parent search and ConsensusHook are not yet properly adjusted. The parent search makes assumptions about the pending candidate which no longer hold. See slot-based-collator: Adjust ConsensusHook and parent search #3967.
  • Custom triggers for block building are not implemented.
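
The extra checks mentioned for the collation task (block too old, relay parent not an ancestor) could look roughly like the sketch below. This is an illustration of the idea only, not what the PR implements: `Candidate`, `SubmitDecision`, `should_submit` and the `max_age` parameter are hypothetical, and relay block hashes are simplified to `u64`.

```rust
/// Outcome of the hypothetical pre-submission check.
#[derive(Debug, PartialEq)]
enum SubmitDecision {
    Submit,
    SkipTooOld,
    SkipNotAncestor,
}

/// A built parachain block, reduced to the relay parent it was authored on.
struct Candidate {
    relay_parent_number: u32,
    relay_parent_hash: u64,
}

/// `ancestry` holds `(number, hash)` of the relay block we submit at, followed by
/// its ancestors, newest first. `max_age` is the allowed relay parent age in blocks.
fn should_submit(candidate: &Candidate, ancestry: &[(u32, u64)], max_age: u32) -> SubmitDecision {
    // Without a known relay tip nothing can be validated, so do not submit.
    let Some(&(tip_number, _)) = ancestry.first() else {
        return SubmitDecision::SkipNotAncestor;
    };

    // Check 1: the relay parent must still be in range of the current tip.
    if tip_number.saturating_sub(candidate.relay_parent_number) > max_age {
        return SubmitDecision::SkipTooOld;
    }

    // Check 2: the relay parent must be on the chain we are submitting at
    // (the tip itself is accepted in this sketch).
    let on_chain = ancestry.iter().any(|&(number, hash)| {
        number == candidate.relay_parent_number && hash == candidate.relay_parent_hash
    });

    if on_chain {
        SubmitDecision::Submit
    } else {
        SubmitDecision::SkipNotAncestor
    }
}
```

Given an ancestry of relay blocks 10, 9 and 8 and `max_age = 2`, a candidate built on block 9 of that chain would be submitted, one built on block 5 would be skipped as too old, and one built on a different fork's block 9 would be skipped as not an ancestor.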

@skunert skunert added the T0-node and T9-cumulus labels Apr 12, 2024
@skunert skunert self-assigned this Apr 12, 2024
@skunert skunert requested review from andresilva and a team as code owners April 12, 2024 10:43
@skunert
Contributor Author

skunert commented Jul 4, 2024

bot fmt

@command-bot

command-bot bot commented Jul 4, 2024

@skunert https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6630012 was started for your command "$PIPELINE_SCRIPTS_DIR/commands/fmt/fmt.sh". Check out https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/pipelines?page=1&scope=all&username=group_605_bot to know what else is being executed currently.

Comment bot cancel 1-70747836-c687-4b63-97ad-9503f0118afa to cancel this command or bot cancel to cancel all commands in this pull request.

@command-bot

command-bot bot commented Jul 4, 2024

@skunert Command "$PIPELINE_SCRIPTS_DIR/commands/fmt/fmt.sh" has finished. Result: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6630012 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6630012/artifacts/download.

@skunert skunert enabled auto-merge July 5, 2024 08:30
@skunert skunert added this pull request to the merge queue Jul 5, 2024
Merged via the queue into master with commit e44f61a Jul 5, 2024
158 of 160 checks passed
@skunert skunert deleted the slot-based-mvp branch July 5, 2024 09:26
TomaszWaszczyk pushed a commit to TomaszWaszczyk/polkadot-sdk that referenced this pull request Jul 7, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jul 17, 2024
Resolves #4468

Gives parachain teams instructions on how to enable the elastic scaling
MVP.

Still a draft because it depends on further changes we make to the
slot-based collator:
#4097

Parachains cannot use this yet because the collator has not been released
and no relay chain network has been configured for elastic scaling yet.
paritytech-ci pushed a commit that referenced this pull request Jul 17, 2024
jpserrat pushed a commit to jpserrat/polkadot-sdk that referenced this pull request Jul 18, 2024
@Polkadot-Forum

This pull request has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/elastic-scaling-mvp-launched/9392/1

TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
…ch#4733)

- unit tests for pov-recovery
- elastic scaling support (recovering multiple candidates in a single
relay chain block)
- also some small cleanups
- also switches to candidates_pending_availability in
`handle_empty_block_announce_data`

Fixes paritytech#3577

After paritytech#4097 is merged, we
should also add a zombienet test, similar to the existing
`0002-pov_recovery.toml` but which has a single collator using elastic
scaling on multiple cores.
TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
magecnion added a commit to freeverseio/laos that referenced this pull request Sep 12, 2024
Labels
T0-node This PR/Issue is related to the topic “node”. T9-cumulus This PR/Issue is related to cumulus.
Projects
Status: done

8 participants