
availability-recovery: bump chunk fetch threshold to 1MB for Polkadot and 4MB for Kusama + testnets #4399

Merged: 4 commits into master on May 24, 2024

Conversation

@sandreim (Contributor) commented May 7, 2024

This change minimizes the CPU usage we spend in reed-solomon by doing the re-encoding into chunks only if the PoV size is less than 4MB (which currently means all PoVs).

Based on subsystem benchmark results we concluded that it is safe to bump this number higher. In the worst-case scenario, the network pressure for a backing group of 5 is around 25% of the network bandwidth in the hardware specs.

Assuming 6s block times (max_candidate_depth 3) and needed_approvals 30, the bandwidth usage of a backing group would hover around `30 * 4 * 3 = 360MB` per relay chain block. Given a backing group of 5, that gives 72MB per block per validator -> 12 MB/s.
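The arithmetic above can be double-checked with a quick back-of-the-envelope sketch. This is illustrative only, not code from this PR; the parameter names are taken from the description:

```rust
// Back-of-the-envelope check of the worst-case bandwidth figures quoted
// in the PR description. All numbers come from that description, not
// from the polkadot-sdk codebase.
fn main() {
    let needed_approvals = 30u64; // approval checkers fetching each PoV
    let pov_size_mb = 4u64; // proposed chunk-fetch threshold (MB)
    let max_candidate_depth = 3u64; // candidates per relay chain block
    let backing_group_size = 5u64;
    let block_time_s = 6u64;

    // Total data served by the backing group per relay chain block.
    let per_block_mb = needed_approvals * pov_size_mb * max_candidate_depth;
    assert_eq!(per_block_mb, 360);

    // Spread evenly across the backing group.
    let per_validator_mb = per_block_mb / backing_group_size;
    assert_eq!(per_validator_mb, 72);

    // Sustained upload rate per backer at 6s block times.
    let mb_per_s = per_validator_mb / block_time_s;
    assert_eq!(mb_per_s, 12);

    println!("{per_block_mb} MB/block, {per_validator_mb} MB/validator, {mb_per_s} MB/s");
}
```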

Reality check on Kusama PoV sizes (see chart: Screenshot 2024-05-07 at 14 30 38)

Signed-off-by: Andrei Sandu <[email protected]>
@sandreim sandreim added R0-silent Changes should not be mentioned in any release notes T0-node This PR/Issue is related to the topic “node”. labels May 7, 2024
@ordian (Member) commented May 8, 2024

> Based on subsystem benchmark results we concluded that it is safe to bump this number higher. In the worst-case scenario, the network pressure for a backing group of 5 is around 25% of the network bandwidth in the hardware specs.

Currently, the network up(load) requirements are quite low in this regard and making everyone download from backers in all the cases would change that quite a bit (+160MBit/s with backing group size of 3 assuming all 3 have the PoV and not just 2). I would also like to see this tested in the presence of disputes, when every paravalidator needs to download the PoV, not just 30.

To be on the safer side, would it be possible to gate this change to be Kusama only? Or start with a lower limit, e.g. 1MB.

@sandreim (Contributor, author) commented May 8, 2024

> > Based on subsystem benchmark results we concluded that it is safe to bump this number higher. In the worst-case scenario, the network pressure for a backing group of 5 is around 25% of the network bandwidth in the hardware specs.
>
> Currently, the network up(load) requirements are quite low in this regard and making everyone download from backers

Are these asymmetric upload/download specs documented anywhere? The specs I am looking at say:

> The minimum symmetric networking speed is set to 500 Mbit/s (= 62.5 MB/s). This is required to support a large number of parachains and allow for proper congestion control in busy network situations.

> in all the cases would change that quite a bit (+160MBit/s with backing group size of 3 assuming all 3 have the PoV and not just 2). I would also like to see this tested in the presence of disputes, when every paravalidator needs to download the PoV, not just 30.

Yeah, with a backing group of 3, the worst-case load is 50% of this network bandwidth. However, in the case of disputes the backers would be hammered, so PoV downloads would fall back to chunks; we should be fine, as this is an exceptional situation. We can definitely try this scenario with subsystem benchmarks, or are you suggesting we do a Versi test?

> To be on the safer side, would it be possible to gate this change to be Kusama only? Or start with a lower limit, e.g. 1MB.

This is doable per chain. I'd go for that rather than 1MB.
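A per-chain gate could look roughly like the following sketch. The `Chain` enum, constant, and function names here are hypothetical illustrations, not the identifiers used in polkadot-sdk:

```rust
// Hypothetical sketch of gating the chunk-fetch threshold per chain.
// Below the threshold, recovery fetches the full PoV from the backing
// group; at or above it, recovery fetches erasure chunks instead.
#[derive(Clone, Copy)]
enum Chain {
    Polkadot,
    Kusama,
    Testnet,
}

const MIB: usize = 1024 * 1024;

fn fetch_chunks_threshold(chain: Chain) -> usize {
    match chain {
        // Conservative limit on Polkadot, per review feedback.
        Chain::Polkadot => MIB, // 1 MiB
        // More aggressive limit on Kusama and all testnets.
        Chain::Kusama | Chain::Testnet => 4 * MIB,
    }
}

fn should_fetch_chunks(chain: Chain, pov_size: usize) -> bool {
    pov_size >= fetch_chunks_threshold(chain)
}

fn main() {
    // A 2 MiB PoV triggers chunk recovery on Polkadot but not on Kusama.
    assert!(should_fetch_chunks(Chain::Polkadot, 2 * MIB));
    assert!(!should_fetch_chunks(Chain::Kusama, 2 * MIB));
}
```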

@burdges commented May 8, 2024

> However, in the case of disputes the backers would be hammered

Yes, we're worried about variance here, since the averages stay the same. We'll presumably need the tit-for-tat game in availability rewards eventually, since you'd save lots by just not helping others check.

@sandreim (Contributor, author) commented

Switched to 1MB on Polkadot and 4MB on Kusama + all testnets. @ordian PTAL.

@sandreim sandreim requested review from ordian and alindima May 24, 2024 10:07
@paritytech-cicd-pr commented
The CI pipeline was cancelled due to the failure of one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6281150

Signed-off-by: Andrei Sandu <[email protected]>
@alindima (Contributor) left a comment


Looks reasonable! Don't forget to update the PR description and title (they will be used for the commit message and are outdated).

@sandreim sandreim changed the title availability-recovery: bump chunk fetch threshold to 4MB availability-recovery: bump chunk fetch threshold to 1MB for Polkadot and 4MB for Kusama + testnets May 24, 2024
@sandreim sandreim added this pull request to the merge queue May 24, 2024
Merged via the queue into master with commit f469fbf May 24, 2024
149 of 152 checks passed
@sandreim sandreim deleted the sandreim/bump_chunks_fetch_threshold branch May 24, 2024 14:40
hitchhooker pushed a commit to ibp-network/polkadot-sdk that referenced this pull request Jun 5, 2024
… and 4MB for Kusama + testnets (paritytech#4399)

TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
… and 4MB for Kusama + testnets (paritytech#4399)

5 participants