[1 / 5] Optimize logic for gossiping assignments #4848

alexggh · 2024-06-20T11:57:57Z

This is part of the work to further optimize the approval subsystems, if you want to understand the full context start with reading #4849 (comment), however that's not necessary, as this change is self-contained and nodes would benefit from it regardless of subsequent changes landing or not.

While testing with 1000 validators I found out that the logic for determining the validators an assignment should be gossiped to is taking a lot of time, because it always iterated through all the peers, to determine which are X and Y neighbours and to which we should randomly gossip(4 samples).

This could be actually optimised, so we don't have to iterate through all peers for each new assignment, by fetching the list of X and Y peer ids from the topology first and then stopping the loop once we took the 4 random samples.

With this improvements we reduce the total CPU time spent in approval-distribution with 15% on networks with 500 validators and 20% on networks with 1000 validators.

Test coverage:

propagates_assignments_along_unshared_dimension and propagates_locally_generated_assignment_to_both_dimensions cover already logic and they passed, confirm that there is no breaking change.

Additionally, the approval voting benchmark measure the traffic sent to other peers, so I confirmed that for various network size there is no difference in the size of the traffic sent to other peers.

Signed-off-by: Alexandru Gheorghe <[email protected]>

paritytech-cicd-pr · 2024-07-02T12:51:37Z

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: test-linux-stable 2/3
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6605079

alindima

Nice!

ordian

great find!

sandreim

Nice work!

AndreiEres

Great job!

polkadot/node/network/protocol/src/grid_topology.rs

Signed-off-by: Alexandru Gheorghe <[email protected]>

alexggh · 2024-07-16T08:18:17Z

Double checked that things are working as expected on versi as well. Merging it now.

* master: add elastic scaling MVP guide (#4663) Send PeerViewChange with high priority (#4755) [ci] Update forklift in CI image (#5032) Adjust base value for statement-distribution regression tests (#5028) [pallet_contracts] Add support for transient storage in contracts host functions (#4566) [1 / 5] Optimize logic for gossiping assignments (#4848) Remove `pallet-getter` usage from pallet-session (#4972) command-action: added scoped permissions to the github tokens (#5016) net/litep2p: Propagate ValuePut events to the network backend (#5018) rpc: add back rpc logger (#4952) Updated substrate-relay version for tests (#5017) Remove most all usage of `sp-std` (#5010) Use sp_runtime::traits::BadOrigin (#5011)

This is part of the work to further optimize the approval subsystems, if you want to understand the full context start with reading paritytech#4849 (comment), however that's not necessary, as this change is self-contained and nodes would benefit from it regardless of subsequent changes landing or not. While testing with 1000 validators I found out that the logic for determining the validators an assignment should be gossiped to is taking a lot of time, because it always iterated through all the peers, to determine which are X and Y neighbours and to which we should randomly gossip(4 samples). This could be actually optimised, so we don't have to iterate through all peers for each new assignment, by fetching the list of X and Y peer ids from the topology first and then stopping the loop once we took the 4 random samples. With this improvements we reduce the total CPU time spent in approval-distribution with 15% on networks with 500 validators and 20% on networks with 1000 validators. ## Test coverage: `propagates_assignments_along_unshared_dimension` and `propagates_locally_generated_assignment_to_both_dimensions` cover already logic and they passed, confirm that there is no breaking change. Additionally, the approval voting benchmark measure the traffic sent to other peers, so I confirmed that for various network size there is no difference in the size of the traffic sent to other peers. --------- Signed-off-by: Alexandru Gheorghe <[email protected]>

* master: (125 commits) add elastic scaling MVP guide (#4663) Send PeerViewChange with high priority (#4755) [ci] Update forklift in CI image (#5032) Adjust base value for statement-distribution regression tests (#5028) [pallet_contracts] Add support for transient storage in contracts host functions (#4566) [1 / 5] Optimize logic for gossiping assignments (#4848) Remove `pallet-getter` usage from pallet-session (#4972) command-action: added scoped permissions to the github tokens (#5016) net/litep2p: Propagate ValuePut events to the network backend (#5018) rpc: add back rpc logger (#4952) Updated substrate-relay version for tests (#5017) Remove most all usage of `sp-std` (#5010) Use sp_runtime::traits::BadOrigin (#5011) network/tx: Ban peers with tx that fail to decode (#5002) Try State Hook for Bounties (#4563) [statement-distribution] Add metrics for distributed statements in V2 (#4554) added sync command (#4818) Bridges V2 refactoring backport and `pallet_bridge_messages` simplifications (#4935) xcm-executor: Improve logging (#4996) Remove usage of `sp-std` on templates (#5001) ...

This is part of the work to further optimize the approval subsystems, if you want to understand the full context start with reading paritytech#4849 (comment), however that's not necessary, as this change is self-contained and nodes would benefit from it regardless of subsequent changes landing or not. While testing with 1000 validators I found out that the logic for determining the validators an assignment should be gossiped to is taking a lot of time, because it always iterated through all the peers, to determine which are X and Y neighbours and to which we should randomly gossip(4 samples). This could be actually optimised, so we don't have to iterate through all peers for each new assignment, by fetching the list of X and Y peer ids from the topology first and then stopping the loop once we took the 4 random samples. With this improvements we reduce the total CPU time spent in approval-distribution with 15% on networks with 500 validators and 20% on networks with 1000 validators. ## Test coverage: `propagates_assignments_along_unshared_dimension` and `propagates_locally_generated_assignment_to_both_dimensions` cover already logic and they passed, confirm that there is no breaking change. Additionally, the approval voting benchmark measure the traffic sent to other peers, so I confirmed that for various network size there is no difference in the size of the traffic sent to other peers. --------- Signed-off-by: Alexandru Gheorghe <[email protected]>

* master: (130 commits) add elastic scaling MVP guide (#4663) Send PeerViewChange with high priority (#4755) [ci] Update forklift in CI image (#5032) Adjust base value for statement-distribution regression tests (#5028) [pallet_contracts] Add support for transient storage in contracts host functions (#4566) [1 / 5] Optimize logic for gossiping assignments (#4848) Remove `pallet-getter` usage from pallet-session (#4972) command-action: added scoped permissions to the github tokens (#5016) net/litep2p: Propagate ValuePut events to the network backend (#5018) rpc: add back rpc logger (#4952) Updated substrate-relay version for tests (#5017) Remove most all usage of `sp-std` (#5010) Use sp_runtime::traits::BadOrigin (#5011) network/tx: Ban peers with tx that fail to decode (#5002) Try State Hook for Bounties (#4563) [statement-distribution] Add metrics for distributed statements in V2 (#4554) added sync command (#4818) Bridges V2 refactoring backport and `pallet_bridge_messages` simplifications (#4935) xcm-executor: Improve logging (#4996) Remove usage of `sp-std` on templates (#5001) ...

@ordian

This is the implementation of the approach described here: #1617 (comment) & #1617 (comment) & #1617 (comment). ## Description of changes The end goal is to have an architecture where we have single subsystem(`approval-voting-parallel`) and multiple worker types that would full-fill the work that currently is fulfilled by the `approval-distribution` and `approval-voting` subsystems. The main loop of the new subsystem would do just the distribution of work to the workers. The new subsystem will have: - N approval-distribution workers: This would do the work that is currently being done by the approval-distribution subsystem and in addition to that will also perform the crypto-checks that an assignment is valid and that a vote is correctly signed. Work is assigned via the following formula: `worker_index = msg.validator % WORKER_COUNT`, this guarantees that all assignments and approvals from the same validator reach the same worker. - 1 approval-voting worker: This would receive an already valid message and do everything the approval-voting currently does, except the crypto-checking that has been moved already to the approval-distribution worker. On the hot path of processing messages **no** synchronisation and waiting is needed between approval-distribution and approval-voting workers. <img width="1431" alt="Screenshot 2024-06-07 at 11 28 08" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/a196199b-b705-4140-87d4-c6900ba8595e"> ## Guidelines for reading The full implementation is broken in 5 PRs and all of them are self-contained and improve things incrementally even without the parallelisation being implemented/enabled, the reason this approach was taken instead of a big-bang PR, is to make things easier to review and reduced the risk of breaking this critical subsystems. After reading the full description of this PR, the changes should be read in the following order: 1. #4848, some other micro-optimizations for networks with a high number of validators. This change gives us a speed up by itself without any other changes. 2. #4845 , this contains only interface changes to decouple the subsystem from the `Context` and be able to run multiple instances of the subsystem on different threads. **No functional changes** 3. #4928, moving of the crypto checks from approval-voting in approval-distribution, so that the approval-distribution has no reason to wait after approval-voting anymore. This change gives us a speed up by itself without any other changes. 4. #4846, interface changes to make approval-voting runnable on a separate thread. **No functional changes** 5. This PR, where we instantiate an `approval-voting-parallel` subsystem that runs on different workers the logic currently in `approval-distribution` and `approval-voting`. 6. The next step after this changes get merged and deploy would be to bring all the files from approval-distribution, approval-voting, approval-voting-parallel into a single rust crate, to make it easier to maintain and understand the structure. ## Results Running subsystem-benchmarks with 1000 validators 100 fully ocuppied cores and triggering all assignments and approvals for all tranches #### Approval does not lags behind. Master ``` Chain selection approved after 72500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a ``` With this PoC ``` Chain selection approved after 3500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a ``` #### Gathering enough assignments Enough assignments are gathered in less than 500ms, so that gives un a guarantee that un-necessary work does not get triggered, on master on the same benchmark because the subsystems fall behind on work, that number goes above 32 seconds on master. <img width="2240" alt="Screenshot 2024-06-20 at 15 48 22" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/d2f2b29c-5ff6-44b4-a245-5b37ab8e58bc"> #### Cpu usage: Master ``` CPU usage, seconds total per block approval-distribution 96.9436 9.6944 approval-voting 117.4676 11.7468 test-environment 44.0092 4.4009 ``` With this PoC ``` CPU usage, seconds total per block approval-distribution 0.0014 0.0001 --- unused approval-voting 0.0437 0.0044. --- unused approval-voting-parallel 5.9560 0.5956 approval-voting-parallel-0 22.9073 2.2907 approval-voting-parallel-1 23.0417 2.3042 approval-voting-parallel-2 22.0445 2.2045 approval-voting-parallel-3 22.7234 2.2723 approval-voting-parallel-4 21.9788 2.1979 approval-voting-parallel-5 23.0601 2.3060 approval-voting-parallel-6 22.4805 2.2481 approval-voting-parallel-7 21.8330 2.1833 approval-voting-parallel-db 37.1954 3.7195. --- the approval-voting thread. ``` # Enablement strategy Because just some trivial plumbing is needed in approval-distribution and approval-voting to be able to run things in parallel and because this subsystems plays a critical part in the system this PR proposes that we keep both ways of running the approval work, as separated subsystems and just a single subsystem(`approval-voting-parallel`) which has multiple workers for the distribution work and one worker for the approval-voting work and switch between them with a comandline flag. The benefits for this is twofold. 1. With the same polkadot binary we can easily switch just a few validators to use the parallel approach and gradually make this the default way of running, if now issues arise. 2. In the worst case scenario were it becomes the default way of running things, but we discover there are critical issues with it we have the path to quickly disable it by asking validators to adjust their command line flags. # Next steps - [x] Make sure through various testing we are not missing anything - [x] Polish the implementations to make them production ready - [x] Add Unittest Tests for approval-voting-parallel. - [x] Define and implement the strategy for rolling this change, so that the blast radius is minimal(single validator) in case there are problems with the implementation. - [x] Versi long running tests. - [x] Add relevant metrics. @ordian @eskimor @sandreim @AndreiEres, let me know what you think. --------- Signed-off-by: Alexandru Gheorghe <[email protected]>

alexggh mentioned this pull request Jun 20, 2024

[5 / 5] Introduce approval-voting-parallel #4849

Merged

6 tasks

alexggh changed the title ~~Optimize logic for gossiping assignments~~ [4 / 5] Optimize logic for gossiping assignments Jun 20, 2024

alexggh changed the base branch from alexaggh/approval-voting-parallel-3-5 to master July 2, 2024 11:39

alexggh force-pushed the alexaggh/approval-voting-parallel-4-5 branch from cb57906 to 4b3f489 Compare July 2, 2024 11:40

alexggh changed the title ~~[4 / 5] Optimize logic for gossiping assignments~~ [1 / 5] Optimize logic for gossiping assignments Jul 2, 2024

alexggh force-pushed the alexaggh/approval-voting-parallel-4-5 branch 2 times, most recently from 7add070 to 4b3f489 Compare July 2, 2024 12:01

alexggh marked this pull request as ready for review July 2, 2024 12:24

alexggh added the T0-node This PR/Issue is related to the topic “node”. label Jul 2, 2024

Optimize logic for gossiping assignments

713aef4

Signed-off-by: Alexandru Gheorghe <[email protected]>

alexggh force-pushed the alexaggh/approval-voting-parallel-4-5 branch from 4b3f489 to 713aef4 Compare July 2, 2024 12:29

alindima approved these changes Jul 9, 2024

View reviewed changes

ordian approved these changes Jul 9, 2024

View reviewed changes

sandreim approved these changes Jul 9, 2024

View reviewed changes

AndreiEres reviewed Jul 10, 2024

View reviewed changes

polkadot/node/network/protocol/src/grid_topology.rs Show resolved Hide resolved

AndreiEres approved these changes Jul 10, 2024

View reviewed changes

Add prdoc

e6f8b69

Signed-off-by: Alexandru Gheorghe <[email protected]>

alexggh enabled auto-merge July 16, 2024 08:18

Merge branch 'master' into alexaggh/approval-voting-parallel-4-5

22e3aff

alexggh added this pull request to the merge queue Jul 16, 2024

Merged via the queue into master with commit cde2eb4 Jul 16, 2024
155 of 160 checks passed

alexggh deleted the alexaggh/approval-voting-parallel-4-5 branch July 16, 2024 10:16

This was referenced Aug 21, 2024

Update polkadot-sdk from v1.11.0 to stable2407 moondance-labs/tanssi#659

Open

Update polkadot-sdk from v1.11.0 to stable2407 moonbeam-foundation/moonbeam#2912

Closed

TDemeco mentioned this pull request Oct 7, 2024

feat: ⏫ upgrade to Polkadot SDK stable2407 Moonsong-Labs/storage-hub#222

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[1 / 5] Optimize logic for gossiping assignments #4848

[1 / 5] Optimize logic for gossiping assignments #4848

alexggh commented Jun 20, 2024 •

edited

Loading

paritytech-cicd-pr commented Jul 2, 2024

alindima left a comment

ordian left a comment

sandreim left a comment

AndreiEres left a comment

alexggh commented Jul 16, 2024

[1 / 5] Optimize logic for gossiping assignments #4848

[1 / 5] Optimize logic for gossiping assignments #4848

Conversation

alexggh commented Jun 20, 2024 • edited Loading

Test coverage:

paritytech-cicd-pr commented Jul 2, 2024

alindima left a comment

Choose a reason for hiding this comment

ordian left a comment

Choose a reason for hiding this comment

sandreim left a comment

Choose a reason for hiding this comment

AndreiEres left a comment

Choose a reason for hiding this comment

alexggh commented Jul 16, 2024

alexggh commented Jun 20, 2024 •

edited

Loading