Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Performance regression in block import after upgrading to polkadot 0.9.24 #12320

Closed
DylanVerstraete opened this issue Sep 21, 2022 · 13 comments
Closed
Assignees
Labels
J2-unconfirmed Issue might be valid, but it’s not yet known.

Comments

@DylanVerstraete
Copy link

DylanVerstraete commented Sep 21, 2022

We noticed a huge drop in performance when trying to sync an archive node on our existing networks with a node that has polkadot-0.9.24 depedencies. With around 5-8 peers the speed comes down to 25-35 blocks per second.

On substrate V3 the block import speed was around 100-200 blocks per second.

V3 node output:

2022-09-21 15:01:51  ⚙️  Syncing 138.6 bps, target=#2418956 (6 peers), best: #5110 (0x5df0…b1a8), finalized #4608 (0x9d01…9736), ⬇ 109.2kiB/s ⬆ 2.6kiB/s    
2022-09-21 15:01:56  ⚙️  Syncing 135.8 bps, target=#2418957 (6 peers), best: #5789 (0x5a8b…fcfe), finalized #5632 (0xc521…f20f), ⬇ 0.9kiB/s ⬆ 0.3kiB/s    
2022-09-21 15:02:01  ⚙️  Syncing 130.2 bps, target=#2418958 (6 peers), best: #6440 (0xced8…f40e), finalized #6144 (0x77b4…0c34), ⬇ 5.0kiB/s ⬆ 4.4kiB/s    
2022-09-21 15:02:06  ⚙️  Syncing 126.6 bps, target=#2418959 (6 peers), best: #7073 (0xfcb6…5719), finalized #6656 (0xa670…97ed), ⬇ 65.0kiB/s ⬆ 2.9kiB/s    
2022-09-21 15:02:11  ⚙️  Syncing 121.4 bps, target=#2418959 (6 peers), best: #7680 (0x667e…52e0), finalized #7680 (0x667e…52e0), ⬇ 0.4kiB/s ⬆ 0.7kiB/s    
2022-09-21 15:02:16  ⚙️  Syncing 114.6 bps, target=#2418960 (6 peers), best: #8253 (0x6418…49ab), finalized #8192 (0x9c83…c5fd), ⬇ 0.6kiB/s ⬆ 0.2kiB/s    
2022-09-21 15:02:21  ⚙️  Syncing 111.8 bps, target=#2418961 (6 peers), best: #8812 (0x076b…da76), finalized #8704 (0x1676…2515), ⬇ 524.2kiB/s ⬆ 17.8kiB/s    
2022-09-21 15:02:26  ⚙️  Syncing 108.2 bps, target=#2418962 (6 peers), best: #9353 (0x40ac…82e4), finalized #9216 (0xac74…eec9), ⬇ 3.0kiB/s ⬆ 1.4kiB/s    
2022-09-21 15:02:31  ⚙️  Syncing 105.4 bps, target=#2418963 (6 peers), best: #9880 (0xfd60…a549), finalized #9728 (0xc71f…ec9b), ⬇ 4.5kiB/s ⬆ 3.7kiB/s  

Polkadot 0.9.24 output:

2022-09-21 15:21:57 ⚙️  Syncing 23.6 bps, target=#2419157 (6 peers), best: #4913 (0x3ccd…2eb0), finalized #4608 (0x9d01…9736), ⬇ 2.8kiB/s ⬆ 4.0kiB/s    
2022-09-21 15:22:02 ⚙️  Syncing 20.0 bps, target=#2419158 (6 peers), best: #5013 (0xece6…de66), finalized #4608 (0x9d01…9736), ⬇ 2.4kiB/s ⬆ 2.3kiB/s    
2022-09-21 15:22:07 ⚙️  Syncing 26.0 bps, target=#2419159 (6 peers), best: #5143 (0x1dc8…8b3a), finalized #5120 (0x766f…661f), ⬇ 1.5kiB/s ⬆ 1.8kiB/s    
2022-09-21 15:22:12 ⚙️  Syncing 27.6 bps, target=#2419160 (6 peers), best: #5281 (0x457a…d010), finalized #5120 (0x766f…661f), ⬇ 3.4kiB/s ⬆ 4.9kiB/s    
2022-09-21 15:22:17 ⚙️  Syncing 31.2 bps, target=#2419160 (6 peers), best: #5437 (0xa503…c895), finalized #5120 (0x766f…661f), ⬇ 33.1kiB/s ⬆ 3.4kiB/s    
2022-09-21 15:22:22 ⚙️  Syncing 29.2 bps, target=#2419161 (6 peers), best: #5583 (0x444d…f831), finalized #5120 (0x766f…661f), ⬇ 2.7kiB/s ⬆ 1.9kiB/s    
2022-09-21 15:22:27 ⚙️  Syncing 27.4 bps, target=#2419162 (6 peers), best: #5720 (0xaa40…1234), finalized #5632 (0xc521…f20f), ⬇ 2.7kiB/s ⬆ 2.6kiB/s    
2022-09-21 15:22:32 ⚙️  Syncing 27.6 bps, target=#2419163 (6 peers), best: #5858 (0xc038…9b78), finalized #5632 (0xc521…f20f), ⬇ 2.4kiB/s ⬆ 2.0kiB/s    
2022-09-21 15:22:37 ⚙️  Syncing 26.6 bps, target=#2419164 (6 peers), best: #5991 (0xf9ff…f28d), finalized #5632 (0xc521…f20f), ⬇ 41.1kiB/s ⬆ 4.0kiB/s    
@github-actions github-actions bot added the J2-unconfirmed Issue might be valid, but it’s not yet known. label Sep 21, 2022
@DylanVerstraete
Copy link
Author

After making our node compatible with polkadot-0.9.29 depedencies, block import speed improved by 50% to 45-60 blocks per second

2022-09-21 17:33:42 ⚙️  Syncing 60.2 bps, target=#2420475 (4 peers), best: #2751 (0xdb9e…6a7e), finalized #2560 (0xb0c5…1cab), ⬇ 14.9kiB/s ⬆ 1.8kiB/s    
2022-09-21 17:33:47 ⚙️  Syncing 59.8 bps, target=#2420475 (4 peers), best: #3050 (0xb616…665b), finalized #2560 (0xb0c5…1cab), ⬇ 24.0kiB/s ⬆ 2.4kiB/s    
2022-09-21 17:33:52 ⚙️  Syncing 50.8 bps, target=#2420476 (4 peers), best: #3304 (0x26d3…23cb), finalized #3072 (0x72cf…d65f), ⬇ 2.4kiB/s ⬆ 1.7kiB/s    
2022-09-21 17:33:57 ⚙️  Syncing 53.4 bps, target=#2420477 (4 peers), best: #3571 (0x64bd…7724), finalized #3072 (0x72cf…d65f), ⬇ 16.7kiB/s ⬆ 0.5kiB/s    
2022-09-21 17:34:02 ⚙️  Syncing 51.0 bps, target=#2420478 (6 peers), best: #3826 (0x9df4…54ae), finalized #3584 (0x7ae9…5c86), ⬇ 55.5kiB/s ⬆ 13.1kiB/s    
2022-09-21 17:34:07 ⚙️  Syncing 53.6 bps, target=#2420479 (6 peers), best: #4094 (0x06ad…a051), finalized #3584 (0x7ae9…5c86), ⬇ 0.6kiB/s ⬆ 0.3kiB/s    
2022-09-21 17:34:12 ⚙️  Syncing 55.0 bps, target=#2420480 (6 peers), best: #4369 (0xb180…3e0e), finalized #4096 (0x23ba…8ae8), ⬇ 28.2kiB/s ⬆ 2.0kiB/s    
2022-09-21 17:34:17 ⚙️  Syncing 57.0 bps, target=#2420480 (6 peers), best: #4654 (0xcc55…301d), finalized #4608 (0x9d01…9736), ⬇ 1.3kiB/s ⬆ 1.7kiB/s    
2022-09-21 17:34:22 ⚙️  Syncing 56.4 bps, target=#2420481 (6 peers), best: #4936 (0x9d08…5e97), finalized #4608 (0x9d01…9736), ⬇ 34.0kiB/s ⬆ 1.9kiB/s    

@DylanVerstraete
Copy link
Author

DylanVerstraete commented Sep 22, 2022

@bkchr
Copy link
Member

bkchr commented Sep 22, 2022

On substrate V3 the block import speed was around 100-200 blocks per second.

Can you give an exact commit? And can you also link to the different versions of your node you are speaking about?

@DylanVerstraete
Copy link
Author

DylanVerstraete commented Sep 22, 2022

Can you give an exact commit? And can you also link to the different versions of your node you are speaking about?

Substrate V3 node: https://github.com/threefoldtech/tfchain/tree/fix_devnet/substrate-node (compile with cargo +nightly-2021-06-09 build --release)
Substrate polkadot-0.9.24 node: https://github.com/threefoldtech/tfchain/tree/2.1.0/substrate-node

@bkchr
Copy link
Member

bkchr commented Sep 22, 2022

And your V3 wasn't just faster because you used the native runtime?

@DylanVerstraete
Copy link
Author

No I don't think so.

@bkchr
Copy link
Member

bkchr commented Sep 22, 2022

Okay. @koute maybe something you could take a look at some point? The regression seems to be relative high.

@bkchr
Copy link
Member

bkchr commented Sep 22, 2022

@DylanVerstraete it would be nice to post the exact commands you used to launch your node.

@DylanVerstraete
Copy link
Author

@DylanVerstraete it would be nice to post the exact commands you used to launch your node.

./target/release/tfchain --chain chainspecs/dev/chainSpecRaw.json --pruning=archive -d /tmp/dev

@koute
Copy link
Contributor

koute commented Sep 23, 2022

Sure, I can take a look at this.

@koute
Copy link
Contributor

koute commented Oct 13, 2022

@DylanVerstraete Here's a patch which will speed up your blocks/sec when syncing by 10x from ~50 bps to ~500 bps:

diff --git a/substrate-node/node/Cargo.toml b/substrate-node/node/Cargo.toml
index f3ebc12..1fac895 100644
--- a/substrate-node/node/Cargo.toml
+++ b/substrate-node/node/Cargo.toml
@@ -37,7 +37,7 @@ frame-benchmarking-cli = {git = "https://github.com/paritytech/substrate.git", b
 pallet-transaction-payment-rpc = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
 pallet-transaction-payment = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
 sc-basic-authorship = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
-sc-cli = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
+sc-cli = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24", features = ["wasmtime"]}
 sc-client-api = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
 sc-executor = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}
 sc-service = {git = "https://github.com/paritytech/substrate", branch = "polkadot-v0.9.24"}

Basically, you were using the interpreted WASM executor. If you switch to the compiled executor (which is the default, as long as it's enabled at compile time) you can not only regain all of that lost performance but also significantly speed up your syncing compared to what it was.

It is true that there is a performance regression when running with the interpreted executor; compared to the old node (which is using wasmi 0.6.2) the new node (which is using wasmi 0.9.1) is spending the majority of its time (~82%) erasing the linear memory with memset (which the old node did not). There's not much point in investigating this any further as even the "newer" version of wasmi is over a year old at this point, and you're not really supposed to run your chain on the interpreted executor anymore anyway. (This reminds me, we should make the wasmtime-based executor enabled by default; there's no point in it being disabled by default anymore. I'll whip up a PR to prevent these kinds of cases in the future.)

@DylanVerstraete
Copy link
Author

@koute verified, thank!

@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/polkadot-release-analysis/1026/2

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
J2-unconfirmed Issue might be valid, but it’s not yet known.
Projects
None yet
Development

No branches or pull requests

4 participants