Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Introduce trie level cache and remove state cache #11407

Merged
merged 179 commits into from
Aug 18, 2022

Conversation

bkchr
Copy link
Member

@bkchr bkchr commented May 12, 2022

This pr introduces a trie level cache and removes the old state cache. As seen on a lot of parachains the state cache was/is buggy: paritytech/cumulus#474
We looked into it, but couldn't really identify the bug(s). Besides these bugs collator were also building blocks without using any cache: #9770 This means that every state access has gone to the database and this is costly. If you look at the benchmarks below, reading a single key is 10-8x slower than reading from the cache.

So, we started to write a new cache on the trie level. This cache would "automatically" solve the invalidation of old nodes because each node is addressed by its hash, which changes when the content changes. This was the first version of the pull request, a trie node cache. But benchmarking has shown that this trie node cache wasn't enough, the state cache was still faster. The reason for this is that the state cache has in optimum a complexity of O(1) for accessing cached data, while with the trie node cache we still need to walk down the trie which is O(log N). To bring both caches closer to each other, a value cache was introduced. This value cache also providing O(1) in the optimum case. To not run into the same problems as the state cache, the key to this cache is (storage_root, key). Using this key we ensure that we don't accidentally access outdated data, resulting in the problems as seen with the Parachains.

While introducing the trie cache, we also needed to change the way we record storage proofs. Before this pr there wasn't any cache when recording as already explained above. The recorder was between the trie and the database, recording anything that was read from the database. However, that wouldn't work with the cache as it sits "inside the trie" intercepting calls to the database as well. So, the recorder also needed to move there. The recorder also needed to change for when generating the final storage proof. Before we just collected all accessed nodes and put them together into the storage proof. However, now with the value cache we may not even access a trie node to access data in the trie and thus, we can not record a trie node. To solve this we keep track of all keys accessed in this value cache. When the storage proof is generated, we will need to walk down the trie to record all nodes to access these values. However, as we only do this once and the trie nodes are already cached (otherwise the value cache wouldn't have the data) it will not take that much time as always accessing the database as before.

To compare the different caches, there was the state-access benchmark added. With the following results:

Benchmark State cache Trie cache No cache
Reading the entire state (1Mio keys) 372ms 524ms 6.2s
Reading a single key 386ns 561ns 4.4us
Hashing :code 327ns 521ns 4.9us

As we can see the trie cache is around 1.4x slower than the state cache. This is good enough for the first implementation ;) Especially as on testing syncing speed we don't have seen any performance impacts which indicates that it is not IO bound at the moment.

This pr requires the following pr to be merged before and released: paritytech/trie#157

Closes: #9770
Closes: #9769
Closes: paritytech/cumulus#607
Closes: #9320
Closes: #9697

CLI changes

This removes the --state-cache parameter and adds the new --trie-cache-size cli parameter. Parachain operators should now be able to just drop --state-cache 0 and are not required to add the --trie-cache-size as the cache is enabled by default.

polkadot companion: paritytech/polkadot#5897
cumulus companion: paritytech/cumulus#1361

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
The job name - cargo-check-subkey
The job logs - https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/1754036

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
The job name - cargo-check-benches
The job logs - https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/1754034

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
The job name - cargo-check-benches
The job logs - https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/1754542

@bkchr
Copy link
Member Author

bkchr commented Aug 18, 2022

bot merge

@paritytech-processbot
Copy link

Error: Statuses failed for bd972a2

@bkchr
Copy link
Member Author

bkchr commented Aug 18, 2022

bot merge

@paritytech-processbot paritytech-processbot bot merged commit 00cc5f1 into master Aug 18, 2022
@paritytech-processbot paritytech-processbot bot deleted the bkchr-state-access-benchmark branch August 18, 2022 18:59
@koute koute mentioned this pull request Dec 20, 2022
@bkchr bkchr mentioned this pull request Jan 23, 2023
ark0f pushed a commit to gear-tech/substrate that referenced this pull request Feb 27, 2023
* trie state cache

* Also cache missing access on read.

* fix comp

* bis

* fix

* use has_lru

* remove local storage cache on size 0.

* No cache.

* local cache only

* trie cache and local cache

* storage cache (with local)

* trie cache no local cache

* Add state access benchmark

* Remove warnings etc

* Add trie cache benchmark

* No extra "clone" required

* Change benchmark to use multiple blocks

* Use patches

* Integrate shitty implementation

* More stuff

* Revert "Merge branch 'master' into trie_state_cache"

This reverts commit 2c1886d, reversing
changes made to 540b4fd.

* Improve benchmark

* Adapt to latest changes

* Adapt to changes in trie

* Add a test that uses iterator

* Start fixing it

* Remove obsolete file

* Make it compile

* Start rewriting the trie node cache

* More work on the cache

* More docs and code etc

* Make data cache an optional

* Tests

* Remove debug stuff

* Recorder

* Some docs and a simple test for the recorder

* Compile fixes

* Make it compile

* More fixes

* More fixes

* Fix fix fix

* Make sure cache and recorder work together for basic stuff

* Test that data caching and recording works

* Test `TrieDBMut` with caching

* Try something

* Fixes, fixes, fixes

* Forward the recorder

* Make it compile

* Use recorder in more places

* Switch to new `with_optional_recorder` fn

* Refactor and cleanups

* Move `ProvingBackend` tests

* Simplify

* Move over all functionality to the essence

* Fix compilation

* Implement estimate encoded size for StorageProof

* Start using the `cache` everywhere

* Use the cache everywhere

* Fix compilation

* Fix tests

* Adds `TrieBackendBuilder` and enhances the tests

* Ensure that recorder drain checks that values are found as expected

* Switch over to `TrieBackendBuilder`

* Start fixing the problem with child tries and recording

* Fix recording of child tries

* Make it compile

* Overwrite `storage_hash` in `TrieBackend`

* Add `storage_cache` to  the benchmarks

* Fix `no_std` build

* Speed up cache lookup

* Extend the state access benchmark to also hash a runtime

* Fix build

* Fix compilation

* Rewrite value cache

* Add lru cache

* Ensure that the cache lru works

* Value cache should not be optional

* Add support for keeping the shared node cache in its bounds

* Make the cache configurable

* Check that the cache respects the bounds

* Adds a new test

* Fixes

* Docs and some renamings

* More docs

* Start using the new recorder

* Fix more code

* Take `self` argument

* Remove warnings

* Fix benchmark

* Fix accounting

* Rip off the state cache

* Start fixing fallout after removing the state cache

* Make it compile after trie changes

* Fix test

* Add some logging

* Some docs

* Some fixups and clean ups

* Fix benchmark

* Remove unneeded file

* Use git for patching

* Make CI happy

* Update primitives/trie/Cargo.toml

Co-authored-by: Koute <[email protected]>

* Update primitives/state-machine/src/trie_backend.rs

Co-authored-by: cheme <[email protected]>

* Introduce new `AsTrieBackend` trait

* Make the LocalTrieCache not clonable

* Make it work in no_std and add docs

* Remove duplicate dependency

* Switch to ahash for better performance

* Speedup value cache merge

* Output errors on underflow

* Ensure the internal LRU map doesn't grow too much

* Use const fn to calculate the value cache element size

* Remove cache configuration

* Fix

* Clear the cache in between for more testing

* Try to come up with a failing test case

* Make the test fail

* Fix the child trie recording

* Make everything compile after the changes to trie

* Adapt to latest trie-db changes

* Fix on stable

* Update primitives/trie/src/cache.rs

Co-authored-by: cheme <[email protected]>

* Fix wrong merge

* Docs

* Fix warnings

* Cargo.lock

* Bump pin-project

* Fix warnings

* Switch to released crate version

* More fixes

* Make clippy and rustdocs happy

* More clippy

* Print error when using deprecated `--state-cache-size`

* 🤦

* Fixes

* Fix storage_hash linkings

* Update client/rpc/src/dev/mod.rs

Co-authored-by: Arkadiy Paronyan <[email protected]>

* Review feedback

* encode bound

* Rework the shared value cache

Instead of using a `u64` to represent the key we now use an `Arc<[u8]>`. This arc is also stored in
some extra `HashSet`. We store the key are in an extra `HashSet` to de-duplicate the keys accross
different storage roots. When the latest key usage is dropped in the lru, we also remove the key
from the `HashSet`.

* Improve of the cache by merging the old and new solution

* FMT

* Please stop coming back all the time :crying:

* Update primitives/trie/src/cache/shared_cache.rs

Co-authored-by: Arkadiy Paronyan <[email protected]>

* Fixes

* Make clippy happy

* Ensure we don't deadlock

* Only use one lock to simplify the code

* Do not depend on `Hasher`

* Fix tests

* FMT

* Clippy 🤦

Co-authored-by: cheme <[email protected]>
Co-authored-by: Koute <[email protected]>
Co-authored-by: Arkadiy Paronyan <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit
Projects
None yet
5 participants