Append overlay optimization. #1223

Merged: 59 commits merged into paritytech:master on Jun 11, 2024
Conversation

@cheme (Contributor) commented Aug 28, 2023

This branch proposes to avoid clones in `append` by storing the offset and size of the value from the previous overlay depth.
That way, on rollback we can simply truncate the value and restore its previous size.
To avoid copies, this also means that (see the sketch after this list):

  • append on a new overlay layer when a value already exists: create a new `Append` entry with the previous offsets, and take the memory of the previous overlay value.
  • rollback of an append: restore the value by applying the offsets and put it back into the previous overlay value.
  • commit of an append: the appended value overwrites the previous value (which is an empty vec, as its memory was taken). The offsets of the committed layer are dropped; if there are offsets in the previous overlay layer, they are kept.
  • set value (or remove) while append offsets are present: the current appended value is moved back into the previous overlay value with the offsets applied, and the current (now empty) entry is overwritten (no offsets kept).

The `modify` mechanism is no longer needed.
This branch lacks testing and breaks some existing genericity (a bit of duplicated code), but it is good enough to check the direction.

Generally I am not sure whether this is worth it, or whether we should rather favor different directions (transient blob storage, for instance), as the current append mechanism is a bit tricky (having a variable-length prefix in first position means we sometimes need to insert at the front of a vector).

Fix #30.
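
To make the bookkeeping above concrete, here is a minimal, self-contained sketch of the idea. This is not the actual overlay code from this PR: `AppendEntry`, its fields, and the methods below are illustrative assumptions, and only a single layer of nesting is modelled (the real overlay tracks offsets per transactional depth). It shows how remembering the parent layer's length lets a rollback simply truncate the buffer instead of keeping a cloned copy.

```rust
// Illustrative sketch only: the real overlay in sp-state-machine is more
// involved. All names here are hypothetical.
struct AppendEntry {
    /// Concatenation of all appended (already encoded) elements.
    data: Vec<u8>,
    /// Length of `data` as it was in the parent transactional layer,
    /// `None` if the value did not exist there.
    parent_len: Option<usize>,
}

impl AppendEntry {
    /// Start a new transactional layer: instead of cloning `data`,
    /// remember how long it was when the layer was opened.
    fn open_layer(&mut self) {
        self.parent_len = Some(self.data.len());
    }

    /// Append an already encoded element in the current layer.
    fn append(&mut self, element: &[u8]) {
        self.data.extend_from_slice(element);
    }

    /// Roll the current layer back: truncate to the remembered size.
    fn rollback(&mut self) {
        match self.parent_len.take() {
            Some(len) => self.data.truncate(len),
            // The value did not exist in the parent layer.
            None => self.data.clear(),
        }
    }

    /// Commit the current layer: the appended data becomes the parent's
    /// value, so the remembered size can simply be dropped.
    fn commit(&mut self) {
        self.parent_len = None;
    }
}

fn main() {
    let mut entry = AppendEntry { data: vec![1, 2], parent_len: None };
    entry.open_layer();
    entry.append(&[3, 4]);
    entry.rollback();
    // The appended bytes are gone without ever cloning the original value.
    assert_eq!(entry.data, vec![1, 2]);
}
```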

@paritytech-ci paritytech-ci requested review from a team August 28, 2023 15:28
@cheme cheme added A0-needs_burnin Pull request needs to be tested on a live validator node before merge. DevOps is notified via matrix T0-node This PR/Issue is related to the topic “node”. I2-bug The node fails to follow expected behavior. labels Aug 28, 2023
@cheme cheme requested a review from koute as a code owner November 9, 2023 07:55
@cheme cheme assigned bkchr and unassigned bkchr Jan 15, 2024
@cheme cheme requested a review from a team January 15, 2024 15:22
@cheme (Contributor, Author) commented Jan 15, 2024

@ core devs: this PR needs reviewers; anyone familiar with the commit overlay internals, or who wants to become familiar with them, is very welcome.

substrate/primitives/state-machine/src/basic.rs (3 outdated review threads, resolved)
substrate/primitives/state-machine/src/lib.rs (outdated review thread, resolved)
@bkchr (Member) commented Jun 3, 2024

@ggwpez @skunert this is now ready for another pass by you. Then we can merge this.

@skunert skunert self-requested a review June 4, 2024 07:57
@ggwpez (Member) left a comment

My review is a bit shallow, but the fuzzer seems good.


# Prevent this from interfering with workspaces
[workspace]
members = ["."]

How does it interfere with it? Do we not want to run it from the top level?

substrate/primitives/state-machine/src/fuzzing.rs (outdated review thread, resolved)
/// If `None`, then `data` is not yet prefixed with the length.
materialized_length: Option<u32>,
/// The size of `data` in the parent transactional layer.
parent_size: Option<usize>,

Is this `None` if there is no parent layer, or if the value does not exist in the parent layer?

}
self.transaction_depth -= 1;
self.reference.commit_transaction();
ext.storage_commit_transaction().unwrap();
@ggwpez (Member) commented Jun 4, 2024

It looks like the `*_offchain` functions did not really change, but it might still be good to also test them?

@skunert (Contributor) left a comment

Double checked the logic, looks good.

@bkchr bkchr enabled auto-merge June 11, 2024 21:03
@paritytech-cicd-pr commented

The CI pipeline was cancelled due to the failure of one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6443753

@bkchr bkchr added this pull request to the merge queue Jun 11, 2024
Merged via the queue into paritytech:master with commit ad86209 Jun 11, 2024
153 of 157 checks passed
bkchr added a commit that referenced this pull request Jun 12, 2024
(The commit message repeats the PR description above.)

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: EgorPopelyaev <[email protected]>
Co-authored-by: Alexandru Vasile <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Oliver Tale-Yazdi <[email protected]>
Co-authored-by: joe petrowski <[email protected]>
Co-authored-by: Liam Aharon <[email protected]>
Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: Branislav Kontur <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Sebastian Kunert <[email protected]>
eskimor pushed a commit that referenced this pull request Jun 12, 2024

@kianenigma (Contributor) commented

Would any of the approvers or the new/old author be able to provide a TL;DR on whether this has any ramifications for FRAME code that we should be aware of? Is there anything that was previously an anti-pattern and is now possible? The original PR description seems dated to me.

@cheme (Contributor, Author) commented Jun 13, 2024

From where the PR was the last time I touched it (I don't think there were design changes), for a given key:

  • never mix append and insert operations
  • append-only operations (no reads) have linear cost
  • a read can have an associated cost (producing a vec from all the appended elements). IIRC there is a corner case where materializing the read involves shifting bytes because the SCALE-encoded length prefix changes size. Apart from that, reads should be rather fine (moving the encoded number of elements to the end of the entry could fix it; thinking about it, a VecDeque could also be used, but this is still a corner case). A small sketch of this prefix corner case follows below.
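
As an aside on that corner case: a SCALE-encoded vector is prefixed with a compact-encoded element count, so when appends push the count across a compact-width boundary the prefix grows and the rest of the buffer has to shift. Below is a minimal sketch of that effect using the `parity-scale-codec` crate; the `materialize` helper is a hypothetical illustration, not an API from this PR.

```rust
// Shows why materializing an appended SCALE vec can require shifting
// bytes: the compact length prefix sits at the front and changes width
// when the element count crosses an encoding boundary (e.g. 63 -> 64).
use parity_scale_codec::{Compact, Encode};

/// Hypothetical helper: prefix the compact-encoded element count, then
/// append the raw concatenation of the already-encoded elements.
fn materialize(elements: &[Vec<u8>]) -> Vec<u8> {
    let mut out = Compact(elements.len() as u32).encode();
    for e in elements {
        out.extend_from_slice(e);
    }
    out
}

fn main() {
    // Up to 63 elements the prefix is 1 byte; at 64 it becomes 2 bytes,
    // so an in-place buffer would need a byte inserted at the front.
    assert_eq!(Compact(63u32).encode().len(), 1);
    assert_eq!(Compact(64u32).encode().len(), 2);

    let elements = vec![vec![1u8, 2, 3]; 64];
    let blob = materialize(&elements);
    assert_eq!(blob.len(), 2 + 64 * 3);
}
```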

@ggwpez (Member) commented Jun 13, 2024

Would any of the approvers or the new/old author be able to provide a TL;DR on whether this has any ramifications for FRAME code that we should be aware of? Is there anything that was previously an anti-pattern and is now possible? The original PR description seems dated to me.

I think not in general. Having large vectors inside StorageValues is already an anti-pattern. We now just made that quicker in the append+revert case.

ordian added a commit that referenced this pull request Jun 13, 2024
* master: (29 commits)
  Append overlay optimization. (#1223)
  finalization: Skip tree route calculation if no forks present (#4721)
  Remove unncessary call remove_from_peers_set (#4742)
  add pov-recovery unit tests and support for elastic scaling (#4733)
  approval-voting: Add no shows debug information (#4726)
  Revamp the Readme of the parachain template (#4713)
  Update README.md to move the PSVM link under a "Tooling" section under the "Releases" section (#4734)
  frame/proc-macro: Refactor code for better readability (#4712)
  Contracts:  update wasmi to 0.32 (#3679)
  Backport style changes from P<>K bridge to R<>W bridge (#4732)
  New reference doc for Custom RPC V2 (#4654)
  Frame Pallets: Clean a lot of test setups (#4642)
  Fix occupied core handling (#4691)
  statement-distribution: Fix false warning (#4727)
  Update the README to include a link to the Polkadot SDK Version Manager (#4718)
  Cleanup PVF artifact by cache limit and stale time (#4662)
  Update link to a latest polkadot release (#4711)
  [CI] Delete cargo-deny config (#4677)
  fix build on MacOS: bump secp256k1 and secp256k1-sys to patched versions (#4709)
  Unify dependency aliases (#4633)
  ...
Ank4n pushed a commit that referenced this pull request Jun 14, 2024
TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
Labels
A0-needs_burnin Pull request needs to be tested on a live validator node before merge. DevOps is notified via matrix I2-bug The node fails to follow expected behavior. T0-node This PR/Issue is related to the topic “node”.
Projects
Status: Audited
Development

Successfully merging this pull request may close these issues.

Improve performance of storage::append
8 participants