Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute packed leaf values on demand (Option 1) #19

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

macladson
Copy link
Member

@macladson macladson commented Jul 13, 2023

Addresses #10

Keeps the hash field and computes the values field on demand.

Currently a WIP awaiting comparisons to keeping just the values field and computing hash on demand.

Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good and getting close to ready I think. I just found one more small optimisation related to the removal of the hash from PackedLeaf::update.

src/tree.rs Outdated Show resolved Hide resolved
src/tree.rs Outdated Show resolved Hide resolved
src/tree.rs Outdated Show resolved Hide resolved
src/packed_leaf.rs Outdated Show resolved Hide resolved
@michaelsproul
Copy link
Member

michaelsproul commented Aug 8, 2023

I did some CPU benchmarking of this branch vs main vs #18.

The results are a bit frustrating.

Slot processing and block processing are both around 1% faster with this branch, but tree hashing seems to be a bit slower (~10%). Full data here: https://docs.google.com/spreadsheets/d/1IbgU8wvLeC1XzDb3tQU-QQM0htHlyjoYlbVnG-kH8Dw/edit?usp=sharing.

The Lighthouse commit used was sigp/lighthouse@bba1526.

The command used was lcli, compiled as so:

cargo build --profile maxperf --bin lcli --features modern,jemalloc

The benchmarking command used was:

RUST_LOG=debug ./target/maxperf/lcli transition-blocks --pre-state-path ~/Programming/lighthouse/state_6300863.ssz --block-path ~/Programming/lighthouse/block_6300864.ssz --post-state-output-path post_state.ssz --exclude-cache-builds --runs 200 2> stock2.log

The log files were then parsed using rg and sed to produce the data in the table.

I had a go at making some more optimisations in this commit, which is dedup3 in the data: b445f24. It seemed to make things worse.

--

If we can demonstrate a clear memory benefit I think we should still go ahead with this change. That might just be a bit trickier to establish. It really seems like this branch should be better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants