Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Account compression: Init with root allowing trees with a canopy #6764

Merged
merged 27 commits into from
Sep 13, 2024

Conversation

StanChe
Copy link
Contributor

@StanChe StanChe commented May 27, 2024

Initializing a tree with a root

This PR modifies the account-compression program allowing the creation of the trees that were prepared offline. It allows creating trees with or without a canopy.
Verification of the validity, events, any additional checks and fields (URLs, hashes, etc.), and any additional permissions for the created tree are left to the discretion of the calling program.

The workflow

sequenceDiagram
autonumber
actor TC as Tree creator program
participant AC as account-compression
activate TC
TC ->> AC: prepare_batch_merkle_tree(tree_account, max_depth, max_buffer_size)
AC -> AC: initializes the header, but not the tree
loop For a tree with a canopy
TC ->> AC: append_canopy_nodes(tree_account, start_index, canopy_nodes)
end
TC ->> AC: init_prepared_tree_with_root(tree_account, root, rightmost_leaf, rightmost_index, proof)
AC -> AC: verifies the canopy is valid
AC -> AC: initializes the tree
Loading

The separation of a single init_with_root is based on the fact that some trees to be created might require knowing the pubkey of the tree, like with the bubblegum, where asset ids are derived from the tree pubkey and the index in that tree. Thus the user should first create the tree account and assign all the required permissions and then may take time to prepare the tree itself.

If the tree has a canopy - all canopy leaves should be initialized, otherwise the finalization will fail.

This change is inspired by #6441

@mergify mergify bot added the community Community contribution label May 27, 2024
@StanChe StanChe marked this pull request as draft May 27, 2024 10:12
@StanChe StanChe changed the title Feature/init with root Init with root allowing trees with canopy May 27, 2024
@StanChe StanChe changed the title Init with root allowing trees with canopy Init with root allowing trees with a canopy May 27, 2024
@StanChe StanChe marked this pull request as ready for review May 27, 2024 10:52
@StanChe StanChe changed the title Init with root allowing trees with a canopy Account compression: Init with root allowing trees with a canopy May 27, 2024
@joncinque joncinque requested a review from ngundotra May 28, 2024 22:12
@ngundotra
Copy link
Contributor

Will look at this in mid June when back from OOO. Is there context behind why this needs to get merged into SPL Account Compression, rather than being done in a separate fork of the program?

@StanChe
Copy link
Contributor Author

StanChe commented Jun 3, 2024

Integrating this feature into the standard library will benefit existing projects already utilizing account compression by enabling them to extend their contracts with rollups and batch creation of trees. This will reduce the number of transactions needed for tree creation, thereby alleviating network congestion. The batch creation method ensures that the tree is in the same state as it would be with individual appends, maintaining consistency. The prepared but not finalized tree will be unusable by other existing methods until it's finalized. The responsibility of validating the tree remains with the calling program, while this change guarantees that the tree initialized with a root has a valid canopy. This modification provides a tool for the batch creation of compressed accounts without imposing specific requirements on the caller and leaving the indexing up to their logic.

Copy link
Contributor

@danenbm danenbm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we should fix the stack overflow thing. Looks like this program was last deployed about 4 months ago, maybe it wasn't an issue then, but I worry it could cause runtime issues now, even if the tests pass not sure if failures would be predictable.

Recently we fixed a similar stack overflow issue that started causing test failures in mpl-core-candy-machine by simply heap-allocating a large stack variable: metaplex-foundation/mpl-core-candy-machine@9575c8b

Comment on lines 269 to 309
require_eq!(
*ctx.accounts.merkle_tree.owner,
crate::id(),
AccountCompressionError::IncorrectAccountOwner
);
let mut merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_mut_data()?;

let (header_bytes, rest) =
merkle_tree_bytes.split_at_mut(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1);
// the header should already be initialized with prepare_tree
let header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?;
header.assert_valid_authority(&ctx.accounts.authority.key())?;
let merkle_tree_size = merkle_tree_get_size(&header)?;
let (tree_bytes, canopy_bytes) = rest.split_at_mut(merkle_tree_size);
Copy link
Contributor

@danenbm danenbm Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this block seems like it could be refactored into a helper given its done the same in append_canopy_nodes and similar in prepare_tree (except for initializing header).

But this is a style thing, maybe its nicer to just have the flat structure as-is. Feel free to evaluate this comment and do whatever you think is best.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to support the existing code style as close to the original as possible. In a general case, most of the existing methods may be refactored to use that same helper method. In my opinion that may be done as a separate effort that doesn't modify the functionality and only refactores the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great tests!

@StanChe
Copy link
Contributor Author

StanChe commented Jun 10, 2024

@danenbm thanks for the feedback. Decided to move the fix of the stack overflow into a separate PR #6827 in order to simplify this one and not to mix everything up.

@github-actions github-actions bot added the stale [bot only] Added to stale content; will be closed soon label Jul 12, 2024
Copy link
Contributor

@ngundotra ngundotra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great PR overall. Great job extending existing design patterns (as bad as they are 😅).

High level requests:

  • There should be an easy way to discriminate normal trees from batch-init'd trees. When we talked previously, I suggested adding a new header version & a flag in that header.

  • What is the recourse for errors in batch-init'd trees? Which steps of the process are reversible or explicitly immutable? For the immutable steps, can we have some explicit documentation & also add helper methods to the SDK prevent immutable errors?

@github-actions github-actions bot removed the stale [bot only] Added to stale content; will be closed soon label Jul 19, 2024
@mergify mergify bot dismissed ngundotra’s stale review July 23, 2024 15:15

Pull request has been modified.

Copy link
Contributor

@ngundotra ngundotra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if I init a batch tree with prepare_tree and then attempt to append to it?

I think that another byte of padding is needed to store is_initialized for batch init'd trees. Then the tree can be checked quickly before executing replace or append. I'm open to more efficient ways of doing this - just wanted to offer my suggestion.

Again, great job with this. Almost there

Comment on lines 198 to 201
/// filled) -> `finalize_merkle_tree_with_root`. This instruction initializes the tree header
/// while leaving the tree itself uninitialized. This allows distinguishing between an empty
/// tree and a tree prepare to be initialized with a root.
pub fn prepare_tree(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this method be renamed to init_batch_merkle_tree to match the other init method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what happens if this method is called on an already-existing tree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this method be renamed to init_batch_merkle_tree to match the other init method?

If we take a look at the account structure it may be represented as | header | tree body | canopy |. The existing init_empty_merkle_tree does the full initialization - first the header is initialized, then the tree body, and finally the canopy size is checked. This makes the tree usable by the append methods.
For a batch initialized tree the actual initialization takes several steps. First the prepare_tree initializes just the header and checks the canopy size leaving the rest of the account zero'd. Then the canopy is set using append_canopy_nodes. And finally the finalize_merkle_tree_with_root initializes the tree body.
Until the finalize_merkle_tree_with_root is called the tree remains unusable by any other method - the init_empty_merkle_tree will fail because the header is initialized and any modification will fail because the tree itself is not initialized (the self.is_initialized() check will fail). Only the newly added append_canopy_nodes and finalize_merkle_tree_with_root as well as the modified close_empty_tree can operate over a tree in such a state.
That was also the reason the method naming is different to init_ as it leaves the tree in a not fully initialized state, but with only the initialized header (thus prepared).
I'm considering to rename the finalize_merkle_tree_with_root into init_prepared_tree_with_root and prepare_tree into prepare_batch_merkle_tree. This will keep the naming concise in terms the tree is fully initialized after the init_xxx methods. Will also update the comments accordingly. Wdyt?

Also, what happens if this method is called on an already-existing tree?

Adding a test to showcase this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, thanks for explaining. This makes sense. Can you add a simplified version of this comment as a docstring to the function tree_bytes_uninitialized ?

@mergify mergify bot dismissed ngundotra’s stale review July 26, 2024 12:03

Pull request has been modified.

@StanChe
Copy link
Contributor Author

StanChe commented Jul 26, 2024

Also noticed the following stack overflow error occurring on build, fixed it as well

Error: Function _ZN23spl_account_compression23spl_account_compression22init_empty_merkle_tree17hbc30209ef0f8a270E Stack offset of 4928 exceeded max offset of 4096 by 832 bytes, please minimize large stack variables

Copy link
Contributor

@ngundotra ngundotra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Almost ready to push to devnet for testing.

Two more comments:

  1. Rebase on top of master
  2. Address some small docs + typos

Once this is done, I'll add a PR that updates versions for & publishes the crates & js sdk.

merkle_tree_apply_fn_mut!(header, tree_id, tree_bytes, initialize,)
}

pub fn tree_bytes_unititialized(tree_bytes: &[u8]) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo in uninitialized

Comment on lines 198 to 201
/// filled) -> `finalize_merkle_tree_with_root`. This instruction initializes the tree header
/// while leaving the tree itself uninitialized. This allows distinguishing between an empty
/// tree and a tree prepare to be initialized with a root.
pub fn prepare_tree(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, thanks for explaining. This makes sense. Can you add a simplified version of this comment as a docstring to the function tree_bytes_uninitialized ?

@mergify mergify bot dismissed ngundotra’s stale review July 29, 2024 15:36

Pull request has been modified.

Comment on lines +264 to +266
if canopy[node_idx - 2] != EMPTY {
return canopy[node_idx - 2];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it wasn't for your extensive tests, I definitely would have thought this would be a footgun.

Converting node indexing to canopy indexing almost always throws me off

@github-actions github-actions bot added stale [bot only] Added to stale content; will be closed soon and removed stale [bot only] Added to stale content; will be closed soon labels Aug 19, 2024
@github-actions github-actions bot added the stale [bot only] Added to stale content; will be closed soon label Sep 9, 2024
@github-actions github-actions bot removed the stale [bot only] Added to stale content; will be closed soon label Sep 11, 2024
…tree_with_root as the merkle tree account will not be zeroed at that point
…ounts + updated comments on append_canopy_nodes to reflect the possibility to replace those
…initialization call in a wrapper as it started reporting a stack overflow
bumped the node version to the latest used in some other flows trying to narrow down the build issue
@StanChe
Copy link
Contributor Author

StanChe commented Sep 11, 2024

FYI, had to bump up node to 20.5 for 2 js workflows that were failing on node 16. Looks like eslint became incompatible somehow. Used 20.5 as the latest used in some other flows, 18 should work as well.

@ngundotra ngundotra merged commit b8615ab into solana-labs:master Sep 13, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Community contribution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants