[STALE - DO NOT REVIEW][Persistence] State Commitment - State Hash #152
Conversation
tl;dr @andrewnguyen22 Could you take a look at this for preliminary feedback? This is far from complete (no tests, incomplete implementation, little integration, etc.), but the direction and core of how we'll compute the state hash for the MVP is there. It's only done for... The short of it is:
This is definitely only going to be something we submit AFTER your major persistence refactor, but I was hoping you could review it and see if it looks like the right path given all the context you have (v0, refactoring persistence, etc.). Personally, I think it makes sense, but it actually turns out to be much simpler than I thought, which makes me feel like I'm missing something.
consensus/module.go (Outdated)

```diff
@@ -243,7 +243,7 @@ func (m *consensusModule) handleHotstuffMessage(msg *typesCons.HotstuffMessage)
 }

 func (m *consensusModule) AppHash() string {
-	return m.appHash
+	return m.appHash // TODO: This is a problem
 }
```
This isn't very helpful
Agreed. I should have made the comment: "reminder in this commit: do integration with consensus module."
I was hoping for more preliminary feedback on the business logic in updateStateHash.
Going to leave a comment in the main PR with more pointers.
Preliminary feedback: While I am accustomed to each write operation updating the app-hash by updating the tree as a block is processed, I love the simplicity. Basically, you've adopted an "update the actor set each block" approach, and that will affect the app-hash. I was originally skeptical of the design, but I'm beginning to appreciate it. My current reservations:
EDIT: Starting to wonder if the Merkle tree itself will enforce the ordering for (e) - unsure at this point.
tl;dr Thanks for the review. I've gained confidence that we're heading toward the right (simple AND efficient) solution after tending to these questions: a - Answered, and I think we're okay, but will add a TODO for cache optimizations.
Thank you. Let's keep thinking and discussing it. I have a good feeling about it (i.e. no allergies at the moment) but am also trying to figure out if there's a flaw due to its simplicity.
Given that we have a constant number of trees (i.e. not a dynamic variable, though the set can change), we are actually doing the following - see the screenshot below. Given that relational databases are usually implemented using B-trees (good read amplification) and key-value stores are usually implemented using LSM-trees (good write amplification), I think this actually gives us a good balance of the two. There is 100% still room for caching and optimization, so I will add a TODO or GitHub issue to investigate it. CC @okdas for a telemetry request :)
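To make the "update the actor set each block" flow concrete, here is a minimal sketch under stated assumptions: `TreeUpdate` and `applyBlockChanges` are hypothetical names invented for illustration, and only the celestiaorg/smt calls quoted below (`NewSimpleMap`, `NewSparseMerkleTree`, `Update`) are real:

```go
package main

import (
	"crypto/sha256"

	"github.com/celestiaorg/smt"
)

// TreeUpdate is a hypothetical record of one write produced while processing
// a block: the tree it belongs to plus the key/value being upserted.
type TreeUpdate struct {
	TreeName string
	Key      []byte
	Value    []byte
}

// applyBlockChanges replays every write from a block into its owning tree.
// With a constant number of trees, the per-block cost is one O(log n) tree
// update per write, plus one root read per tree at commit time.
func applyBlockChanges(trees map[string]*smt.SparseMerkleTree, updates []TreeUpdate) error {
	for _, u := range updates {
		if _, err := trees[u.TreeName].Update(u.Key, u.Value); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	trees := map[string]*smt.SparseMerkleTree{
		"validator": smt.NewSparseMerkleTree(smt.NewSimpleMap(), smt.NewSimpleMap(), sha256.New()),
	}
	updates := []TreeUpdate{{TreeName: "validator", Key: []byte("addr1"), Value: []byte("staked-actor-bytes")}}
	if err := applyBlockChanges(trees, updates); err != nil {
		panic(err)
	}
}
```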
I'm still in the process of familiarizing myself with all of the details of https://github.com/celestiaorg/smt, but the interface is implemented like this:

```go
nodeStore := smt.NewSimpleMap()
valueStore := smt.NewSimpleMap()
trees[treeType] = smt.NewSparseMerkleTree(nodeStore, valueStore, sha256.New())
```

I will confirm (while working on this commit) by reading the source code, but my understanding is that there is a behind-the-scenes optimization: the provided hasher is used for the store that computes the root, while the other one stores the values.
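As a tiny standalone sanity check of that interface (only `Update`, `Get`, and `Root` from celestiaorg/smt are used; the key and value are made up):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"

	"github.com/celestiaorg/smt"
)

func main() {
	nodeStore := smt.NewSimpleMap()  // backs the hashed tree nodes (per the unconfirmed understanding above)
	valueStore := smt.NewSimpleMap() // backs the raw leaf values
	tree := smt.NewSparseMerkleTree(nodeStore, valueStore, sha256.New())

	if _, err := tree.Update([]byte("addr1"), []byte("serialized-actor")); err != nil {
		panic(err)
	}

	// Get reads the original value back...
	val, _ := tree.Get([]byte("addr1"))
	fmt.Println("value:", string(val))

	// ...while Root() reflects the hashed node structure.
	fmt.Println("root:", hex.EncodeToString(tree.Root()))
}
```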
Acknowledged and agreed. Here is the current list of trees in the code. Note that I have added a couple for flags and params, so there is no limitation to just actors. After we agree on a "high level design" with the app, I'll probably need to copy-pasta the code over.

```go
const (
	// Actors
	AppMerkleTree MerkleTree = iota
	ValMerkleTree
	FishMerkleTree
	ServiceNodeMerkleTree
	AccountMerkleTree
	PoolMerkleTree
	// Data / state
	BlocksMerkleTree
	ParamsMerkleTree
	FlagsMerkleTree

	lastMerkleTree // Used for iteration purposes only - see https://stackoverflow.com/a/64178235/768439
)
```
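Given that enum, initialization can simply iterate up to `lastMerkleTree`; a minimal sketch, assuming the `MerkleTree` type from the const block above plus the `smt` and `crypto/sha256` imports from the earlier snippet:

```go
// newStateTrees builds one sparse Merkle tree per tree type by iterating
// the enum; lastMerkleTree is exclusive, so it never gets a tree of its own.
func newStateTrees() map[MerkleTree]*smt.SparseMerkleTree {
	trees := make(map[MerkleTree]*smt.SparseMerkleTree, int(lastMerkleTree))
	for treeType := MerkleTree(0); treeType < lastMerkleTree; treeType++ {
		trees[treeType] = smt.NewSparseMerkleTree(smt.NewSimpleMap(), smt.NewSimpleMap(), sha256.New())
	}
	return trees
}
```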
I was originally referencing the original persistence spec (see image) ad nauseam until you brought it up. Thinking about it though, each tree root is deterministic based on the schema, so the lexicographical order of each root should also be deterministic. I don't have a strong preference here and can change it to include the schema name as the ordering key, but I have a personal preference for basic lexicographic order simply because it has fewer moving pieces and provides the same guarantee.
One of the nice things about using a Sparse Merkle Tree instead of an AVL tree is that insertion order doesn't matter for tree construction. The tree is not balanced (but it is optimized in other ways that I can explain if needed); since the tree path/branch is based solely on the key, we can apply updates in any order :)
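A quick standalone check of that order-independence claim (key/value contents are arbitrary; only celestiaorg/smt calls shown earlier are used):

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"

	"github.com/celestiaorg/smt"
)

func newTree() *smt.SparseMerkleTree {
	return smt.NewSparseMerkleTree(smt.NewSimpleMap(), smt.NewSimpleMap(), sha256.New())
}

func mustUpdate(t *smt.SparseMerkleTree, key, value []byte) {
	if _, err := t.Update(key, value); err != nil {
		panic(err)
	}
}

func main() {
	kvs := [][2][]byte{
		{[]byte("addr1"), []byte("actor1")},
		{[]byte("addr2"), []byte("actor2")},
		{[]byte("addr3"), []byte("actor3")},
	}

	forward, reverse := newTree(), newTree()
	for i := range kvs {
		mustUpdate(forward, kvs[i][0], kvs[i][1])
		j := len(kvs) - 1 - i
		mustUpdate(reverse, kvs[j][0], kvs[j][1])
	}

	// The leaf path depends only on the hashed key, so insertion order
	// does not change the resulting root.
	fmt.Println("roots equal:", bytes.Equal(forward.Root(), reverse.Root()))
}
```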
@Olshansk great response :)
I don't follow this logic, because no matter the schema, the values are arbitrary bytes - meaning it's very likely the ordering of the trees is going to switch if you base the order on lexicographical sorting of the values. Instead, you should base the order on the lexicographical sorting of the schema_names. A mock example: a strict ordering keyed by schema_name, as opposed to an ordering keyed by the (arbitrary) root values.
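A sketch of that strict ordering; the schema names and the `computeStateHash` helper are invented for illustration, but the shape - sort by the constant schema_name, never by the root bytes - is the point being made:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// computeStateHash orders tree roots by their fixed schema names rather than
// by the (arbitrary) root bytes, so the ordering is identical on every node
// and at every height.
func computeStateHash(rootsBySchema map[string][]byte) string {
	names := make([]string, 0, len(rootsBySchema))
	for name := range rootsBySchema {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic: schema names are compile-time constants

	hasher := sha256.New()
	for _, name := range names {
		hasher.Write(rootsBySchema[name])
	}
	return hex.EncodeToString(hasher.Sum(nil))
}

func main() {
	// Illustration-only schema names and fake roots.
	roots := map[string][]byte{
		"application": {0xAA},
		"fisherman":   {0xFF},
		"validator":   {0x01},
	}
	fmt.Println(computeStateHash(roots))
}
```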
Thanks for the additional explanation re ordering. SGTM! 💯
The learnings/discussions from this PR will be implemented in #284.
Description
This is currently a WIP PR to compute the state hash.
Fixes #147.
Type of change
Please mark the options that are relevant.
How Has This Been Tested?
`make test_all`
Checklist