-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a changes trie #67
Comments
Ideally, we'll move accounts off the relay chain eventually, so then overall validators would no longer have special information about accounts changes, only about when parachains made blocks. It's maybe faster to querty parachain nodes to find out when the parachain made blocks. Instead, you could've each account contain the tuple It's also possible to update the account without touching these for specific common events, like rewards, but then instead users track their rewards blocks by tracking the validators they nominate in a similar manor. |
Anyways, there are other reasons to have off-chain database maintained by collators but never tracked even by the parachain, or maybe tracked only by collators but never validated: As an example, we know the current trie does many dumb things, like storing data in internal nodes, radix >2, etc. and stuff like Cosmos' somewhat flat trie is much better. If you've some new storage type without non-membership proofs, then accounts could be stored at random leaves in a flat depth binary merkle tree. We then have collators maintain a seperate non-merklized data structure of which account lives at which index. This could be more efficent than anything used by other chains. As a similar optimization, collators could track account changes, either like this RFC discusses, or by storing history in this off-chain non-merklized data structure, or by having a seperate merklaized data structure for these In what threat model do we prioritize users locating their account history quickly? It's a liveness concern so yes we could definitely leave this up to collators, but this requires off-chain data structures maintained by collators in parallel to the parablocks. It's does not require relay chain changes, but does require lite client changes. |
So you'd only be able to query changes on and archive node (since full nodes would prune almost all changes up to finalisation)? |
The changes trie in itself is not very useful if you then can't read the storage, so to me it makes sense that only archive nodes keep the full changes trie. In #59 I suggest that full nodes should be able to serve the storage of the finalized block minus 16. After a quick discussion with @andresilva, it might be better to put the changes trie root in the MMR itself as a field of the leaves. |
This issue is a "pre-RFC". Writing a complete RFC would be time consuming (especially because it requires some technical knowledge about BEEFY that I'm currently lacking), so I would like to gather some feedback first.
The idea of making the changes trie asynchronous (explanations below) comes from @gavofyork.
Context
It is currently notoriously difficult to obtain for example the history of the balance of an account.
The only way to do this in a trustless way currently is to download each block of the entire chain one by one, and check if the balance has been modified by this block. Doing this is an extremely expensive operation that would likely take multiple days.
One elegant way to solve this problem is to store somewhere (explanations below) the history of the changes of each storage key. Doing this is very elegant as it automatically contains the history of everything that happened on the chain, and doesn't need any case-by-case handling by the various pallets.
In order to make it possible for light clients to access this history in a trustless way, the proposed method is to organize this history into a trie whose root hash is then made trustless. This trie is called the changes trie.
History
Substrate used to have a changes trie, but it was removed due to being too heavy on I/O operations. In other words, this was slowing down blocks production and verification too much.
In order to solve this problem, this RFC proposes to make validators construct the changes trie and then vote for it through the exact same mechanism as BEEFY votes for an MMR root.
In order to make it possible for validators to vote on a single changes trie root, this changes trie need to cover the entire history of the chain, rather than only one block like was the case for the previous changes trie.
Details
Here is a list of the changes:
concat(":current", key_of_state_trie)
and that values are set to0
(the block number where this key was last modified).concat(":current", key_of_state_trie)
and set its value toblock_number
.concat(block_number, key_of_state_trie)
to the value atconcat(":current", key_of_state_trie)
, then set the changes trie entry atconcat(":current", key_of_state_trie)
toblock_number
.concat(":current", key_of_state_trie)
.block_number
and keep only the key and hash of the subtree that they have removed. Archive nodes (as defined in Add a discovery mechanism for nodes based on their capabilities #59) do not remove the subtree from their database.In order words, the changes trie would contain a subtree of prefix
:current
, and one subtree for each block number. The subtree of prefix:current
contains the latest block where each key of the state has last been modified, and the subtree of each block number contains, for each key, the previous block number where it has been modified.This scheme makes it possible to progressively iterate down block numbers, looking up only blocks where the entry has effectively been modified.
The exact keys and encoding of values are all details that aren't important at the moment. The block number could be encoded in a way to make it possible to strip chunks of block numbers from the database entire, for example you could replace the changes of all blocks between 0 and 100000 with a single hash.
Drawbacks:
:current
subtree and at various branches, the block number where an entry has last been removed.Unresolved questions:
storage_set(key, storage_get(key))
), do we update the changes trie or not?As I'm not very familiar with BEEFY, I hope that what is suggested is possible.
Not covered by this issue is the fact that we would add a networking protocol to make it possible to query the content of the changes trie from archive nodes through the peer-to-peer network.
The text was updated successfully, but these errors were encountered: