diff --git a/docs/architecture/adr-051-arbitrary-protobuf-ipld-support-scheme.md b/docs/architecture/adr-051-arbitrary-protobuf-ipld-support-scheme.md new file mode 100644 index 000000000000..1a67d6a54935 --- /dev/null +++ b/docs/architecture/adr-051-arbitrary-protobuf-ipld-support-scheme.md @@ -0,0 +1,228 @@ +# ADR-051: Arbitrary Protobuf IPLD Support Scheme + +## Changelog + +- Feb 14th, 2022: Initial Draft + +## Status + +DRAFT + +## Abstract +This ADR describes a generic IPLD content addressing scheme for the arbitrary protobuf types stored in SMT-referenced state storage. + +## Context + +Because all values stored in the state of a CosmosSDK chain are protobuf encoded, we are presented with a unique +opportunity to provide extremely rich IPLD content-typing information for the arbitrary types stored in +(referenced from) the SMT in a generic fashion. + +### SDK context + +The SDK stores values in state storage in a protobuf encoded format. Each module defines its own custom types in +.proto definitions and these types are registered with a ProtoCodec that manages the marshalling and unmarshalling of these +types to and from the binary format that is persisted to disk. As such, an indefinite/unbounded number of protobuf types +need to be supported within the Cosmos ecosystem at large. + +Rather than needing to register new content types and implement custom codecs for every Cosmos protobuf type we +wish to support as IPLD, it would be useful to have a generic means of supporting arbitrary protobuf types. This would +open the door to some interesting features and tools. +For example: a universal (and richly typed) IPLD block explorer for all/any blockchains in the Cosmos ecosystem that +doesn't require custom integrations for every type it explores and represents. + +### IPLD context + +[IPLD](https://ipld.io/docs/) stands for InterPlanetary Linked Data. +IPLD is a self-describing data model for arbitrary hash-linked data.
+IPLD enables a universal namespace for all Merkle data structures and the representation, access, and exploration of these +data structures using the same set of tools. IPLD is the data model for IPFS, Filecoin, and a growing number of other +distributed protocols. + +Some core concepts of IPLD are described below. + +### IPLD Objects + +IPLD objects are abstract representations of nodes in some hash-linked data structure. In this representation, +the hash-links that exist in the native representation of the underlying data structure +(e.g. the SHA-2-256 hashes of an SMT leaf node) are transformed into CIDs- hash-links that describe the +content they reference. + +### IPLD Blocks + +IPLD blocks are the binary encoding of an IPLD object that is persisted on disk. +They contain (or are) the binary content that is hashed +and referenced by CID. For blockchains, this binary encoding of an IPLD object will correspond to the consensus encoding +of the represented object. E.g. for Ethereum headers, the IPLD block is the RLP encoding of the header. + +For Tendermint-Cosmos blockchains, the IPLD blocks are the consensus binary encodings +of the Merkle/SMT nodes of the various Merkle trees and the values they reference. + +### CIDs + +CIDs are self-describing content-addressed identifiers for IPLD blocks. They are composed of a content-hash of an +IPLD block prefixed with bytes that identify the hashing algorithm applied on that block to produce that +hash (multihash-prefix), a byte prefix that identifies the content type of the IPLD +block (multicodec-content-type), a prefix that identifies the version of the CID itself (multicodec-version), +and a prefix that identifies the base encoding of the CID (multibase). + +`<cid> ::= <multibase><multicodec-version><multicodec-content-type><multihash-prefix><content-hash>` + +### IPLD Codecs + +IPLD blocks are converted to and from in-memory IPLD objects using IPLD codecs, which marshal/unmarshal an IPLD object +to/from the binary IPLD block encoding.
Just as `multihash-content-address` maps to a specific hash function or +algorithm, `multicodec-content-type` maps to a specific codec for encoding and decoding the content. These mappings are +maintained in a codec registry. These codecs contain custom logic that resolves native hash-links to CIDs. + +## Decision + +We need to define three new IPLD `multicodec-content-type`s and implement their codecs +in order to support rich content-typing of arbitrary protobuf types. + +### New IPLD content types + +The three new content types are described below; codecs for these types will need to be implemented. For Go, these +implementations will work with [go-ipld-prime](https://github.com/ipld/go-ipld-prime) and will be added to the existing +codecs found in https://github.com/vulcanize/go-codec-dagcosmos. Eventually we may also wish to implement these codecs +in JS/TS (as https://github.com/vulcanize/ts-dag-eth is to https://github.com/vulcanize/go-codec-dageth). + +### Self-Describing Protobuf Multicodec-Content-Type + +Define a new `multicodec-content-type` for a +[self-describing protobuf message](https://developers.google.com/protocol-buffers/docs/techniques?authuser=2#self-description): +`self-describing-protobuf`. This `multicodec-content-type` will be used to create CID references to the protobuf encodings of such +self-describing messages. + +CIDv1 for a `self-describing-protobuf` block: + +`<multibase><multicodec-version><self-describing-protobuf><multihash-prefix><content-hash>` + +### Typed-Protobuf Multicodec-Content-Type + +Define a new `multicodec-content-type`- `typed-protobuf`- that specifies that the first 32 bytes of the hash-linked IPLD +block are an SHA-2-256 hash-link to a self-describing protobuf message (to a `self-describing-protobuf` IPLD block, as described above) +that represents the contents of the .proto file that compiles into the protobuf message type that the remaining bytes of the +hash-linked IPLD object can be unmarshalled into.
+ +In this `multicodec-content-type` specification we also stipulate that the content-hash referencing the IPLD block +only includes as digest input the protobuf encoded bytes for the storage value, whereas the `self-describing-protobuf` hash-link prefix +is excluded. +This is a significant deviation from previous IPLD codecs, as it means the content-hash is not a hash of *all* of the +content it references, but is necessary to maintain the native consensus hashes and hash-to-content mappings. + +In other words, the `content-hash` in a `typed-protobuf` CID will be +`hash(<referenced-protobuf-encoded-value>)` (when using the SHA-2-256 multihash type, this matches the hash we see in the Cosmos SMT) +instead of `hash(<self-describing-protobuf-hash-link> + <referenced-protobuf-encoded-value>)`. + +Otherwise `content-hash` would not match the hashes we see natively in the Tendermint+Cosmos Merkle DAG, and we +would not be able to directly derive CIDs from them. + +CIDv1 for a `typed-protobuf` block: + +`<multibase><multicodec-version><typed-protobuf><multihash-prefix><content-hash>` + +Another major deviation this necessitates is the requirement of an IPLD retrieval, unmarshalling, and protobuf +compilation step in order to fully unmarshal the `referenced-protobuf-encoded-value` stored in a `typed-protobuf` block. + +The algorithm will look like: + +1. Fetch the `typed-protobuf` block binary using a `typed-protobuf` CID +2. Decode the `self-describing-protobuf` CID from the block's hash-link prefix +3. Use that `self-describing-protobuf` CID to fetch the `self-describing-protobuf` block binary +4. Decode that binary into a self-describing protobuf message +5. Use that self-describing protobuf message to create and compile the canonical proto message for the `referenced-protobuf-encoded-value` +6. Decode the `referenced-protobuf-encoded-value` binary into the proto message type compiled in step 5 + +### Protobuf-SMT Multicodec-Content-Type + +Define a new `multicodec-content-type`- `protobuf-smt`- for SMT nodes wherein the IPLD object representation of a leaf node converts +the canonical `content-hash` (i.e.
`SHA-2-256(referenced-protobuf-encoded-value)`) into a `typed-protobuf` CID, as +described above. Intermediate node representation is no different from standard SMT representation. The codec can +make use of the existing byte prefix to differentiate between a leaf and intermediate node, so we do not need a separate +codec for intermediate and leaf nodes. + +### Additional work +#### IPLD aware protobuf compiler + +For simple protobuf definitions which have no external dependencies on other protobuf definitions, this work will not +be necessary- a single standalone self-describing protobuf message will be enough to generate and compile the protobuf +types necessary for unpacking an arbitrary protobuf value. On the other hand, if we have more complex protobuf +definitions with external dependencies (that we cannot inline) we need some way of resolving these dependencies within +the context of the IPLD codec performing the object unmarshalling. To this end, we propose to create an IPLD +aware/compatible protobuf compiler that is capable of rendering and resolving dependency trees as IPLD. + +This could be written as a plugin for an existing protobuf compiler that allows .proto files to import other proto packages +using `self-describing-protobuf` CIDs as import identifiers. In this way, protobuf dependency trees could be represented +as IPLD DAGs and IPFS could be used as a hash-linked registry for all protobuf types. + +Further specification and discussion of this is, perhaps, outside the scope of the SDK. + +#### IPLD middleware for the SDK + +In order to leverage this model for a Cosmos blockchain, features need to be introduced into the SDK (or as auxiliary +services) for + +1. Generating the self-describing messages from the .proto definitions of state storage types in modules +2. Publishing or exposing these as `self-describing-protobuf` IPLD blocks +3.
Mapping state storage objects- as they are streamed out using the features in ADR-038 and ADR-XXX- to their +respective `self-describing-protobuf` blocks + +The above process is very similar, at a high level, to the ORM work that has already been done in the SDK. + +These steps will be discussed further in a subsequent ADR (referred to as ADR-YYY for now). + +## Consequences + +There are no direct consequences on the other components of the SDK, as everything discussed here is entirely optional. +In fact, at this stage, everything is theoretical and only exists as an abstract data model with no integration into +the SDK. This model does not impose or require any changes on/to Cosmos blockchain state; it is an abstract representation +of that state which can be materialized in external systems (such as IPFS). + +Nonetheless, there are consequences for defining and attempting to standardize a new abstract data model for Cosmos state. +The approval of this model should only occur once it has been determined to a satisfactory degree that it is the best +available model for representing arbitrary Cosmos state as IPLD. + +In order to introduce generic support for arbitrary protobuf types in state storage, the approach proposed here deviates +from previous IPLD codecs and content-hash-to-content standards. For better or worse, this could set new precedents for +IPLD that need to be considered within the context of the greater IPLD ecosystem. We propose that this work be seen as +extensions to the standard CID and IPLD concepts and believe that these types of deviations would provide improved +flexibility/generalizability of the IPLD model in other contexts as well. + +### Backwards Compatibility + +No backwards incompatibilities. + +### Positive + +Defines a generic IPLD model for arbitrary Cosmos state. This model will enable universal IPLD integration for any and +all "canonical" Cosmos blockchains (i.e.
if they don't use SMT or don't require Protobuf encoding of state values, +this falls apart), improving interoperability with various protocols that leverage IPLD. The concrete implementation of +the tools to (optionally) integrate and leverage this model within the SDK will be proposed and discussed in a later ADR. + +### Negative + +Code and documentation bloat/bandwidth. + +Because of the deviations we make from the current historical precedent for IPLD, we suspect upstreaming registration +and support of these codecs into Protocol Labs repositories will be a complicated process. + +### Neutral + +Nothing comes to mind. + +## Further Discussions + +We need to complete the proposals for ADR-XXX (SMT Node State Streaming/Listening Features) and ADR-YYY (IPLD Middleware) +to provide all the necessary context for this ADR. + +## Test Cases + +None in the SDK; there will be encoding/decoding tests for the IPLD codecs in the codec repo linked below. + +## References + +* Existing IPLD codec implementations for Tendermint and Cosmos data structures: https://github.com/vulcanize/go-codec-dagcosmos +* Existing IPLD Spec/Schemas for Tendermint and Cosmos data structures: https://github.com/vulcanize/ipld/tree/cosmos_specs/specs/codecs/dag-cosmos +* Tendermint and Cosmos IPLD Schemas discussion: https://github.com/cosmos/cosmos-sdk/discussions/9505 +* First mention of supporting the arbitrary protobuf types in IPLD: https://github.com/cosmos/cosmos-sdk/issues/7097#issuecomment-742752603 +* Issue describing various options (including this one) for supporting Cosmos protobuf types as IPLD: https://github.com/vulcanize/go-codec-dagcosmos/issues/23 diff --git a/docs/architecture/adr-052-state-commitment-listening.md b/docs/architecture/adr-052-state-commitment-listening.md new file mode 100644 index 000000000000..f4e3a979acf7 --- /dev/null +++ b/docs/architecture/adr-052-state-commitment-listening.md @@ -0,0 +1,224 @@ +# ADR-052: State Commitment Listening + +## Changelog +
+- March 14th, 2022: Initial Draft + +## Status + +DRAFT + +## Abstract + +This ADR describes the features necessary to enable real-time streaming of updated state commitment (SMT) nodes to an +external data consumer. + +## Context + +ADR-038 introduced features that enable listening to state changes of individual KVStores by emitting key-value pairs +as they are updated. For many applications this is a sufficient or even preferable method for listening to state changes, +as it allows for the selective listening of updated KVPairs in specific modules and does not impose the additional overhead +necessary to stream all the updates at the state commitment layer. But for applications that wish to extract the entire Merkle +data structure from a CosmosSDK blockchain in order to retain the provability of state, we need additional features +for listening to all the updates at the state commitment layer. + +### Eventual state +The complete state as it exists at a given height `n`: the entire SMT and all the values it references. + +### Differential state +The subset of the state at `n` that includes only the nodes and values that were updated during the transition from +`n-1` to `n`- aka a "statediff". + +Cumulatively, all the statediffs from chain genesis to the current block `n` will materialize the +eventual state at `n` as well as all historical eventual states below `n`. Additionally, these difference sets can be +(relatively quickly) iterated to produce useful indexes and relations around and between tree nodes/values and other +Tendermint/Cosmos data structures. + +## Decision + +There are at least four potential approaches for extracting the entire state + state commitment data structure from a +CosmosSDK blockchain to an external destination. All four are discussed below with their pros and cons as well as +an explanation of the work that remains to be done to support each of these approaches.
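The eventual/differential state definitions above can be sketched with a hypothetical statediff object. All type and field names below are assumptions for illustration (in Go); a real implementation would carry the actual SMT node encodings.

```go
package main

import "fmt"

// NodeSet is a stand-in for a hash(node) => node (or hash(key) => value)
// bucket; [32]byte models a SHA-2-256 hash.
type NodeSet map[[32]byte][]byte

// StateDiff is a hypothetical statediff object: the SMT nodes and referenced
// values updated in the transition from height N-1 to N.
type StateDiff struct {
	Height        int64
	UpdatedNodes  NodeSet // nodes new or changed at height N
	UpdatedValues NodeSet // protobuf-encoded values written at height N
}

// Apply folds a statediff's nodes into an eventual-state accumulator:
// cumulatively applying every diff from genesis materializes the full
// node set at height N (values are handled analogously).
func Apply(state NodeSet, diff StateDiff) {
	for h, n := range diff.UpdatedNodes {
		state[h] = n
	}
}

func main() {
	state := NodeSet{}
	d := StateDiff{Height: 1, UpdatedNodes: NodeSet{{1}: []byte("leaf")}}
	Apply(state, d)
	fmt.Println(len(state)) // 1
}
```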
+ +### Replay of KVPairs +One way the entire Merkle data structure can be extracted to an external system would be to replay all the KVPairs +listened to using the features implemented in ADR-038. If we listen to every KVPair emitted from every module in this +manner and then reinsert them into an external SMT, the SMT can be recapitulated in full. + +Pros: +* No additional changes needed within the SDK. + +Cons: +* Requires reinserting every KVPair into an external SMT and re-materializing the entire state commitment data structure, +duplicating work that already occurs within the Cosmos blockchain. +* KVPairs must be replayed in the correct order. +* If at any point any single KVPair is missed, a significant number of the external SMT nodes materialized in this manner (including the root) +would not match the canonical nodes, and it would be extremely difficult to identify which KVPair is missing in order to repair +the SMT. +* If we want to build a path index around these external SMT nodes, we require an additional service or another +implementation of the SMT that enables the mapping of this path information during or after external SMT node materialization. +* Does not provide a means to horizontally scale the extraction and processing of historical state: to extract historical +state with this approach, we must sync a node from genesis and replay all KVPairs in order, which cannot be readily parallelized. + +We believe this approach is inadequate from a performance perspective, due to the need to replay every KVPair +insert/update/delete and re-materialize SMT nodes that have already been materialized in the SMT that backs the blockchain, +and from a feature perspective, as it does not provide a direct path forward for generating a path index around all +state commitment nodes. Additionally, it does not provide a parallelizable means of extracting historical state +commitment data.
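For illustration, the replay approach reduces to the following sketch. The `ExternalSMT` interface and all names are hypothetical; a real replay would target a concrete SMT library, and, as noted above, correctness hinges entirely on preserving the emission order.

```go
package main

import "fmt"

// KVPair models the pairs emitted by the ADR-038 listeners (field names
// illustrative, not the SDK's actual proto type).
type KVPair struct {
	StoreKey   string
	Key, Value []byte
	Delete     bool
}

// ExternalSMT is a stand-in for an external sparse Merkle tree implementation.
type ExternalSMT interface {
	Update(key, value []byte) error
	DeleteKey(key []byte) error
}

// Replay reinserts listened pairs into the external SMT. Pairs must be applied
// in emission order: a missed or reordered pair silently corrupts every node
// on the affected paths, including the root.
func Replay(smt ExternalSMT, pairs []KVPair) error {
	for _, p := range pairs {
		var err error
		if p.Delete {
			err = smt.DeleteKey(p.Key)
		} else {
			err = smt.Update(p.Key, p.Value)
		}
		if err != nil {
			return err
		}
	}
	return nil
}

// mapSMT trivially satisfies ExternalSMT for the demo (no Merkleization).
type mapSMT map[string][]byte

func (m mapSMT) Update(k, v []byte) error  { m[string(k)] = v; return nil }
func (m mapSMT) DeleteKey(k []byte) error  { delete(m, string(k)); return nil }

func main() {
	m := mapSMT{}
	_ = Replay(m, []KVPair{
		{Key: []byte("a"), Value: []byte("1")},
		{Key: []byte("a"), Delete: true},
	})
	fmt.Println(len(m)) // 0
}
```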
+ +### Database versioned snapshot iteration +We can, in theory, leverage the versioned snapshot architecture of the databases (Badger and RocksDB) underpinning the +SMT to extract the updated nodes at a specific block. The SDK uses Badger and RocksDB transactions as +the `MapStore` interface that the SMT implementation writes to. These transactions create a database with versioned snapshots; +these snapshots will contain (reference) all the SMT nodes (`hash(node) => node` kv mappings) that exist in the SMT at that +height. These snapshots may also include all the other data written to disk, e.g. the B1 and B2 buckets maintained at the +SDK layer and the `hash(key) => value` mapping maintained by the SMT. + +If we can devise an efficient way of extracting only the node information from the snapshot +(e.g. stipulate a prefix for this keyspace) then we could extract all the nodes at height `n` and `n-1`, find their intersection, +and remove this intersection from the nodes at `n` to produce the difference set at `n`. + +Pros: +* No additional changes needed within the SDK. +* Provides a means to generate statediff objects at any arbitrary height. +* Can horizontally scale processing of historical data using this approach by creating snapshots of the database, +fs overlays of these snapshots, and iterating the state in parallel across separate processes (across separate block ranges). +* Requires fewer DB round-trips compared to difference iteration approach (below). + +Cons: +* In order to generate the difference set, we need access to state at both `n` and `n-1`. +* Depends on the database implementation underpinning the SMT to support iterable versioned snapshots. +* Need to update the SMT implementation to introduce keyspace prefixing for the `hash(node) => node` bucket. +* Performance is likely prohibitive. The sets of nodes at `n` and `n-1` will be *very* large; loading these sets +and finding their intersection will be expensive.
Time complexity of finding intersection: `O(x+y)` where `x` +is the number of nodes at `n` and `y` is the number of nodes at `n-1`. Memory intensive- ideally we would load the entire node +sets at `n` and `n-1` into memory in order to find their intersection. +* We are missing the path context provided by generating this difference set during a tree difference iteration- if we +want to build a path index around the SMT nodes, we require an additional process. + +This approach is insufficient for any scenario where we want to extract and maintain path information for the SMT +nodes, as this context is missing when iterating the flat database keyspace. Time complexity is linear with respect +to the number of nodes in the SMT. For these reasons, we propose [SMT difference iteration](#smt-difference-iteration) +for the extraction of historical state commitment data. + +### SMT difference iteration +Another approach for extracting the entire state commitment data structure to an external system would be to implement +an SMT node difference iterator. This approach would be agnostic towards the backing database. +The difference iterator is a tree iterator that simultaneously iterates the trees at height `n` and `n-1` in +order to traverse only the nodes which differ between the two trees. In this manner, it produces and emits a "statediff" +object for the SMT at height `n`. + +At a high level this approach has three steps: +1. Implement a basic iterator for the SMT. +2. Implement a difference iterator for the SMT, using the basic iterator. +3. Implement a standalone statediffing service that uses the above difference iterator to generate and process statediffs. + +Pros: +* Underlying DB agnostic. +* Does not require any changes to the underlying SMT implementation. +* More performant than replaying KVPairs to maintain an external materialization of the SMT. +* Time complexity is linear with respect to the number of nodes in the *differential state* of the SMT, not the entire +SMT.
+* Provides a means to generate statediff objects at any arbitrary height. +* Can horizontally scale processing of historical data using this approach by creating snapshots of the database, +fs overlays of these snapshots, and iterating the state in parallel across separate processes (across separate block ranges). +* The difference iterator is aware of the path of the SMT nodes during iteration. This allows it to generate an index around +SMT node path; this path association is useful for efficient generation +of proofs (enables us to select all the nodes along a path from the root to a specific child with a single query, rather +than having to iterate down the tree using multiple database lookups). +* Additional middleware/hooks can be plugged into the difference iterator and/or statediffing service in order to generate useful +indexes or associate additional metadata with the nodes (e.g. IPLD Middleware, to-be proposed in a following ADR, could associate +the appropriate multicodec types with the processed nodes and values). + +Cons: +* Requires the implementation of an SMT node difference iterator and a statediffing service that uses it. +* In order to generate the difference set, we need access to state at both `n` and `n-1`. +* Because neither Badger nor RocksDB supports concurrent read access from multiple system processes, if using this approach +to process real-time data at the head of the chain we would need to run this difference iteration from within the context +of the CosmosSDK blockchain system process. +* Requires many round-trips to the DB (to iterate the difference set). + +This approach enables mapping node paths to the nodes we extract and enables the parallelizable processing of historical state +in a manner that scales linearly with respect to the number of nodes in a difference set. We propose this approach for +historical state commitment processing while using the below approach for real-time processing at the head of the chain.
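A minimal sketch of the difference iterator's pruning logic, using a toy binary Merkle node rather than the real SMT types (all names are illustrative; a real iterator would resolve children from the node store by hash and also emit deletions):

```go
package main

import "fmt"

// Node is a toy binary Merkle node standing in for an SMT node.
type Node struct {
	Hash        string
	Left, Right *Node
}

// DiffIterate walks the trees at heights n-1 (prev) and n (curr) in lockstep,
// visiting only the subtrees of curr whose root hashes differ from prev.
// Because identical subtrees are pruned immediately, traversal cost is linear
// in the size of the difference set rather than the whole tree, and the
// iterator knows each visited node's path as a by-product.
func DiffIterate(prev, curr *Node, visit func(path string, n *Node)) {
	diffWalk("", prev, curr, visit)
}

func diffWalk(path string, prev, curr *Node, visit func(string, *Node)) {
	if curr == nil {
		return // subtree removed; a full implementation would emit deletions
	}
	if prev != nil && prev.Hash == curr.Hash {
		return // identical subtree: prune, nothing below can differ
	}
	visit(path, curr)
	var pl, pr *Node
	if prev != nil {
		pl, pr = prev.Left, prev.Right
	}
	diffWalk(path+"0", pl, curr.Left, visit)
	diffWalk(path+"1", pr, curr.Right, visit)
}

func main() {
	shared := &Node{Hash: "s"}
	prev := &Node{Hash: "root@n-1", Left: shared, Right: &Node{Hash: "a"}}
	curr := &Node{Hash: "root@n", Left: shared, Right: &Node{Hash: "b"}}
	DiffIterate(prev, curr, func(p string, n *Node) {
		fmt.Printf("path=%q hash=%s\n", p, n.Hash)
	})
	// the shared left subtree is pruned; only the changed root and right leaf are visited
}
```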
+ +### SMT cache/commit cycles with node flushing capabilities +We can implement a cache/commit wrapper around the existing SMT implementation that allows us to flush updated nodes +(and values) from the SMT at the end of every block to produce the difference set for that block. + +To do so we need to: +1. Update the SMT interface with a new method, `Commit() error`. +2. Create a cache/commit wrapper for the SMT. +3. Create an SMT constructor that allows us to pass a listening channel to the cache/commit wrapper. +4. Tie this listening channel into the SDK in a capacity analogous to the plugin-based `StreamingService` introduced in ADR-038. + +Pros: +* Most performant approach. Not only is it the most performant approach for extracting all SMT nodes from a CosmosSDK +blockchain in real-time- because it requires no additional tree iteration or node materialization- but the +cache/commit feature necessary to realize this approach may also reduce the number of disk operations the SMT must perform in its +role in the SDK and provide a general performance improvement to the SMT based `MultiStore`. +* Additional middleware can be wrapped around the channels used to flush the cached SMT nodes in order to generate useful +indexes or associate additional metadata with the nodes (e.g. IPLD Middleware, to-be proposed in a following ADR, could associate +the appropriate multicodec types with the streamed nodes and values). + +Cons: +* Requires changes to the underlying SMT interface. +* Requires changes to the utilization pattern of the SMT in the SDK. +* The cache/commit feature introduces additional memory overhead and complexity for the SMT. +* The flush feature introduces additional overhead and complexity for the SMT.
+* If we wish to associate node path with the SMT nodes while they are streamed out in this capacity, additional implementation +complexity and overhead is introduced into the SMT (we can no longer get away with using a simple wrapper around the existing SMT +implementation). +* Does not provide a means to horizontally scale the extraction and processing of historical state: to extract historical +state with this approach, we must sync a node from genesis and listen to all SMT node emissions. + +We believe this is the best approach for extracting the full SMT data structure at the head of the chain in real-time, in +combination with [SMT difference iteration](#smt-difference-iteration) for historical data. + +## Consequences + +### SMT difference iteration +No direct impact on the SDK, as this approach will be introduced as an entirely optional auxiliary service and does not +require changing the SMT interface or implementation. + +#### Backwards Compatibility +Does not impact backwards compatibility as no changes are made to the SDK directly. + +#### Positive +Services capable of extracting the entire historical SMT state to an external destination. This enables an external +system to provide proofs for all Cosmos state. + +#### Negative + +#### Neutral + +### SMT cache/commit cycles with node flushing capabilities +Requires updating the SMT interface used by the SDK, and updating the SDK to use this updated interface. A single +`Commit() error` method needs to be added, but none of the existing methods are altered. Since the `MultiStore` already +operates in a cache/commit cycle, tying this commit interface into the SDK will not be very intrusive. Similarly, we can +reuse the existing plugin-based `StreamingService` framework introduced in ADR-038 for wiring the SMT cache/commit listener +into the SDK.
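A rough sketch of the proposed cache/commit wrapper and its listener channel. All names and interfaces here are illustrative assumptions (the actual prototype is the POC linked in the references): writes accumulate in a cache, and `Commit()` flushes them to the backing store while streaming the block's difference set to the listener.

```go
package main

import "fmt"

// MapStore models the minimal write interface the SMT uses (method set
// illustrative, not the real smt.MapStore).
type MapStore interface {
	Set(key, value []byte) error
}

// CachedNode pairs a node key (hash) with its serialized form, as flushed on
// Commit.
type CachedNode struct {
	Key, Value []byte
}

// CacheCommitStore sketches the proposed wrapper around the SMT's backing
// store: Set buffers writes, Commit flushes and streams them.
type CacheCommitStore struct {
	backing  MapStore
	cache    []CachedNode
	listener chan<- CachedNode
}

func NewCacheCommitStore(backing MapStore, listener chan<- CachedNode) *CacheCommitStore {
	return &CacheCommitStore{backing: backing, listener: listener}
}

func (s *CacheCommitStore) Set(key, value []byte) error {
	s.cache = append(s.cache, CachedNode{Key: key, Value: value})
	return nil
}

// Commit models the proposed `Commit() error` extension: it writes the cached
// nodes through to the backing store and emits each updated node on the
// listener channel, producing the block's difference set.
func (s *CacheCommitStore) Commit() error {
	for _, n := range s.cache {
		if err := s.backing.Set(n.Key, n.Value); err != nil {
			return err
		}
		if s.listener != nil {
			s.listener <- n
		}
	}
	s.cache = s.cache[:0]
	return nil
}

// memStore trivially satisfies MapStore for the demo.
type memStore map[string][]byte

func (m memStore) Set(k, v []byte) error { m[string(k)] = v; return nil }

func main() {
	ch := make(chan CachedNode, 8)
	s := NewCacheCommitStore(memStore{}, ch)
	_ = s.Set([]byte("h1"), []byte("node1"))
	_ = s.Commit()
	fmt.Println(len(ch)) // 1
}
```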
+ +#### Backwards Compatibility +Updating the SMT interface to support a flushable cache/commit cycle at the SMT level does not break backwards compatibility +since the rest of the SMT interface is unchanged (i.e. we can still use the existing SMT access pattern). + +#### Positive +Services capable of streaming the entire SMT state in real-time to an external destination. This enables an external +system to provide proofs for all Cosmos state. + +#### Negative + +#### Neutral + +## Further Discussions + +TODO + +## Test Cases [optional] + +TODO + +## References + +* POC SMT cache/commit listener wrapper: https://github.com/vulcanize/smt/blob/cache_listener_wrap/cache_listening_mapstore.go