Removal of Marlowe Runtime's intermediate Chain Indexer #768
-
The node already supports UTxO queries. Why do we need a new follower for the UTxO index?
-
I don't think we have concrete evidence of this.
-
Overall, I think we should take a product- and evidence-based approach to this and reframe the question more radically. Here are a couple of alternatives for consideration.
The limited user-derived evidence we have mostly indicates that users prefer the last item above.
-
The most complex example of a helper script is the Charli3 oracle bridge for Marlowe. It requires information about
Should the revised indexing solution support such queries, or would deployers of such contracts need to use other indexing solutions?
-
Given the ever-larger resource footprint of
-
Overview
This discussion aims to establish a plan to remove marlowe-chain-sync and marlowe-chain-indexer from the Marlowe Runtime. These components account for a significant portion of resource consumption by the Runtime, limiting its scalability and impairing its performance. By analyzing the consumption patterns of downstream components, we can eliminate the need for these components without compromising the architectural flexibility they provide.
Benefits provided
The main benefit provided by the chain index is flexibility. Without it, components like marlowe-sync and marlowe-tx would need to communicate directly with a node to index the information they need. This can take quite a long time when starting from genesis, and if these components change in such a way that they require new information, the whole process must start from scratch. By acting as a runtime-wide cache of block and transaction data that is comparatively inexpensive to traverse, the chain index makes this far less expensive.
This was very valuable in the early stages of Runtime development because these sorts of changes happened quite frequently. However, as the Runtime matures and stabilizes, they happen less often. It is also worth noting that this problem is still present if the chain index doesn't contain information that we will need at some point, notably scripts.
Costs
The cost of maintaining the chain index is high. It demands significant disk space (for mainnet, this is on the order of hundreds of gigabytes) from the host environment, which hurts scalability. Query performance can also be problematic due to the size of the databases. Both of these problems result from keeping far more data than the Runtime needs to function as a hedge against new information being required in the future. As stability and confidence in data requirements grow, these costs outweigh the benefits. The requirements proposed here show that we have reached the point where this tradeoff is no longer worth the cost.
What do we actually need?
There are two types of information we need from the chain: historical and current (i.e. UTxO). The scopes of these two categories are quite different. Historical information is only needed by marlowe-sync and its scope is limited to Marlowe contract and payout withdrawal transactions. UTxO information is needed by marlowe-tx to build transactions.
Some key observations to make here are:
marlowe-tx only
We can leverage these observations to find an optimal solution:
marlowe-indexer)
Indexing history
Not much needs to change from how this currently happens in marlowe-indexer. We can switch to indexing this information directly from a chain sync follower. In future, when we add additional information, we will need to reindex from genesis, but this should happen infrequently enough to be worthwhile.
Indexing the UTxO
The UTxO index can be built by a new chain follower component for marlowe-tx. The UTxO index only needs to map certain lookup keys to TxIns (in a format that can be rolled back up to k blocks). It does not need to maintain the full output data because if we have the corresponding TxIds, it is efficient to query directly from the node.
The lookups we need to maintain are:
Given these lookups, we can build the UTxO required for:
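To make the shape of the proposed UTxO index more concrete, here is a minimal sketch of a key-to-TxIn index that can be rolled back up to k blocks. It is only an illustration, written in Python rather than the Runtime's Haskell. All names in it (UtxoIndex, Block, TxIn, the string lookup keys, the slot-based rollback point) are hypothetical and are not taken from the Runtime codebase. It stores only output references, on the assumption stated above that full output data can be fetched from the node when needed.

```python
from collections import deque
from dataclasses import dataclass
from typing import NamedTuple


class TxIn(NamedTuple):
    """Reference to a transaction output (hypothetical representation)."""
    tx_id: str
    index: int


@dataclass
class Block:
    """The parts of a block the index cares about (hypothetical shape)."""
    slot: int
    produced: list  # [(lookup_key, TxIn)] outputs created in this block
    consumed: list  # [TxIn] outputs spent in this block


class UtxoIndex:
    """Maps lookup keys to unspent TxIns and can undo the last k blocks."""

    def __init__(self, k):
        self.k = k
        self.by_key = {}          # lookup key -> set of TxIn
        self.keys_of = {}         # TxIn -> set of lookup keys
        self.undo = deque()       # (slot, added, removed) per retained block
        self.pruned_slot = None   # slot of the newest block we can no longer undo

    def _add(self, key, txin):
        self.by_key.setdefault(key, set()).add(txin)
        self.keys_of.setdefault(txin, set()).add(key)

    def _remove(self, key, txin):
        self.by_key.get(key, set()).discard(txin)
        self.keys_of.get(txin, set()).discard(key)

    def roll_forward(self, block):
        added, removed = [], []
        for key, txin in block.produced:
            self._add(key, txin)
            added.append((key, txin))
        for txin in block.consumed:
            # Spends of outputs that were never indexed are simply ignored.
            for key in sorted(self.keys_of.get(txin, ())):
                self._remove(key, txin)
                removed.append((key, txin))
        self.undo.append((block.slot, added, removed))
        if len(self.undo) > self.k:
            # Blocks deeper than k are treated as final; drop their undo data.
            self.pruned_slot = self.undo.popleft()[0]

    def roll_backward(self, slot):
        """Undo every retained block whose slot is later than `slot`."""
        if self.pruned_slot is not None and slot < self.pruned_slot:
            raise ValueError("rollback deeper than k blocks")
        while self.undo and self.undo[-1][0] > slot:
            _, added, removed = self.undo.pop()
            for key, txin in removed:
                self._add(key, txin)
            for key, txin in added:
                self._remove(key, txin)

    def lookup(self, key):
        return set(self.by_key.get(key, ()))
```

A chain follower would call roll_forward for each new block and roll_backward when the node reports a rollback; marlowe-tx would call lookup and then fetch the full outputs for the returned TxIns from the node. Keeping a per-block undo log, rather than snapshots, keeps the retained state proportional to the activity in the last k blocks.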