Metered Weights in the Polkadot-SDK #49

Open
shawntabrizi opened this issue Nov 16, 2023 · 3 comments

@shawntabrizi
Member

shawntabrizi commented Nov 16, 2023

Before creating a full RFC, I want to start a discussion on a potential direction around improving Weights and Benchmarks in the Polkadot SDK.

Problems to solve

  • Weights / Benchmarking are identified as one of the more complex parts of using the Polkadot SDK.
  • Keeping the Polkadot-SDK as general as possible for the runtime, allowing other frameworks to be created.
  • Improving safety and performance of the Polkadot SDK execution environment.

High Level Ideas

The Polkadot-SDK runtime should take a "step backwards", and introduce weight metering as the base level of execution limiting, rather than the pre-measured weight system that exists today.

Weight Metering is compatible with pre-measured weights, but not vice-versa.

One of the goals of the Polkadot-SDK is to be as general as possible, and to allow for customization at each level of the stack, especially the runtime. As I understand it, we have chosen a system where executing a block in the runtime requires knowing the weight of that block ahead of time. This appears to be less flexible than using execution metering.

For example, assuming a system where we did execution metering, the runtime could bypass the metering system and directly inject the weights that it knows are correct for a given execution. However, with pre-measured weights, we have no flexibility to implement a metering system within a custom runtime framework.
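To make the asymmetry concrete, here is a minimal sketch (hypothetical types and names, not an existing Polkadot-SDK API): a metered executor can emulate pre-measured weights by charging the known amount in a single step, while a pre-measured-only executor has no hook for charging incrementally as execution proceeds.

```rust
/// Hypothetical weight meter for this sketch.
struct Meter {
    used: u64,
    max: u64,
}

impl Meter {
    /// Charge `amount`; fail if the limit would be exceeded.
    fn charge(&mut self, amount: u64) -> Result<(), ()> {
        let new = self.used.saturating_add(amount);
        if new > self.max {
            return Err(());
        }
        self.used = new;
        Ok(())
    }
}

/// A runtime that already knows the exact weight of a call can bypass
/// fine-grained metering by charging it in one step...
fn execute_premeasured(meter: &mut Meter, known_weight: u64) -> Result<(), ()> {
    meter.charge(known_weight)?;
    // ... run the call without further metering ...
    Ok(())
}

/// ...whereas a pre-measured-only system offers no equivalent way to
/// charge step by step as execution proceeds.
fn execute_metered(meter: &mut Meter, step_costs: &[u64]) -> Result<(), ()> {
    for cost in step_costs {
        meter.charge(*cost)?; // charge as we go
        // ... perform the step ...
    }
    Ok(())
}
```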

Benchmarking pushes overhead to developers.

Benchmarking is quite a laborious process, especially with more complex pallets.
It is a large blocker between building an idea and deploying a product that is relatively safe to use.

If we want to keep the Polkadot SDK competitive for innovators and builders, we cannot impose this large overhead where other existing platforms do not.

Benchmarking can be extremely pessimistic.

Because we need to assume the worst-case situation for every extrinsic, the final calculated weight for a block can be much higher than the time it actually takes to execute that block. It was previously calculated that a block full of only transactions uses only 60% of its total pre-calculated weight.

Even if extrinsics use weight refunds, it is likely that we won't optimally fill blocks, because we only include an extrinsic in a block if its worst-case weight would allow it to fit, not its final weight.
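As a rough illustration, with numbers invented for the example: given a 2s ref_time budget and transactions benchmarked at 20ms worst case, block building stops at 100 transactions. If each actually executes in 12ms (60% of the worst case), the block runs in 1.2s, and the remaining 0.8s of real capacity is never offered to additional transactions, because admission was decided on the worst-case figure.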

High Level Solutions

At the base Runtime API, support Weight Metering

Block and extrinsic execution in the runtime should take a max_weight parameter, and fail to execute if the metered weight is higher than max_weight.

Perhaps this should be an Option, where None can be provided for backwards compatibility and the runtime is forced to fall back to pre-calculated weights.
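A minimal sketch of what such an API shape could look like (names invented for this sketch; the `Weight` alias stands in for `sp_weights::Weight` to keep it self-contained):

```rust
/// Stand-in for `sp_weights::Weight` to keep the sketch self-contained.
type Weight = u64;

/// Returned when the metered weight went past `max_weight`.
struct OutOfWeight;

/// Hypothetical shape of a metered block-builder runtime API
/// (invented for this sketch; not an existing Polkadot-SDK trait).
trait MeteredBlockBuilder {
    /// Apply an encoded extrinsic, halting if the metered weight exceeds
    /// `max_weight`. `None` preserves today's behaviour: the runtime is
    /// expected to enforce its own pre-calculated weights.
    fn apply_extrinsic_metered(
        extrinsic: Vec<u8>,
        max_weight: Option<Weight>,
    ) -> Result<Weight, OutOfWeight>;
}
```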

Runtime Should Support Panics

It seems that in order for metering to work at all, we would need to be able to halt extrinsic execution abruptly when the metered weight goes beyond the expected max_weight. A panic is the right tool for this, correct?

In any case, allowing panics in the runtime would also improve the developer experience, since accidental panics are a major area where a runtime developer can make a mistake and leave their chain vulnerable to attack.
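A minimal sketch of the panic path (hypothetical; a real design would need host-side trap handling and state rollback):

```rust
/// Hypothetical weight meter that halts execution via panic.
struct WeightMeter {
    used: u64,
    max: u64,
}

impl WeightMeter {
    /// Charge `amount`, aborting the extrinsic if the limit is exceeded.
    fn charge(&mut self, amount: u64) {
        self.used = self.used.saturating_add(amount);
        if self.used > self.max {
            // The host would catch the resulting trap, discard the
            // extrinsic's state changes, and record it as out-of-weight.
            panic!("max_weight exceeded: used {} > max {}", self.used, self.max);
        }
    }
}
```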

Weight Metered Database

It is not my suggestion that we provide full weight metering to all execution in the runtime. This would just bring us back to the performance of smart contracts.

Instead, I suggest we create a special DB layer which provides very specific weight information about database access as it happens during runtime execution.

We know that DB operations account for the majority of weight costs in the runtime, and that usually the number of DB operations is also quite low. (We should do basic analysis of existing pre-metered weights to back this up tangibly).

If we only meter the database, and assume that other execution is nominal, then we can get a very high performance environment with high accuracy.

The DB layer could provide very specific details, like exactly where the item exists in the Merkle trie (depth, size, neighboring children, whether it or other neighboring children have already been cached, etc.). Then, with really comprehensive database benchmarks, we can dynamically meter how much weight each data operation would cost.

Perhaps it is possible to forgo this minimal overhead when pre-calculated weights already exist, or this mechanism could be used to automatically provide weight refunds when we know the DB weights are overestimated.
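A minimal sketch of such a DB layer (hypothetical types; the constant costs are placeholders where real figures would come from the comprehensive database benchmarks mentioned above):

```rust
use std::collections::{HashMap, HashSet};

// Placeholder costs; real values would come from database benchmarks.
const UNCACHED_READ_WEIGHT: u64 = 25_000; // full trie lookup
const CACHED_READ_WEIGHT: u64 = 1_000;    // hit in the node cache
const WRITE_WEIGHT: u64 = 100_000;        // trie write + commit

/// Hypothetical metered storage layer sitting behind runtime execution.
struct MeteredDb {
    storage: HashMap<Vec<u8>, Vec<u8>>,
    cached: HashSet<Vec<u8>>,
    weight_used: u64,
}

impl MeteredDb {
    fn get(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        // Price the access based on what we know about the trie state;
        // a real meter could also account for depth, value size, and
        // neighboring nodes already in the cache.
        let cost = if self.cached.contains(key) {
            CACHED_READ_WEIGHT
        } else {
            UNCACHED_READ_WEIGHT
        };
        self.weight_used += cost;
        self.cached.insert(key.to_vec());
        self.storage.get(key).cloned()
    }

    fn set(&mut self, key: &[u8], value: Vec<u8>) {
        self.weight_used += WRITE_WEIGHT;
        self.cached.insert(key.to_vec());
        self.storage.insert(key.to_vec(), value);
    }
}
```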

Handling Execution Weight

With a metered database, I suspect we will calculate a majority of the weight used in a block / extrinsic.

However, to get full safety, we can provide a few different tools:

Custom Additional Weight

We already provide APIs for runtime developers to manually add more weight during extrinsic execution. This can be used to increase the weight where we know that the metered database is not enough.

In fact, the benchmarking system already splits benchmarking between Wasm execution and the database operations, so we already provide a method for users to actually discover the "missing" weight.
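For reference, an abbreviated call-site sketch of that existing mechanism, as used from inside a pallet call body (assumes a FRAME pallet context with `T: frame_system::Config`; import paths vary between SDK versions, and the figure is a placeholder):

```rust
use frame_support::{dispatch::DispatchClass, weights::Weight};

/// Fragment from inside a pallet call body: top up the metered figure
/// with computation cost the DB meter cannot see.
fn top_up_weight<T: frame_system::Config>() {
    frame_system::Pallet::<T>::register_extra_weight_unchecked(
        Weight::from_parts(10_000, 0), // placeholder: benchmarked Wasm cost
        DispatchClass::Normal,
    );
}
```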

Custom Weight Buffering

We could also allow runtime developers to add their own custom "weight buffer" to keep their extrinsics safer. For example, we could add an additional 20% overhead to the weight returned by the metered database.
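A sketch of such a helper (plain u64 standing in for Weight):

```rust
/// Hypothetical helper: pad a metered weight by a percentage safety margin
/// before charging it, e.g. `buffered(db_weight, 20)` charges 120%.
fn buffered(metered: u64, percent_overhead: u64) -> u64 {
    metered.saturating_mul(100 + percent_overhead) / 100
}
```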

@gui1117

gui1117 commented Aug 17, 2024

> The DB layer could provide very specific details, like exactly where the item exists in the Merkle trie (depth, size, neighboring children, whether it or other neighboring children have already been cached, etc.). Then, with really comprehensive database benchmarks, we can dynamically meter how much weight each data operation would cost.

This would require specifying an abstract database architecture with a specific cache size, and enforcing that all clients have a database compatible with this abstraction.
Maybe just enforcing one cache (without all the neighboring-children, depth, and size information) could already be a huge improvement.

Also, if we have more memory in the runtime with PVM, we could implement some caching inside the runtime itself, I guess.

I think we can do this RFC in multiple steps:

  • 1- Tracking operations: have a better automatic refund (see the sketch after this list).

    • inside the runtime: keep track of the most time-intensive runtime-interface calls: crypto, hashing, Storage, Trie.

    • benchmark them independently from the ref_time. (we already do this for Storage)

    • have the weight related to them separated from ref_time in the dispatch info (e.g. tracked_operation_ref_time).

    • tracked_operation_ref_time gets an automatic precise refund at the end of the transaction by calculating the actual weight from the tracked calls.

    • (maybe we could even track wasm execution in tracked_operation_ref_time, as I heard the expensive part of metering is the branching when gas is exceeded, not actually counting the number of operations, but it feels difficult for less gain, also considering PVM)

  • 2- Tracking Storage more precisely: even better refunds

    • enforce an architecture on the database: maybe just a cache size, or what you propose with neighbors, depth, etc.

    • add this information to the Storage runtime interface.

    • use this information in the tracking and refund.

  • 3- Have metering, and stop execution when the max weight is reached.
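A rough sketch of step 1 (hypothetical names and placeholder costs, including tracked_operation_ref_time): count each expensive runtime-interface call during the transaction, reprice them post-dispatch, and refund the difference against the pre-charged figure.

```rust
/// Hypothetical tracker for time-intensive runtime-interface calls.
#[derive(Default)]
struct TrackedOps {
    storage_reads: u64,
    storage_writes: u64,
    hashes: u64,
    sig_verifications: u64,
}

// Placeholder per-operation costs, benchmarked independently of ref_time.
const READ_COST: u64 = 25_000;
const WRITE_COST: u64 = 100_000;
const HASH_COST: u64 = 2_000;
const SIG_COST: u64 = 50_000;

impl TrackedOps {
    /// Actual cost of the tracked operations, computed post-dispatch.
    fn actual_cost(&self) -> u64 {
        self.storage_reads.saturating_mul(READ_COST)
            .saturating_add(self.storage_writes.saturating_mul(WRITE_COST))
            .saturating_add(self.hashes.saturating_mul(HASH_COST))
            .saturating_add(self.sig_verifications.saturating_mul(SIG_COST))
    }

    /// Automatic refund against the pre-charged tracked_operation_ref_time.
    fn refund(&self, precharged: u64) -> u64 {
        precharged.saturating_sub(self.actual_cost())
    }
}
```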

@ggwpez
Member

ggwpez commented Aug 17, 2024

The PVM will allow for deterministic metering. I think the longer-term goal is to recompile the runtimes to PVM and then use that.
I am not sure it's worth putting a lot of effort in earlier. Wasm is just fundamentally flawed in this regard (being a stack machine). The PVM story is probably a year out, though.

@gui1117

gui1117 commented Aug 17, 2024

This keeps open the question of "Weight Metered Database" or the point (2) in my comment.

Or, if PVM allows more memory in the runtime, we could maybe implement a cache inside the runtime.
