Add `benchmark overhead` to `frame-omni-bencher` #5303

skunert · 2024-08-09T13:25:56Z

The benchmark overhead command should be implemented for frame-omni-bencher.

Currently its not clear how to run benchmark overhead for a parachain.

Omni bencher currently does not support the command.

polkadot-sdk/substrate/utils/frame/omni-bencher/src/command.rs

Line 134 in 6911f50

"Only the `v1 benchmark pallet` command is currently supported".into()

Polkadot-parachain does not support it either.

polkadot-sdk/cumulus/polkadot-parachain/src/command.rs

Lines 489 to 513 in 6911f50

    
           			match cmd { 
        
           				#[cfg(feature = "runtime-benchmarks")] 
        
           				BenchmarkCmd::Pallet(cmd) => runner.sync_run(|config| { 
        
           					cmd.run_with_spec::<HashingFor<Block>, ReclaimHostFunctions>(Some( 
        
           						config.chain_spec, 
        
           					)) 
        
           				}), 
        
           				BenchmarkCmd::Block(cmd) => runner.sync_run(|config| { 
        
           					let node = new_node_spec(&config, &cli.node_extra_args())?; 
        
           					node.run_benchmark_block_cmd(config, cmd) 
        
           				}), 
        
           				#[cfg(feature = "runtime-benchmarks")] 
        
           				BenchmarkCmd::Storage(cmd) => runner.sync_run(|config| { 
        
           					let node = new_node_spec(&config, &cli.node_extra_args())?; 
        
           					node.run_benchmark_storage_cmd(config, cmd) 
        
           				}), 
        
           				BenchmarkCmd::Machine(cmd) => 
        
           					runner.sync_run(|config| cmd.run(&config, SUBSTRATE_REFERENCE_HARDWARE.clone())), 
        
           				#[allow(unreachable_patterns)] 
        
           				_ => Err("Benchmarking sub-command unsupported or compilation feature missing. \ 
        
           					Make sure to compile with --features=runtime-benchmarks \ 
        
           					to enable all supported benchmarks." 
        
           					.into()), 
        
           			} 
        
           		},

And you can not use the polkadot binary because it is not having the same hostfunctions enabled (e.g. ext_storage_proof_size_storage_proof_size_version_1).

Noticed while playing with #5283

The text was updated successfully, but these errors were encountered:

ggwpez · 2024-08-09T14:44:26Z

Polkadot-parachain does not support it either.

Oh... adding it to the omni-bencher would be good long term, yes. We can probably re-use some mechanisms from the try-runtime-cli since that is also building blocks.
The annoying part are the Inherent Data Providers, since they are on the node side and we dont know in advance what inherent data the runtime needs.
So the omni-bencher needs to inject all possible IDPS that we have (and also the ones from ORML etc). Otherwise if one inherent is missing then the block will not build.

skunert · 2024-08-09T15:28:27Z

For a start, the omni-bencher could support what the omni-node will support. That should be fine for most use cases.

ggwpez · 2024-08-10T14:12:53Z

For a start, the omni-bencher could support what the omni-node will support. That should be fine for most use cases.

WDYM? IIUC the omni-node would just call into the shared omni-bencher code. This de-duplicates all of this logic and puts it into one place.
But now thinking about it, the omni-node would be a superset of the omni-bencher. So maybe the omni-bencher would not be needed anymore 🤔
Currently it is nice to have though, since it compiles faster in CI and does not require any features to be enabled.

skunert · 2024-08-15T13:40:05Z

My impression was that we will have no benchmarking related code in the omni node and people will use the omni bencher for benchmarking exclusively.

This is also what is outlined here: #4966

ggwpez · 2024-08-15T14:08:09Z

My impression was that we will have no benchmarking related code in the omni node and people will use the omni bencher for benchmarking exclusively.

Yes even better. I thought you meant the opposite with this:

For a start, the omni-bencher could support what the omni-node will support. That should be fine for most use cases.

But nevermind, seems like we think the same thing.

kianenigma · 2024-08-29T10:57:07Z

On the benchmark side, everything should be possible to do in frame-omni-bencher.

As noted, omni-node will have no old-school, runtime-coupled benchmarking abilities.

As noted here, we could embed frame-omni-bencher as-is into the omni-node as well.

skunert · 2024-08-30T16:22:59Z

I investigated the steps needed for an omni-node friendly benchmark-overhead command.

Context

The benchmark overhead command has the job of providing us with two things:

Base block weight
Base extrinsic weight

To calculate these, it executes a block and monitors its time and the proof recorded during this time. In addition, it will execute a System::remark extrinsic. Historically, only the polkadot binary supported this command; polkadot-parachain never supported it. I am even skeptical if this value was even benchmarked for most runtimes since I only see the same value everywhere. Maybe it was a reasonable guess, and we copied it everywhere?

Moving Parts

There are multiple moving parts that make this work for arbitrary runtimes.

Inherents

Since we want to support multiple runtimes, we need to supply all the inherents the runtime expects. Most parachain runtimes will have similar inherent requirements. The runtime already provides the inherents to the block-builder. All we need to do is populate the InherentData with the required values for the most popular inherents.

Building the Extrinsic

To support multiple runtimes, we must build the extrinsic dynamically based on the runtime's metadata. My first intuition is to employ subxt, which should be able to build this simple extrinsic dynamically. Alternatively, we could let users provide bytes of the extrinsic to execute.

Block Building & Client

The actual block building and client acquisition should be easy. We can use the same FakeRuntimeApi strategy that we already use in the omni-node.

Initial State

Ideally, the runtime should provide the initial state. As previously discussed, we could rely on some dev profile by default. This profile provides us with a valid genesis state on which to build. Eventually, we should inject more accounts into the genesis state to make the benchmark results more accurate. We also need to ensure that the account we signed the remark with is in the genesis state.

Question: How many accounts do we want to inject?

User Facing

The user interface can extend the benchmark overhead command v1. We also need something like --runtime, similar to pallet benchmarking. For now, we can continue to support the --chain arg to support chains that do not provide any genesis presets in the runtime.

One question that remains open for now is how to handle chains that have custom inherents or more exotic setups. We can likely ignore this in the initial version. Still, we could allow parachain teams extension of inherents and transaction building by having a minimal API, similar to what polkadot-parachain-lib is exposing now. This would mean that not everything is possible with omni-bencher, and integration into omni-node is a must.

For an initial iteration, there is also the idea of providing something like a --tx-bytes flag that allows users to pass the bytes of a prepared transaction to the benchmarking.

ggwpez · 2024-09-09T15:09:46Z

One question that remains open for now is how to handle chains that have custom inherents or more exotic setups. We can likely ignore this in the initial version. Still, we could allow parachain teams extension of inherents and transaction building by having a minimal API, similar to what polkadot-parachain-lib is exposing now. This would mean that not everything is possible with omni-bencher, and integration into omni-node is a must.

Either that, or they fork the omni-bencher, or there is a --inherent-data KEY:VALUE argument that can be used to inject any arbitrary inherent data.

For an initial iteration, there is also the idea of providing something like a --tx-bytes flag that allows users to pass the bytes of a prepared transaction to the benchmarking.

Yea good idea.

Question: How many accounts do we want to inject?

Probably should be a CLI command with a default. Currently the PoV benchmarking assumes a million entries per map - but it does not actually populate the maps, just extrapolates the proof size.

gui1117 · 2024-10-26T11:10:06Z

With TransactionExtension being now weighted, we want to change how to calculate the extrinsic overhead not to include the transaction extensions.
The only extrinsic without extensions are bare extrinsics. Unfortunately there is no equivalent of remark but as a valid bare extrinsic in system.
I suggest we have a new valid bare extrinsic only when the runtime is compiled with runtime-benchmarks that we would use for benchmarking the extrinsic overhead.
I am going to start the work on top of the PR.
I will modify ExtrinsicBuilder::build so that we can build signed or bare transaction depending on a new argument.
Add validation for a new no-op call in system when compiled with runtime-benchmarks.
Then benchmark overhead will build bare transaction of this new specific call.
But if we only ever want to benchmark the extrinsic overhead then maybe we can simplify stuff.
If we still want to benchmark like blocks full of signed transfer then current architecture as in the PR is the best.
EDIT: but if we benchmark for a bare extrinsic, we still need to add the weight of signature verification of signed extrinsic :-/, I don't have a full solution yet.

Maybe the easiest will be to just substract the post-dispatch weight of remark + all transaction extensions to the result of the extrinsic overhead benchmark. Then this gets the real overhead without the call.
To get programmatically the post dispatch weight of remark call + all transaction extensions is not easy, but it can be doable.

# Benchmark Overhead Command for Parachains This implements the `benchmark overhead` command for parachains. Full context is available at: #5303. Previous attempt was this #5283, but here we have integration into frame-omni-bencher and improved tooling. ## Changes Overview Users are now able to use `frame-omni-bencher` to generate `extrinsic_weight.rs` and `block_weight.rs` files for their runtime. The core logic for generating these remains untouched; this PR provides mostly machinery to make it work for parachains at all. Similar to the pallet benchmarks, we gain the option to benchmark based on just a runtime: ``` frame-omni-bencher v1 benchmark overhead --runtime {{runtime}} ``` or with a spec: ``` frame-omni-bencher v1 benchmark overhead --chain {{spec}} --genesis-builder spec ``` In this case, the genesis state is generated from the runtime presets. However, it is also possible to use `--chain` and genesis builder `spec` to generate the genesis state from the chain spec. Additionally, we use metadata to perform some checks based on the pallets the runtime exposes: - If we see the `ParaInherent` pallet, we assume that we are dealing with a relay chain. This means that we don't need proof recording during import (since there is no storage weight). - If we detect the `ParachainSystem` pallet, we assume that we are dealing with a parachain and take corresponding actions like patching a para id into the genesis state. On the inherent side, I am currently supplying the standard inherents every parachain needs. In the current state, `frame-omni-bencher` supports all system chains. In follow-up PRs, we could add additional inherents to increase compatibility. Since we are building a block during the benchmark, we also need to build an extrinsic. By default, I am leveraging subxt to build the xt dynamically. If a chain is not compatible with the `SubstrateConfig` that comes with `subxt`, it can provide a custom extrinsic builder to benchmarking-cli. This requires either a custom bencher implementation or an integration into the parachains node. Also cumulus-test-runtime has been migrated to provide genesis configs. ## Chain Compatibility The current version here is compatible with the system chains and common substrate chains. The way to go for others would be to customize the frame-omni-bencher by providing a custom extrinsicbuilder. I did an example implementation that works for mythical: https://github.com/skunert/mythical-bencher ## Follow-Ups - After #6040 is finished, we should integrate this here to make the tooling truly useful. In the current form, the state is fairly small and not representative. ## How to Review I recommend starting from [here](https://github.com/paritytech/polkadot-sdk/pull/5891/files#diff-50830ff756b3ac3403b7739d66c9e3a5185dbea550669ca71b28d19c7a2a54ecR264), this method is the main entry point for omni-bencher and `polkadot` binary. TBD: - [x] PRDoc --------- Co-authored-by: Michal Kucharczyk <[email protected]>

kianenigma · 2024-11-05T12:02:48Z

closed now.

wilwade mentioned this issue Aug 13, 2024

[Chore]: Add Omni Bencher frequency-chain/frequency#2123

Closed

kianenigma mentioned this issue Sep 3, 2024

polkadot-omni-node: Fully remove benchmark sub-command #5564

Open

skunert mentioned this issue Oct 1, 2024

Add overhead benchmark to frame-omni-bencher #5891

Merged

1 task

kianenigma mentioned this issue Oct 28, 2024

Custom Blockchains > Benchmarking polkadot-developers/polkadot-docs#121

Open

kianenigma closed this as completed Nov 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `benchmark overhead` to `frame-omni-bencher` #5303

Add `benchmark overhead` to `frame-omni-bencher` #5303

skunert commented Aug 9, 2024

ggwpez commented Aug 9, 2024 •

edited

Loading

skunert commented Aug 9, 2024

ggwpez commented Aug 10, 2024 •

edited

Loading

skunert commented Aug 15, 2024

ggwpez commented Aug 15, 2024

kianenigma commented Aug 29, 2024

skunert commented Aug 30, 2024

ggwpez commented Sep 9, 2024

gui1117 commented Oct 26, 2024 •

edited

Loading

kianenigma commented Nov 5, 2024

Add benchmark overhead to frame-omni-bencher #5303

Add benchmark overhead to frame-omni-bencher #5303

Comments

skunert commented Aug 9, 2024

ggwpez commented Aug 9, 2024 • edited Loading

skunert commented Aug 9, 2024

ggwpez commented Aug 10, 2024 • edited Loading

skunert commented Aug 15, 2024

ggwpez commented Aug 15, 2024

kianenigma commented Aug 29, 2024

skunert commented Aug 30, 2024

Context

Moving Parts

Inherents

Building the Extrinsic

Block Building & Client

Initial State

User Facing

ggwpez commented Sep 9, 2024

gui1117 commented Oct 26, 2024 • edited Loading

kianenigma commented Nov 5, 2024

Add `benchmark overhead` to `frame-omni-bencher` #5303

Add `benchmark overhead` to `frame-omni-bencher` #5303

ggwpez commented Aug 9, 2024 •

edited

Loading

ggwpez commented Aug 10, 2024 •

edited

Loading

gui1117 commented Oct 26, 2024 •

edited

Loading