Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VM: add simple benchmarking tool for mainnet blocks #794

Merged
merged 9 commits into from
Aug 19, 2020
Merged

Conversation

s1na
Copy link
Contributor

@s1na s1na commented Jun 26, 2020

This helps us see how the VM performs when running mainnet blocks. The script is still WIP and has a few known issues, but it's good for a rough estimate. To use it build the VM and run:

node ./dist/benchmarks/mainnetBlocks.js lib/benchmarks/fixture/blocks-prestate.json

If you want to get a more detailed look to find bottlenecks we can use 0x. So run:

npm i -g 0x
0x dist/benchmarks/mainnetBlocks.js lib/benchmarks/fixture/blocks-prestate.json

and open the link it generates.

@codecov
Copy link

codecov bot commented Jun 26, 2020

Codecov Report

Merging #794 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

Flag Coverage Δ
#account 92.85% <ø> (ø)
#block 77.81% <ø> (-0.04%) ⬇️
#blockchain 81.33% <ø> (+0.30%) ⬆️
#common 93.98% <ø> (-0.16%) ⬇️
#ethash 83.33% <ø> (ø)
#tx 94.16% <ø> (+0.13%) ⬆️
#vm 92.80% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

@holgerd77
Copy link
Member

Hi @s1na, great start on this! 😄

Can we move this out of lib on a final setup, otherwise this will bloat bundle size too much?

@evertonfraga evertonfraga marked this pull request as draft June 30, 2020 12:05
@evertonfraga evertonfraga changed the title [WIP] VM: add simple benchmarking tool for mainnet blocks VM: add simple benchmarking tool for mainnet blocks Jun 30, 2020
@s1na
Copy link
Contributor Author

s1na commented Jun 30, 2020

Can we move this out of lib on a final setup, otherwise this will bloat bundle size too much?

Yeah I did that first but couldn't get it quickly working with the current tsconfig and didn't want to struggle too much with that. I can look into it again before merging or would appreciate if somebody has ideas how I can change tsconfig. I didn't want to use npx ts-node because that shows up in the 0x profile and want to run the compiled js directly

s1na and others added 2 commits June 30, 2020 20:06
add `tsconfig.benchmarks.json` and npm run script `build:benchmarks`
add benchmarks instructions to `developer.md`
@ryanio
Copy link
Contributor

ryanio commented Jul 1, 2020

ok great in ee21a7b I did the following:

  1. rebased on master
  2. moved benchmarks dir to scripts dir
  3. added a tsconfig to build the file so all you need to run is npm run build:benchmarks
  4. added your instructions to developer.md

@ryanio
Copy link
Contributor

ryanio commented Jul 1, 2020

The fixture/block-prestate.json file is 79.1mb so I was getting cli warnings from github:

remote: warning: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: warning: See http://git.io/iEPt8g for more information.
remote: warning: File packages/vm/scripts/benchmarks/fixture/blocks-prestate.json is 75.40 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB

Can we get this data from Infura? If so I could write a block fetcher script to eliminate the need for storing the fixture.

@s1na
Copy link
Contributor Author

s1na commented Jul 1, 2020

Oh amazing thanks a lot for doing that!

Re the fixture file: I have the scripts already but we unfortunately can't use Infura as it doesn't have the debug_ api enabled. You need a synced node. I ignored the warnings and it still seems to work! But we could either use Git LFS (and add to developer docs and people can fetch the file from lfs) or I can try to bring the file size down. Options I see:

  • Compress file (won't have git history, but don't think that matters much)
  • Trim down number of blocks from 50 to 20 or so
  • I can also change the format and have one prestate for all of the blocks instead of a prestate for each block. Not sure if this will give significant saving but I can try

@holgerd77
Copy link
Member

Added an additional CL option here to determine the number of samples run, together with some other UX improvements.

One thing which catches the eye is that block processing times increase close to linearly with the number of samples run, some output:

30 Blocks

Total number of blocks: 50
Blocks to proceed: 30
running block 9422905
Running block 9422905 took 46 s, 577.091 ms
running block 9422906

3 Blocks

Total number of blocks: 50
Blocks to proceed: 3
running block 9422905
Running block 9422905 took 6 s, 41.155 ms
running block 9422906
Running block 9422906 took 7 s, 60.951 ms
running block 9422907
Running block 9422907 took 9 s, 386.196 ms

(just a small extract, but this is repeatable)

Cause is likely due to the preState being aggregated before block run, since this causes a larger initial state. I nevertheless wonder if this is already a hint to a strong performance bottleneck, since having a linear run time increase here together with a linear state size increase seems very much too much to me.

@s1na Side question: was this an intentional decision to aggregate this preState all before? Wouldn't the results be more repeatable if every block preState is just added/replaced before the respective block execution?

@holgerd77
Copy link
Member

holgerd77 commented Jul 10, 2020

Flamegraphs on this are actually impressive, the following is the 3 blocks graph:

flamegraph_3

The following the 30 block graph (with an additional conditional break after 3 blocks for comparison):

flamegraph_30

This is the source of the single most impressively grown graph block, being in MPT:

flamegraph_30_source

This refers to the TaskExecutor.execute() call from MPT BaseTrie in the _walkTrie() function.

For reference: this is the commit @s1na mentioned being the likely source for speed deterioration in MPT (both necessarily doesn't need to be completely logically interlinked though and can at least partly point to different problems).

@holgerd77
Copy link
Member

Played around with this for 2 hours, some observations and ideas.

One preceeding note: the MPT _walkTrie code with all these nested callbacks and passed-on function executions is absurdly complicated to follow and it is close to impossible to identify structural suboptimalities. This is in SUPER need for a rewrite/refactor, maybe some of the most important ones we have right now I would say.

Ok, on to the performance topic:

  1. The maxPoolSize from MPT used for the task executor seems to have a significant impact on the performance (especially if set significantly lower, e.g. to 20), not completely confirmed but this might be worth for some more structured analysis
  2. I wonder why getting and writing contractCode in StateManager (putContractCode(), getContractCode()` is not going through any cache but always directly operating on the trie. Is there any reason for it? //cc @s1na This would likely have a substantial performance impact otherwise
  3. Most trie accesses (apart from the contract code ones from above) happen in StateManager._cache.flush(). I experimented a bit with using the batch functionality from MPT (see code below) but this didn't speed up things significantly. Might be worth thinking about a more advanced batch implementation on the MPT side (like sorting and taking common paths only once? not sure if practical) but this will likely be a somewhat complex implementation
  4. Not doing the state.commit() in runTx() is actually speeding things up by a factor of 100. I wonder if it's worth providing this as an option for people who just don't need an updated trie after a tx (or block) run? Or am I missing something?

Ok, so far. 🙂 Yes, I can confirm that this is fun. 😜

Tried this for flush():

/**
   * Flushes cache by updating accounts that have been modified
   * and removing accounts that have been deleted.
   */
  async flush(): Promise<void> {
    const it = this._cache.begin
    let next = true
    const batch = []
    while (next) {
      if (it.value && it.value.modified) {
        it.value.modified = false
        it.value.val = it.value.val.serialize()
        //console.log('Written to trie (flush)')
        // await this._trie.put(Buffer.from(it.key, 'hex'), it.value.val)
        batch.push({ type: 'put', key: Buffer.from(it.key, 'hex'), value: it.value.val})
        next = it.hasNext
        it.next()
      } else if (it.value && it.value.deleted) {
        it.value.modified = false
        it.value.deleted = false
        it.value.val = new Account().serialize()
        //console.log('Deleted from trie (flush)')
        // await this._trie.del(Buffer.from(it.key, 'hex'))
        batch.push({ type: 'del', key: Buffer.from(it.key, 'hex') })
        next = it.hasNext
        it.next()
      } else {
        next = it.hasNext
        it.next()
      }
    }
    await this._trie.batch(batch)
  }

@holgerd77
Copy link
Member

Once we get to some stable ground here we should likely also reapply this PR towards a to be created vm/4.x branch branching off the latest v4.2.0 release so that we have some base for comparative runs.

@holgerd77
Copy link
Member

Blog post on how to post a comment from a JS script with GitHub Actions: https://dev.to/koushikmohan1996/writing-your-first-github-action-42cg

Should be doable.

Not sure about the part on how to get results from the target branch (normally: master) and do a comparison within the script against the PR branch results.

@holgerd77 holgerd77 mentioned this pull request Aug 9, 2020
27 tasks
@ryanio ryanio mentioned this pull request Aug 10, 2020
holgerd77
holgerd77 previously approved these changes Aug 19, 2020
Copy link
Member

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@holgerd77 holgerd77 marked this pull request as ready for review August 19, 2020 20:34
Copy link
Member

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renewed approval after editing merge conflict

@holgerd77 holgerd77 merged commit 780fb78 into master Aug 19, 2020
@holgerd77 holgerd77 deleted the vm/benchmark branch August 19, 2020 20:34
@holgerd77
Copy link
Member

@ryanio Sorry, I was eventually a bit too much in a merge-mode yesterday and didn't check too well 😬 , just gave this another test locally and the script actually doesn't come to an end (on my machine), so this is my output:

$ npm run benchmarks

> @ethereumjs/[email protected] benchmarks /ethereumjs-vm/packages/vm
> node --max-old-space-size=4096 ./scripts/benchmarks/mainnetBlocks.js scripts/benchmarks/fixture/blocks-prestate.json

Total number of blocks in data set: 15
Number of blocks to sample: 15
Block 9422905 x 1,879 ops/sec ±8.36% (81 runs sampled)
Block 9422906 x 2,070 ops/sec ±1.05% (89 runs sampled)
Block 9422907 x 1,535 ops/sec ±14.69% (69 runs sampled)
Block 9422908 x 1,624 ops/sec ±6.96% (73 runs sampled)
Block 9422909 x 1,856 ops/sec ±3.67% (83 runs sampled)
Block 9422910 x 1,938 ops/sec ±1.23% (91 runs sampled)
Block 9422911 x 766 ops/sec ±20.47% (36 runs sampled)
Block 9422912 x 1,881 ops/sec ±1.16% (91 runs sampled)
Block 9422913 x 1,845 ops/sec ±1.38% (86 runs sampled)
Block 9422914 x 1,792 ops/sec ±2.72% (82 runs sampled)
Block 9422915 x 618 ops/sec ±17.75% (33 runs sampled)
Block 9422916 x 344 ops/sec ±73.57% (34 runs sampled)
Block 9422917 x 1,648 ops/sec ±7.34% (83 runs sampled)
Block 9422918 x 1,711 ops/sec ±3.01% (78 runs sampled)
Block 9422919 x 1,750 ops/sec ±1.19% (87 runs sampled)

Here, the script comes to a halt and doesn't continue or finish.

Also, the CI run on master failed, seems there need to be the package-lock.json changes in the working directory being discarded first.

@ryanio
Copy link
Contributor

ryanio commented Aug 20, 2020

@holgerd77 no worries! That is the end of the script actually, the results are then used for github-action-benchmarks

I opened #838 to resolve the package-lock issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants