Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport genesis speed ups to 0.41.x #8722

Merged
merged 3 commits into from
Mar 1, 2021

Conversation

zmanian
Copy link
Member

@zmanian zmanian commented Feb 27, 2021

Backports the genesis speed ups in #8719 to 0.41.x to improve node startup performance.

Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer
  • Review Codecov Report in the comment section below once CI passes

…code (#8719)

After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:
* Unconditional and not always necessary map lookups
* O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive,
during a quicksort

The remedy involved 4 parts:
* O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort
* Only doing map lookups when the domain key check has passed
* Using a black magic compiler technique of the map clearing idiom
* Zero allocation []byte<->string conversion

With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.

Also some simple benchmark reflect the change:
```shell
name                    old time/op    new time/op    delta
SanitizeBalances500-8     19.3ms ±10%     1.5ms ± 5%  -92.46%  (p=0.000 n=20+20)
SanitizeBalances1000-8    41.9ms ± 8%     3.0ms ±12%  -92.92%  (p=0.000 n=20+20)

name                    old alloc/op   new alloc/op   delta
SanitizeBalances500-8     9.05MB ± 6%    0.56MB ± 0%  -93.76%  (p=0.000 n=20+18)
SanitizeBalances1000-8    20.2MB ± 3%     1.1MB ± 0%  -94.37%  (p=0.000 n=20+19)

name                    old allocs/op  new allocs/op  delta
SanitizeBalances500-8      72.4k ± 6%      4.5k ± 0%  -93.76%  (p=0.000 n=20+20)
SanitizeBalances1000-8      162k ± 3%        9k ± 0%  -94.40%  (p=0.000 n=20+20)
```

The CPU profiles show the radical change as per
#7766 (comment)

Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.

Fixes #7766

Co-authored-by: Alessio Treglia <[email protected]>
@zmanian zmanian changed the title Backport genesis spedes ups to 0.41.x Backport genesis speed ups to 0.41.x Feb 27, 2021
@codecov
Copy link

codecov bot commented Feb 27, 2021

Codecov Report

Merging #8722 (3e01a38) into release/v0.41.x (2a5818c) will increase coverage by 0.03%.
The diff coverage is 62.71%.

Impacted file tree graph

@@                 Coverage Diff                 @@
##           release/v0.41.x    #8722      +/-   ##
===================================================
+ Coverage            61.93%   61.97%   +0.03%     
===================================================
  Files                  611      611              
  Lines                35250    35378     +128     
===================================================
+ Hits                 21831    21924      +93     
- Misses               11136    11155      +19     
- Partials              2283     2299      +16     
Impacted Files Coverage Δ
baseapp/grpcserver.go 2.70% <0.00%> (+0.13%) ⬆️
store/rootmulti/store.go 66.05% <0.00%> (ø)
testutil/network/network.go 90.54% <0.00%> (ø)
x/staking/keeper/keeper.go 56.66% <ø> (-1.40%) ⬇️
x/staking/keeper/validator.go 80.52% <ø> (+0.03%) ⬆️
client/grpc_query.go 29.16% <40.00%> (+0.90%) ⬆️
baseapp/grpcrouter.go 80.00% <50.00%> (-6.05%) ⬇️
x/auth/client/cli/tx_multisign.go 64.67% <68.46%> (+2.99%) ⬆️
store/cachekv/store.go 87.12% <71.42%> (-2.76%) ⬇️
codec/types/any.go 64.00% <100.00%> (+3.13%) ⬆️
... and 10 more

CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Member

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Have we tested this with gaia?

@alessio
Copy link
Contributor

alessio commented Feb 27, 2021

Not manually, not yet. But I will do shortly 👍

@zmanian
Copy link
Member Author

zmanian commented Feb 27, 2021

I think it's unlikely to be consensus breaking but we need synch cosmoshub-4

@tac0turtle
Copy link
Member

oh sorry, didnt mean in the sense of seeing if its breaks. Meant to see what sort of speed up it brings.

@alessio
Copy link
Contributor

alessio commented Feb 27, 2021

Testing it now. Just started a node:

alessio@phoenix:~/work/gaia$ gaiad --home=./gaiahome start
10:41PM INF starting ABCI with Tendermint
10:41PM INF Starting multiAppConn service impl=multiAppConn module=proxy
10:41PM INF Starting localClient service connection=query impl=localClient module=abci-client
10:41PM INF Starting localClient service connection=snapshot impl=localClient module=abci-client
10:41PM INF Starting localClient service connection=mempool impl=localClient module=abci-client
10:41PM INF Starting localClient service connection=consensus impl=localClient module=abci-client
10:41PM INF Starting EventBus service impl=EventBus module=events
10:41PM INF Starting PubSub service impl=PubSub module=pubsub
10:41PM INF Starting IndexerService service impl=IndexerService module=txindex
10:41PM INF ABCI Handshake App Info hash= height=0 module=consensus protocol-version=0 software-version=
10:41PM INF ABCI Replay Blocks appHeight=0 module=consensus stateHeight=0 storeHeight=0

Will keep you posted folks

CHANGELOG.md Outdated Show resolved Hide resolved
@alessio
Copy link
Contributor

alessio commented Feb 27, 2021

Tested. Node bootstrap is definitely much faster.

@zmanian
Copy link
Member Author

zmanian commented Feb 28, 2021

Thank you for testing on a Saturday

CHANGELOG.md Outdated Show resolved Hide resolved
@alessio alessio merged commit 5a6b846 into release/v0.41.x Mar 1, 2021
@alessio alessio deleted the zaki-backport-genesis-speed-up-to-0.41 branch March 1, 2021 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants