Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libp2p-noise handshake performance #4140

Closed
twoeths opened this issue Jun 10, 2022 · 7 comments · Fixed by #4114
Closed

libp2p-noise handshake performance #4140

twoeths opened this issue Jun 10, 2022 · 7 comments · Fixed by #4114
Assignees
Labels
prio-medium Resolve this some time soon (tm). scope-performance Performance issue and ideas to improve performance.

Comments

@twoeths
Copy link
Contributor

twoeths commented Jun 10, 2022

Investigate why libp2p-noise handshake takes > 7% of cpu time

0522_contabo_20_subscribe_all_subnets.cpuprofile.zip

Screen Shot 2022-06-10 at 17 16 09

@twoeths twoeths transferred this issue from ChainSafe/js-libp2p-gossipsub Jun 10, 2022
@dapplion
Copy link
Contributor

dapplion commented Jun 10, 2022

@wemeetagain wemeetagain changed the title libp2p-noice handshake performance libp2p-noise handshake performance Jun 10, 2022
@twoeths
Copy link
Contributor Author

twoeths commented Jun 20, 2022

this function takes time

Screen Shot 2022-06-20 at 10 08 20

@twoeths
Copy link
Contributor Author

twoeths commented Jun 20, 2022

@dapplion the hot path is at chaCha20Poly1305 encrypt/decrypt, not at x25519.js

Screen Shot 2022-06-20 at 10 46 21

Screen Shot 2022-06-20 at 10 47 22

@wemeetagain
Copy link
Member

looks like nonce and the DataView can be created once in the constructor and overwritten and returned each time.
not sure if its worth it given chachapoly1305 is the hot path.
Take a look at the last comment in #3465 where someone swaps in a native implementation of chachapoly1305 with our noise implementation.

@twoeths
Copy link
Contributor Author

twoeths commented Jun 21, 2022

nonceToBytes() takes > 1% of cpu time, it's all most the time we call as-sha256.digest() or uncompress gossip messages so I think it's worth to improve it if we can

Screen Shot 2022-06-21 at 09 58 25

Screen Shot 2022-06-21 at 10 05 55

right now we maintain an uint, call nonceToBytes() and increase it every time we encrypt/decrypt. We can instead maintain Uint8Array, no need to call nonceToBytes() and do the increasement manually with Uint8Array. Need to write a performance test to see if it's better

@twoeths
Copy link
Contributor Author

twoeths commented Jun 21, 2022

looks like nonce and the DataView can be created once in the constructor and overwritten and returned each time.
not sure if its worth it given chachapoly1305 is the hot path.

@wemeetagain yeah that's the best way, even better than we increase Uint8Array by 1 manually

Increase nonce from 1e6 to 2e6:

nonceToBytes x 2.20 ops/sec ±1.79% (10 runs sampled)
increaseBytes x 472 ops/sec ±0.49% (89 runs sampled)
maintain same DataView x 679 ops/sec ±0.64% (88 runs sampled)

this is the same optimization @dapplion did for ssz where he maintained both Uint8Array and DataView, in our a case we should represent a Nonce as an uint, Uint8Array and DataView

@dapplion
Copy link
Contributor

nonceToBytes() takes > 1% of cpu time, it's all most the time we call as-sha256.digest() or uncompress gossip messages so I think it's worth to improve it if we can

Agree it's worth improving it as the fix is really simple 👍

@twoeths twoeths self-assigned this Jun 22, 2022
@philknows philknows added prio-medium Resolve this some time soon (tm). scope-performance Performance issue and ideas to improve performance. labels Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
prio-medium Resolve this some time soon (tm). scope-performance Performance issue and ideas to improve performance.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants