Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: use an array for faster binding data lookup #46620

Closed
wants to merge 1 commit into from

Conversation

joyeecheung
Copy link
Member

@joyeecheung joyeecheung commented Feb 12, 2023

src: use an array for faster binding data lookup

Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.

Background: I was trying to move the encoding-related bindings to a new binding with its own BindingData and noticed ~15% regression in some TextEncoder benchmarks, and the hashing showed up in the profile. With this patch the regression goes away and the hashing no longer has any presence in the profile.

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/http
  • @nodejs/http2
  • @nodejs/net
  • @nodejs/startup

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Feb 12, 2023
@addaleax
Copy link
Member

but it's wasteful to spend time on hashing them at all, since we can just use the numeric type as the key

Do you know where this hashing happens in the stack/can share how you profiled this? The whole point of FastStringKey is that you don’t need to do any hashing at runtime.

@joyeecheung
Copy link
Member Author

Do you know where this hashing happens in the stack/can share how you profiled this? The whole point of FastStringKey is that you don’t need to do any hashing at runtime.

I used perf to profile a new binding that calls Environment::GetBindingData(), and the hashing just showed up on top of that binding (others frames in between are probably just inlined).

@addaleax
Copy link
Member

@joyeecheung Here’s my local assembly for http2::PackSettings:

in the fold
000000000098fba0 <node::http2::PackSettings(v8::FunctionCallbackInfo<v8::Value> const&)>:
  98fba0:       f3 0f 1e fa             endbr64 
  98fba4:       55                      push   %rbp
  98fba5:       48 89 e5                mov    %rsp,%rbp
  98fba8:       53                      push   %rbx
  98fba9:       48 89 fb                mov    %rdi,%rbx
  98fbac:       48 83 ec 08             sub    $0x8,%rsp
  98fbb0:       48 8b 07                mov    (%rdi),%rax
  98fbb3:       48 8b 78 08             mov    0x8(%rax),%rdi
  98fbb7:       e8 34 5f 1e 00          call   b75af0 <v8::Isolate::GetCurrentContext()>
  98fbbc:       31 d2                   xor    %edx,%edx
  98fbbe:       48 b9 97 8c 9a 0f 31    movabs $0x310f9a8c97,%rcx
  98fbc5:       00 00 00 
  98fbc8:       48 8b 00                mov    (%rax),%rax
  98fbcb:       48 8b 40 2f             mov    0x2f(%rax),%rax
  98fbcf:       48 8b b8 27 01 00 00    mov    0x127(%rax),%rdi
  98fbd6:       48 89 c8                mov    %rcx,%rax
  98fbd9:       48 f7 77 08             divq   0x8(%rdi)
  98fbdd:       48 89 d6                mov    %rdx,%rsi
  98fbe0:       48 8d 15 d9 23 39 04    lea    0x43923d9(%rip),%rdx        # 4d21fc0 <node::http2::Http2State::type_name>
  98fbe7:       e8 24 d1 f8 ff          call   91cd10 <std::_Hashtable<node::FastStringKey, std::pair<node::FastStringKey const, node::BaseObjectPtrImpl<node::BaseObject, false> >, std::allocator<std::pair<node::FastStringKey const, node::BaseObjectPtrImpl<node::BaseObject, false> > >, std::__detail::_Select1st, std::equal_to<node::FastStringKey>, node::FastStringKey::Hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_find_before_node(unsigned long, node::FastStringKey const&, unsigned long) const>
  98fbec:       48 89 c7                mov    %rax,%rdi
  98fbef:       48 85 c0                test   %rax,%rax
  98fbf2:       74 0c                   je     98fc00 <node::http2::PackSettings(v8::FunctionCallbackInfo<v8::Value> const&)+0x60>
  98fbf4:       48 8b 38                mov    (%rax),%rdi
  98fbf7:       48 85 ff                test   %rdi,%rdi
  98fbfa:       74 04                   je     98fc00 <node::http2::PackSettings(v8::FunctionCallbackInfo<v8::Value> const&)+0x60>
  98fbfc:       48 8b 7f 20             mov    0x20(%rdi),%rdi
  98fc00:       48 8b 1b                mov    (%rbx),%rbx
  98fc03:       e8 78 94 fe ff          call   979080 <node::http2::Http2Settings::Pack(node::http2::Http2State*)>
  98fc08:       48 85 c0                test   %rax,%rax
  98fc0b:       74 13                   je     98fc20 <node::http2::PackSettings(v8::FunctionCallbackInfo<v8::Value> const&)+0x80>
  98fc0d:       48 8b 00                mov    (%rax),%rax
  98fc10:       48 89 43 18             mov    %rax,0x18(%rbx)
  98fc14:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
  98fc18:       c9                      leave  
  98fc19:       c3                      ret    
  98fc1a:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  98fc20:       48 8b 43 10             mov    0x10(%rbx),%rax
  98fc24:       48 89 43 18             mov    %rax,0x18(%rbx)
  98fc28:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
  98fc2c:       c9                      leave  
  98fc2d:       c3                      ret    
  98fc2e:       66 90                   xchg   %ax,%ax

The hash (0x310f9a8c97) is hardcoded here (as it should be – it’s constexpr), so I’m wondering in what situations this would work differently?

@joyeecheung
Copy link
Member Author

joyeecheung commented Feb 12, 2023

This is the branch I used to find out the profile, by the way, revert the last two commits to go back to type_name, and use --prof/--prof-process to profile script like this:

See code
'use strict';

const encoder = new TextEncoder('utf-8');
const arr = new Uint8Array(1000);
const str = new Date().toISOString();

let result = [];
for (let i = 0; i < 1000000; ++i) {
  result = encoder.encodeInto(str, arr);
}

console.log(result);

And I'd get:

See code
 [C++]:
   ticks  total  nonlib   name
     22   11.4%   12.4%  node::builtins::BuiltinLoader::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
     21   10.9%   11.8%  node::encoding_binding::BindingData::EncodeInto(v8::FunctionCallbackInfo<v8::Value> const&)
     14    7.3%    7.9%  fputc@@GLIBC_2.2.5
     12    6.2%    6.7%  __write@@GLIBC_2.2.5
      7    3.6%    3.9%  std::_Hashtable<node::FastStringKey, std::pair<node::FastStringKey const, node::BaseObjectPtrImpl<node::BaseObject, false> >, std::allocator<std::pair<node::FastStringKey const, node::BaseObjectPtrImpl<node::BaseObject, false> > >, std::__detail::_Select1st, std::equal_to<node::FastStringKey>, node::FastStringKey::Hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_find_before_node(unsigned long, node::FastStringKey const&, unsigned long) const

(and some regressions, even without the AliasedBuffer change)

@joyeecheung
Copy link
Member Author

joyeecheung commented Feb 12, 2023

My guess is hard-coding is not the same as the hashing still needs to be done (e.g. it still needs to do what it does to handle collisions? so extra overhead). I didn't check out the STL implementation but now come to think of it, why do we even need a map when the keys are all just..sequential numbers (starting from 0, even)?

@joyeecheung
Copy link
Member Author

joyeecheung commented Feb 12, 2023

Updated the patch to just use an std::array. Also I think I found why the original hash was slow - there are collisions (notice how in the profile the actual function where time is spent is std::_Hashtable::_M_find_before_node which deals with collisions). Some logging showed:

See logs
AddBindingData node::process::BindingData
Bucket count 2
Bucket #0:
Bucket #1: [node::process::BindingData]
AddBindingData node::fs::BindingData
Bucket count 2
Bucket #0:
Bucket #1: [node::fs::BindingData][node::process::BindingData]
AddBindingData node::BlobBindingData
Bucket count 5
Bucket #0: [node::fs::BindingData]
Bucket #0:
Bucket #0:
Bucket #0:
Bucket #0: [node::BlobBindingData][node::process::BindingData]
AddBindingData node::encoding_binding::BindingData
Bucket count 5
Bucket #0: [node::encoding_binding::BindingData][node::fs::BindingData]
Bucket #1:
Bucket #2:
Bucket #3:
Bucket #4: [node::BlobBindingData][node::process::BindingData]

It might be possible to optimize the hash function of FastStringKey, but in the case of binding data, that's an overkill for problem as we could really just use the enums as keys and do not need to worry about collisions at all. And also because we could just use the enums as keys, we could just use an array, and that's as fast as it gets.

@joyeecheung joyeecheung changed the title src: use type_int as binding data store key src: use an array for faster binding data lookup Feb 12, 2023
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impressive, lgtm

@addaleax
Copy link
Member

It might be possible to optimize the hash function of FastStringKey

I suspect that wouldn’t make much of a difference here – the problem isn’t that the hashes collide directly, it’s that the number of buckets is on the lower end and therefore collisions are to be expected regardless of the exact hash function.

why do we even need a map

The goal of that was just to make the individual implementations more self-contained/avoid the need for global lists of all possible objects of that type.

src/env.h Outdated Show resolved Hide resolved
@joyeecheung
Copy link
Member Author

joyeecheung commented Feb 13, 2023

The goal of that was just to make the individual implementations more self-contained/avoid the need for global lists of all possible objects of that type.

Yeah although in the case of binding data, there is already a global list for the bindings, so another global list for the wrapper of the binding object probably isn't that much of a big deal anyway..eventually it might be nicer to just have C++ wrappers for all the bindings (and merge the lists), then we can move more things off the Environment object and make the global states more self-contained, like the TODO I added in #46579 about moving immediate info and timeout info to the binding, which also helps the per-realm/context built-in effort (#46558) I assume

@addaleax addaleax added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 14, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 16, 2023
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.
@joyeecheung
Copy link
Member Author

Rebased after #46556

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

joyeecheung added a commit that referenced this pull request Feb 22, 2023
Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.

PR-URL: #46620
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Chengzhong Wu <[email protected]>
@joyeecheung
Copy link
Member Author

Landed in f32aa19

@juanarbol juanarbol added the backport-requested-v18.x PRs awaiting manual backport to the v18.x-staging branch. label Mar 1, 2023
@juanarbol
Copy link
Member

Would you mind backporting this to v18.x

targos pushed a commit that referenced this pull request Mar 13, 2023
Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.

PR-URL: #46620
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Chengzhong Wu <[email protected]>
targos pushed a commit that referenced this pull request Nov 10, 2023
Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.

PR-URL: #46620
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Chengzhong Wu <[email protected]>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.

PR-URL: nodejs/node#46620
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Chengzhong Wu <[email protected]>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.

PR-URL: nodejs/node#46620
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Chengzhong Wu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-requested-v18.x PRs awaiting manual backport to the v18.x-staging branch. c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants