
Use stored worker names #181

Merged: 4 commits merged into inaka:main from use_stored_worker_names on Nov 30, 2021

Conversation

@NelsonVides (Collaborator) commented Nov 25, 2021

TL;DR: Using stored names cuts garbage collection roughly threefold and makes finding a worker roughly four times faster.

This is achieved by storing the worker names in an ets table, keyed by {SupervisorName, Index} with the final name atom as the value, instead of recalculating that atom on every call. I suspect this may also have better concurrency characteristics: generating the name creates atoms, and therefore grabs locks on the atom table, which is not optimised for writes, even when the write is idempotent because the atom already exists. That part is mostly guessing, though; the sequential performance is surely much better 🙂
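
A minimal sketch of the idea (not the exact PR diff; the ?WPOOL_WORKERS table name borrows the macro mentioned later in this thread, and the name-building scheme shown here is hypothetical):

```erlang
-define(WPOOL_WORKERS, wpool_pool_worker_names).

%% Old approach: rebuild the atom on every call.
build_worker_name(Sup, I) ->
    list_to_atom(atom_to_list(Sup) ++ [$- | integer_to_list(I)]).

%% New approach: compute each name once when the pool starts...
init_worker_names(Sup, Size) ->
    ets:new(?WPOOL_WORKERS,
            [named_table, public, set, {read_concurrency, true}]),
    [ets:insert(?WPOOL_WORKERS, {{Sup, I}, build_worker_name(Sup, I)})
     || I <- lists:seq(1, Size)],
    ok.

%% ...and fetch it afterwards with a read-optimised ets lookup.
worker_name(Sup, I) ->
    [{_, Name}] = ets:lookup(?WPOOL_WORKERS, {Sup, I}),
    Name.
```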

I'm not sure of the preferred conventions in this project, so I'm open to changes; mostly I'm proposing the idea more than the exact code 😇


Operating System: Ubuntu 20.04.3 in server mode
CPU Information: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
Number of Available Cores: 12
Available memory: 30.99 GB
Erlang 24.1.7

Running the command fprof:apply(fun wpool_bench:run_tasks/3, [[{small, 100}], best_worker, []]). on two separate, freshly started VMs, one on main and one on this branch, yields the following results:
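
For context, the full workflow behind these dumps is assumed to be the standard fprof sequence (only the apply call is quoted above). In each entry below, the middle line is the function under analysis, the lists above and below it are its callers and callees, and the three numeric columns are call count, accumulated time, and own time in milliseconds.

```erlang
%% Trace the benchmark run, then turn the raw trace into the
%% human-readable analysis shown below.
fprof:apply(fun wpool_bench:run_tasks/3, [[{small, 100}], best_worker, []]),
fprof:profile(),
fprof:analyse([{dest, "wpool.analysis"}]).
```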

Using worker names from upstream:

{[{{wpool_pool,min_message_queue,1},            100,  276.050,    0.576},      
  {{wpool_pool,min_message_queue,3},           10000,    0.000,   54.851}],     
 { {wpool_pool,min_message_queue,3},           10100,  276.050,   55.427},     %
 [{{wpool_pool,worker_name,2},                 10000,  127.355,   67.331},      
  {{wpool_pool,next_wpool,1},                  10000,   35.003,   24.012},      
  {{wpool_pool,queue_length,1},                10000,   34.806,   22.586},      
  {{lists,min,1},                               100,   12.141,    0.138},      
  {{erlang,whereis,1},                         10000,   11.138,   10.929},      
  {garbage_collect,                              53,    0.180,    0.180},      
  {{wpool_pool,min_message_queue,3},           10000,    0.000,   54.851}]}.

{[{{erlang,'++',2},                             261,    0.881,    0.881},      
  {{erlang,list_to_atom,1},                     132,    0.382,    0.382},      
  {{erlang,integer_to_list,1},                  107,    0.366,    0.366},      
  {{erlang,atom_to_list,1},                     140,    0.350,    0.350},      
  {{erlang,process_info,2},                     103,    0.287,    0.287},      
  {{erlang,whereis,1},                           75,    0.232,    0.232},      
  {{wpool_pool,worker_name,2},                   72,    0.181,    0.181},      
  {{wpool_pool,min_message_queue,3},             53,    0.180,    0.180},      
  {{gen_server,call,3},                          18,    0.092,    0.092},      
  {{erlang,setelement,3},                        35,    0.091,    0.091},      
  {{io_lib_format_ryu_table,pow5_bitcount,0},     1,    0.030,    0.030},      
  {{io_lib_format,general_case_10,5},             3,    0.026,    0.026},      
  {{ets,lookup,2},                                4,    0.017,    0.017},      
  {{wpool_pool,next_wpool,1},                     5,    0.009,    0.009},      
  {{gen,call,4},                                  3,    0.008,    0.008},      
  {{rand,exsss_uniform,2},                        2,    0.007,    0.007},      
  {{wpool_bench,run_task,3},                      2,    0.006,    0.006},      
  {{erlang,demonitor,2},                          1,    0.006,    0.006},      
  {{lists,keyfind,3},                             1,    0.005,    0.005},      
  {{io_lib_format,compute_shortest,6},            1,    0.005,    0.005},      
  {{io_lib_format,collect_cseq,2},                1,    0.005,    0.005},      
  {{rand,uniform,1},                              1,    0.003,    0.003},      
  {{lists,reverse,2},                             2,    0.002,    0.002},      
  {{io_lib_pretty,pp_tail,10},                    2,    0.002,    0.002},      
  {{io_lib_format,mulShiftAll,4},                 2,    0.002,    0.002},      
  {{wpool_sup,stop_pool,1},                       1,    0.001,    0.001},      
  {{lists,do_flatten,2},                          1,    0.001,    0.001},      
  {{io_lib_format_ryu_table,value,1},             1,    0.001,    0.001},      
  {{io_lib_format,handle_normal_output_mod,3},    1,    0.001,    0.001},      
  {{io_lib_format,bounds,6},                      1,    0.001,    0.001},      
  {{gen,do_call,4},                               1,    0.001,    0.001}],     
 { garbage_collect,                            1033,    3.181,    3.181},     %
 [ ]}.

Using stored worker names:

{[{{wpool_pool,min_message_queue,1},            100,  201.402,    0.675},      
  {{wpool_pool,min_message_queue,3},           10000,    0.000,   62.036}],     
 { {wpool_pool,min_message_queue,3},           10100,  201.402,   62.711},     %
 [{{wpool_pool,worker_name,2},                 10000,   38.456,   25.041},      
  {{wpool_pool,queue_length,1},                10000,   37.420,   24.245},      
  {{wpool_pool,next_wpool,1},                  10000,   36.806,   24.847},      
  {{lists,min,1},                               100,   13.245,    0.152},      
  {{erlang,whereis,1},                         10000,   12.573,   12.311},      
  {garbage_collect,                              33,    0.191,    0.191},      
  {{wpool_pool,min_message_queue,3},           10000,    0.000,   62.036}]}.

{[{{erlang,whereis,1},                           67,    0.269,    0.269},      
  {{erlang,process_info,2},                      46,    0.248,    0.248},      
  {{erlang,setelement,3},                        48,    0.231,    0.231},      
  {{wpool_pool,min_message_queue,3},             33,    0.191,    0.191},      
  {{ets,lookup,2},                               49,    0.186,    0.186},      
  {{gen_server,call,3},                          14,    0.073,    0.073},      
  {{wpool_pool,worker_name,2},                   21,    0.071,    0.071},      
  {{wpool_pool,next_wpool,1},                     9,    0.036,    0.036},      
  {{erlang,integer_to_list,1},                    4,    0.021,    0.021},      
  {{io_lib_format,general_case_10,5},             4,    0.018,    0.018},      
  {{io_lib_pretty,pp_tail,10},                    2,    0.010,    0.010},      
  {{io_lib_format_ryu_table,value,1},             1,    0.009,    0.009},      
  {{erlang,'++',2},                               2,    0.007,    0.007},      
  {{string,uppercase_list,2},                     1,    0.006,    0.006},      
  {{lists,reverse,2},                             1,    0.006,    0.006},      
  {{io_lib_format,compute_shortest,6},            1,    0.006,    0.006},      
  {{io_lib_format,convert_to_decimal,3},          1,    0.004,    0.004},      
  {{io_lib_format,collect_cc,2},                  1,    0.004,    0.004},      
  {{gen,do_call,4},                               1,    0.004,    0.004},      
  {{io_lib_format,mulShiftAll,4},                 1,    0.003,    0.003},      
  {{io_lib_format,mulShift64,3},                  1,    0.003,    0.003},      
  {{gen,call,4},                                  1,    0.003,    0.003},      
  {{lists,do_flatten,2},                          1,    0.001,    0.001},      
  {{io_lib_format,count_small,2},                 1,    0.001,    0.001}],     
 { garbage_collect,                             311,    1.411,    1.411},     %
 [ ]}.

Also, running wpool_bench:run_tasks([{small, 10000}], best_worker, []). on a VM spawned with ERL_FLAGS="+JPperf true +S 1" rebar3 shell (+JPperf enables perf support in the JIT) and then running

> perf record -k mono --call-graph lbr -o /tmp/perf.data -p $(pgrep beam) -- sleep 10
> perf inject --jit -i /tmp/perf.data -o /tmp/perf.data.jit
> perf report -i /tmp/perf.data.jit

shows 51% of the time spent in wpool_pool:worker_name/2 on main, versus 19% for the same function on this branch.

main:
[perf report screenshot]

stored:
[perf report screenshot]

The perf reports for both runs, with flame graphs, are attached: wpool_perf_flamegraphs.zip

codecov bot commented Nov 25, 2021

Codecov Report

Merging #181 (bb5c59d) into main (e8f74d3) will increase coverage by 0.09%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main     #181      +/-   ##
==========================================
+ Coverage   92.98%   93.08%   +0.09%     
==========================================
  Files          10       10              
  Lines         442      448       +6     
==========================================
+ Hits          411      417       +6     
  Misses         31       31              
| Impacted Files | Coverage Δ |
| --- | --- |
| src/wpool_pool.erl | 96.73% <100.00%> (+0.13%) ⬆️ |


@elbrujohalcon (Member):

This looks really nice.
Two questions:

  1. Can you identify the code that is not covered by tests now? Maybe it's worth removing it.
  2. Would it make sense to use `persistent_term` instead of `ets`? (We didn't have that functionality back when I created this library)

@NelsonVides (Collaborator, Author):

> This looks really nice. Two questions:
>
> 1. Can you identify the code that is not covered by tests now? Maybe it's worth removing it.
> 2. Would it make sense to use `persistent_term` instead of `ets`? (We didn't have that functionality back when I created this library)

1. It's line 216 in https://github.com/NelsonVides/worker_pool/blob/3998f5c2e50713346e058200f9590accec77f577/src/wpool_pool.erl#L213-L219, which means that workers are now always found. Interestingly, this line was hit only once on the main branch, and zero times now. Not sure what to do about that one 🤔

2. persistent_term, that's an interesting question. I'd say that using persistent_term correctly is tricky. It would make adding and removing workers much harder to implement, because modifying the persistent_term table at runtime could wreak havoc (though dynamic workers aren't supported yet anyway). It would also mean that this library couldn't support older Erlang versions (we currently only test from 23 onwards, but does it already not work with 22, or is that just a testing simplification?). I have some ideas for the persistent_term trick, maybe making it configurable to use either persistent_term or ets tables, but perhaps that can all be a different PR with some more discussion on persistent_term how-tos?

@elbrujohalcon (Member):

Let's see…

  1. There was probably a test for that… but I don't know which one off the top of my head. We'll need to check a bit further.
  2. Not supporting old versions is how we move forward with this library, so that's not an issue. I think we currently only support 23+; that's why there is a special branch for OTP21, and we don't do any back-compat tricks here. Making it configurable would be great, yes!

When we use persistent_term at NextRoll, what we do in general is store a map as the term. In this case it would be one map per sup, with stuff like #{1 => …, 2 => …}, I guess.
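
A rough sketch of that map-per-supervisor variant, purely illustrative (the {wpool_workers, Sup} key shape and the worker_name_atom/2 helper are hypothetical, not the library's API):

```erlang
%% One write per pool at start-up. Replacing or erasing a persistent
%% term triggers a global GC scan, so writes must stay this rare.
store_worker_names(Sup, Size) ->
    %% worker_name_atom/2 stands in for whatever builds the final atom.
    Names = maps:from_list(
              [{I, worker_name_atom(Sup, I)} || I <- lists:seq(1, Size)]),
    persistent_term:put({wpool_workers, Sup}, Names).

%% Reads are constant-time and copy-free, with no table locks involved.
worker_name(Sup, I) ->
    maps:get(I, persistent_term:get({wpool_workers, Sup})).
```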

@NelsonVides (Collaborator, Author):

Wow, there's a case of a run passing all tests on OTP24 but not on OTP23 😳 (not to mention that it all passes locally for me).

@NelsonVides (Collaborator, Author):

@elbrujohalcon this seems to be fixed, can you take over from here? 😉

@elbrujohalcon (Member) left a comment:

I'll merge this one and add a ticket to try using persistent_term for ?WPOOL_WORKERS.

@elbrujohalcon elbrujohalcon merged commit dcae5ca into inaka:main Nov 30, 2021
@NelsonVides NelsonVides deleted the use_stored_worker_names branch November 30, 2021 10:29
@NelsonVides (Collaborator, Author):

Excellent! I'll find some spare time to do some prototyping on the persistent_term topic; let me know if you come up with anything in the meantime 💪🏽
