-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
worker: add eventLoopUtilization() #35664
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
return args.GetReturnValue().Set(-1); | ||
|
||
// TODO(trevnorris): This isn't technically thread safe. | ||
double loop_start_time = w->env_->performance_state()->milestones[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key question here then would be if the loop start milestone is guaranteed to be set before user code is permitted to call LoopStartTime. I suspect that it's not. Curious what @addaleax thinks here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked and it's not guaranteed. The reason this (probably) works is because it's only doing a read, and the value is only written to once. So if the state hasn't synchronized it'll return {active: 0, idle: 0, utilization: 0}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason this (probably) works is because it's only doing a read, and the value is only written to once. So if the state hasn't synchronized it'll return
{active: 0, idle: 0, utilization: 0}
.
It’s not necessarily that easy – we should not assume that 64-bit reads & writes are atomic on every platform, I think.
The easiest way to work around this would probably be to only allow calling this function if the 'online'
event has been received on the Worker instance, and if not, skipping this call into C++. We could then assert here that loop_start_time > 0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax 64-bit reads/writes are always atomic on 64-bit systems if the memory is aligned. So, since we support 32-bit Windows builds, I guess this isn't solved so easily. :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trevnorris That’s true for x64, but we’re also supporting arm and ppc :)
Looks good but let's work through the possible threadsafety issue on the performance mark before sign off |
I'm agree with @jasnell, sorry about the rush, haha |
a61c8d4
to
10b0775
Compare
return args.GetReturnValue().Set(-1); | ||
|
||
// TODO(trevnorris): This isn't technically thread safe. | ||
double loop_start_time = w->env_->performance_state()->milestones[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason this (probably) works is because it's only doing a read, and the value is only written to once. So if the state hasn't synchronized it'll return
{active: 0, idle: 0, utilization: 0}
.
It’s not necessarily that easy – we should not assume that 64-bit reads & writes are atomic on every platform, I think.
The easiest way to work around this would probably be to only allow calling this function if the 'online'
event has been received on the Worker instance, and if not, skipping this call into C++. We could then assert here that loop_start_time > 0
.
618ced9
to
dd9d0df
Compare
@addaleax Thanks for the review. I've made fixes for everything, including only allowing to call edit: adding the is online check doesn't allow the benchmark to run. working on this. |
dd9d0df
to
6842e52
Compare
@addaleax Fix the benchmark issue. Which made me realize that because reading the ELU of a worker can't be done until after the This isn't optimal, but I can't think of a better solution for now. In the future I wanted to make reading/writing to all the milestones atomic so we could add |
I think the current approach of returning an object with
Keep in mind that even if reading a single value might be atomic, reading the entire group of timing values would not be. |
@nodejs/workers |
doc/api/worker_threads.md
Outdated
values for the worker instance. | ||
|
||
Currently the event loop utilization of a worker instance can only be retrieved | ||
after the [`'online'` event][] event has fired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add note about what happens if you call it before online
has fired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after the [`'online'` event][] event has fired. | |
after the [`'online'` event][] is emitted. |
@@ -223,6 +227,12 @@ class Worker extends EventEmitter { | |||
null, | |||
hasStdin: !!options.stdin | |||
}, transferList); | |||
// Use this to cache the Worker's loopStart value once available. | |||
this[kLoopStartTime] = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put the -1 in a const?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benjamingr Sorry, I don't follow what you mean.
@@ -415,6 +426,50 @@ function makeResourceLimits(float64arr) { | |||
}; | |||
} | |||
|
|||
function eventLoopUtilization(util1, util2) { | |||
// TODO(trevnorris): Works to solve the thread-safe read/write issue of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense for the API instead to be emitted on online
so:
- The worker emits
online
with an "stats" argument - The stats argument has this
eventLoopUtilization
method the user can call
That would make it impossible to use this API incorrectly and remove a bit of the weirdness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like it makes accessing the API more difficult, with little benefit…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax
Well, the benefit is that it would be impossible to misuse the API. Currently this API has a (relatively small) footgun if you call eventLoopUtilization
before online
is fired. That eliminates the footgun.
Note that the API isn't really that bad:
const worker = new Worker(...);
const controller = await once(worker, 'online');
console.log(controller.eventLoopUtilization()); // same thing as currently on worker now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benjamingr I think my disagreement is more about whether this is a footgun or not :) If you call it before the 'online'
event, you don’t get inaccurate data, you get no ELU data – which is accurate, because (up to race conditions) there actually is no ELU data available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think the distinction between "no data" and "0 data" here isn't important - I will concede.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benjamingr I plan on figuring out a way to remove the limitation of not being able to query the ELU until the 'online'
event has been emitted.
// has had a chance to turn. So it will be impossible to read the ELU of | ||
// a worker thread immediately after it's been created. | ||
if (!this[kIsOnline]) { | ||
return { idle: 0, active: 0, utilization: 0 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we keep this API, I really prefer it if we didn't return 0
s to indicate "not yet online" here and instead throw an error. I can see users running into footguns here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This currently agrees with the behavior on the main thread. I think throwing an error would be the (way) more unexpected result.
lib/internal/worker.js
Outdated
} | ||
|
||
// Cache loopStart, since it's only written to once. | ||
if (this[kLoopStartTime] <= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
=== -1 (rather than <=0) ? (also, const?)
Also, I did not actually check out the branch and play with it so I didn't LGTM as this already has enough approvals - but if that would help/is needed here I am happy to do that. To be clear (eventhough thankfully our process doesn't require this anymore): none of my comments are blocking and I am +1 on these changes in general. |
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: nodejs#35664 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]> Backport-PR-URL: nodejs#35664
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: nodejs#35664 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37163
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: nodejs#35664 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37165
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: #35664 Backport-PR-URL: #37163 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]>
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: #35891 Fixes: #35844 Refs: #35886 Refs: #35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: #37163
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: nodejs#35664 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37165
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: nodejs#35664 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37165
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: nodejs#35664 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]> Backport-PR-URL: nodejs#37165
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: nodejs#35891 Fixes: nodejs#35844 Refs: nodejs#35886 Refs: nodejs#35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: nodejs#37165
Allow calling eventLoopUtilization() directly on a worker thread: const worker = new Worker('./foo.js'); const elu = worker.performance.eventLoopUtilization(); setTimeout(() => { worker.performance.eventLoopUtilization(elu); }, 10); Add a new performance object on the Worker instance that will hopefully one day hold all the other performance metrics, such as nodeTiming. Include benchmarks and tests. PR-URL: #35664 Backport-PR-URL: #37163 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gerhard Stöbich <[email protected]> Reviewed-By: James M Snell <[email protected]>
The active worker check compared the time from sending message till response arrived from worker with the complete time the worker was running till it responses to the spin request. If sending back the message is slow for some reason the test fails. Adapt the test to compare the time seen inside the worker with the time read from main thread. PR-URL: #35891 Fixes: #35844 Refs: #35886 Refs: #35664 Reviewed-By: Gireesh Punathil <[email protected]> Reviewed-By: Rich Trott <[email protected]> Backport-PR-URL: #37163
NOTE: While this is working as it should, I feel the implementation may be a bit clunky. Specifically the one TODO about thread-safe access to a performance mark. Looking for feedback on how this could be improved.
worker: add eventLoopUtilization()
Allow calling eventLoopUtilization() directly on a worker thread:
Add a new performance object on the Worker instance that will hopefully
one day hold all the other performance metrics, such as nodeTiming.
Include benchmarks and tests.
Also include a commit that improves tests for
perf_hooks.performance.eventLoopUtilization()
.Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes