expose gc event as inspector's runtime event #220

vmarchaud · 2018-08-09T13:09:22Z

Hi everyone,

I would like to discuss the best way to expose metrics about the GC life cycle to JS land through the inspector module.
As many of you know, we can pass --trace-gc and --trace-gc-verbose to V8, which then prints some data on each GC pass. I believe it would be really useful for users (and APM vendors) to be able to access those metrics to better understand and diagnose their applications.

Anyway, I (and @keymetrics) am willing to put some time into this problem, but I'm not sure what the right way to do it is. I suppose modifying V8 in deps/v8/ isn't a really good idea, so I would do it by "hijacking" the protocol session to add our own method, then implement it in C++ with the V8 C++ API.

What do you guys think of this? Maybe someone has already started working on something similar?


gireeshpunathil · 2018-08-09T13:45:15Z

direct invocation of v8 APIs looks to be a good idea.
but how do we expose this to APMs? through JS callbacks? what would be the relevance of inspector?
slightly related: how node-report extracts heap statistics (only snapshot)

vmarchaud · 2018-08-09T13:56:52Z

I believe we could expose this as an event of the Runtime domain, so we could do something like this:

const inspector = require('inspector')
const session = new inspector.Session()
session.connect()

session.post('Runtime.enable')
// NOTE: 'Runtime.garbageCollectionStats' is the event proposed here; it does not exist yet
session.on('Runtime.garbageCollectionStats', (stats) => {
  // we get the GC duration, type etc
})

Heap statistics are already exposed through the v8 module, so I believe we only need the GC-related metrics here.
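For reference, a minimal sketch of the heap statistics the v8 module already provides. It uses only `v8.getHeapStatistics()` and `v8.getHeapSpaceStatistics()`; the `toMiB` helper is just for illustration. These calls return point-in-time snapshots of heap sizes, not per-GC durations or types, which is the gap discussed above.

```javascript
const v8 = require('v8');

// Illustrative helper: convert a byte count to a MiB string.
const toMiB = (bytes) => (bytes / 1024 / 1024).toFixed(1);

// Snapshot of the overall heap at the time of the call.
const heap = v8.getHeapStatistics();
console.log(`used:  ${toMiB(heap.used_heap_size)} MiB`);
console.log(`total: ${toMiB(heap.total_heap_size)} MiB`);
console.log(`limit: ${toMiB(heap.heap_size_limit)} MiB`);

// Per-space breakdown (new space, old space, ...).
const spaces = v8.getHeapSpaceStatistics();
for (const space of spaces) {
  console.log(`${space.space_name}: ${toMiB(space.space_used_size)} MiB used`);
}
```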

ofrobots · 2018-08-09T16:53:00Z

@vmarchaud Have you looked at the trace_events API? V8 exposes high-level GC events by default, and there are some 'disabled-by-default' categories for lower-level events that expose GC behaviour in excruciating detail. Please give it a shot, and leave feedback on how well (or not) it would work for your use cases.

trace_events is still rough around the edges, but is a better API for this use-case IMO. If you're willing to contribute to help shape it up, that would be awesome too!
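A minimal sketch of turning tracing on at runtime with the `trace_events` module (available from Node.js 10). It assumes the `v8` category is the one that carries the high-level GC events mentioned above; by default the captured events are written to a `node_trace.*.log` file in the working directory.

```javascript
const traceEvents = require('trace_events');

// Tracing object for the 'v8' category (assumed here to include GC events).
const tracing = traceEvents.createTracing({ categories: ['v8'] });

// Start capturing; output goes to node_trace.*.log by default.
tracing.enable();

// ... run the workload to be observed ...

// Inspect the state while tracing is on.
const wasEnabled = tracing.enabled;
const enabledNow = traceEvents.getEnabledCategories();
console.log(`tracing enabled: ${wasEnabled}, categories: ${enabledNow}`);

// Stop capturing for this Tracing object.
tracing.disable();
```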

ofrobots · 2018-08-09T16:55:03Z

FYI, the trace event data is exposed through a file, or through the inspector protocol (albeit we are still working on bugs in the latter). Here's some sample code that you can use to connect to a process and get tracing data out: https://gist.github.com/ofrobots/ab4c4e68b0de4852197308cd25f3cc0e
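This is not the code from the gist, but a rough sketch of the same idea for an in-process session. It assumes the `NodeTracing` inspector domain (`NodeTracing.start`, `NodeTracing.stop`, and the `NodeTracing.dataCollected` / `NodeTracing.tracingComplete` notifications) is available in the running Node.js version. The function name `collectV8TraceEvents` is hypothetical.

```javascript
const inspector = require('inspector');

// Collect trace events from the 'v8' category for `durationMs` milliseconds
// and resolve with the raw event objects.
function collectV8TraceEvents(durationMs) {
  return new Promise((resolve, reject) => {
    const session = new inspector.Session();
    session.connect();
    const events = [];

    // Each notification carries a batch of events in params.value.
    session.on('NodeTracing.dataCollected', ({ params }) => {
      for (const event of params.value) events.push(event);
    });

    // Emitted once tracing has stopped and all data has been delivered.
    session.on('NodeTracing.tracingComplete', () => {
      session.disconnect();
      resolve(events);
    });

    const traceConfig = { includedCategories: ['v8'] };
    session.post('NodeTracing.start', { traceConfig }, (err) => {
      if (err) {
        session.disconnect();
        reject(err);
        return;
      }
      // Stop after the requested duration; buffered data is then flushed.
      setTimeout(() => session.post('NodeTracing.stop'), durationMs);
    });
  });
}
```

Usage would look like `collectV8TraceEvents(1000).then((events) => console.log(events.length))`. Note that in this sketch data is only guaranteed to be complete after `NodeTracing.stop`.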

vmarchaud · 2018-08-09T19:48:50Z

@ofrobots I knew about the trace_events API, but I didn't know it allowed getting those metrics. I will try it tomorrow to see what I can get with it.
I followed the different PRs about making the API toggleable at runtime; do you know if there are plans to backport that to 8.x or 10.x?

ofrobots · 2018-08-09T20:00:10Z

For back-porting, it depends. My recommendation would be to request a back-port on the specific PR for the feature/fix you would like back-ported.

vmarchaud · 2018-08-26T09:02:38Z

After thinking about it, I believe it would be easier for anyone wanting to retrieve those metrics (which are critical for every production deployment) not to have to parse the trace_events stream.

Especially since, from my understanding, you can't just get the stream of events continuously: you need to stop it, parse the data to get the metrics, and then restart it to get new events.

If this is right, I believe my approach will be easier for people who just want to monitor the GC without having to understand how the tracing works.

What do you think?

AndreasMadsen · 2018-08-26T10:22:56Z

> If this is right, I believe my approach will be easier for people who just want to monitor the GC without having to understand how the tracing works.

The job of Node.js itself is not to make everything easy but to make everything feasible. I would prefer not to implement two ways of doing the same thing for the sake of making things a bit easier. To take a page from "The Zen of Python":

> There should be one-- and preferably only one --obvious way to do it.
> Although that way may not be obvious at first unless you're Dutch.

I think a much more productive approach is to make trace_events more consumable. So if you have specific issues here, then we can try and solve those.

> Especially since, from my understanding, you can't just get the stream of events continuously: you need to stop it, parse the data to get the metrics, and then restart it to get new events.

I'm not that familiar with the Inspector protocol for this, but the file protocol is definitely streaming. You can use https://github.com/nearform/node-trace-events-parser to consume it. If the inspector protocol requires you to stop as you say, then I would definitely like to see that changed, maybe by simply implementing a .flush() method. I suspect @ofrobots knows more about this.
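To illustrate what "parsing the data to get the metrics" could amount to, here is a small sketch that summarises GC activity from an array of already-parsed trace events. It is independent of the parser linked above. It assumes events follow the Chrome Trace Event Format (`name`, `ph`, `dur` in microseconds) and that GC events can be recognised by a name starting with `V8.GC`; both the prefix and the sample data are assumptions for illustration, not captured output.

```javascript
// Summarise GC activity from trace events.
// ASSUMPTION: GC events have names beginning with `prefix` ('V8.GC' by default).
function summarizeGcEvents(events, prefix = 'V8.GC') {
  const summary = {};
  for (const event of events) {
    // 'X' marks a complete event, which carries its duration in `dur` (microseconds).
    if (event.ph !== 'X' || typeof event.name !== 'string') continue;
    if (!event.name.startsWith(prefix)) continue;
    if (!summary[event.name]) summary[event.name] = { count: 0, totalMs: 0 };
    summary[event.name].count += 1;
    summary[event.name].totalMs += event.dur / 1000;
  }
  return summary;
}

// Synthetic sample events, made up for this example.
const sample = [
  { name: 'V8.GCScavenger', cat: 'v8', ph: 'X', ts: 1000, dur: 1500 },
  { name: 'V8.GCScavenger', cat: 'v8', ph: 'X', ts: 9000, dur: 500 },
  { name: 'V8.Execute', cat: 'v8', ph: 'X', ts: 2000, dur: 300 },
];

console.log(summarizeGcEvents(sample));
```

With the synthetic sample, the two scavenge events are grouped under one name and the non-GC event is ignored.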

vmarchaud · 2018-08-26T20:55:10Z

> The job of Node.js itself is not to make everything easy but to make everything feasible. I would prefer not to implement two ways of doing the same thing for the sake of making things a bit easier. To take a page from "The Zen of Python".

Totally understandable. I would argue that we sometimes need to provide a nicer API on top of the inspector protocol; for example, there is an ongoing PR to simplify coverage reports.

However, I believe that if at some point it becomes possible to record GC metrics continuously with the trace_events API, then the job of parsing the stream etc. would be better handled by a third-party module.

PS: On another note, I would also like to ask whether the current tracing categories and their output are documented (especially on the V8 side)?

hashseed · 2018-08-31T07:04:31Z

I too am of the opinion that trace events are more suitable for this purpose. Fleshing out tracing output is still an on-going project, though.

vmarchaud · 2018-09-05T19:07:57Z

@hashseed Is there any ongoing development somewhere on flushing the tracing data?

vmarchaud · 2019-06-18T08:39:12Z

Closing since we can receive GC events from trace events, which can now be enabled dynamically at runtime.

vmarchaud mentioned this issue Aug 16, 2018

"NodeTracing" domain introduction nodejs/node#20608

Merged

3 tasks

vmarchaud mentioned this issue Sep 30, 2018

trace_events: when enabling via inspector after startup, process abort nodejs/node#23185

Closed

vmarchaud closed this as completed Jun 18, 2019
