Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change to streaming out the heap snapshot data #1

Merged

Conversation

JianFangAtRai
Copy link
Owner

dummy PR, do not merge. thanks

NHDaly and others added 16 commits September 29, 2023 13:12
This should prevent the engine from OOMing while recording the snapshot!

Now we just need to sample the files, either online, before downloading,
or offline after downloading :)

If we're gonna do it offline, we'll want to gzip the files before
downloading them.
This way you can always recover from an OOM
node order. That's the whole reason this is tricky.

But i'm not sure now whether the SoAs approach is actually an
optimization.... It seems like we should probably prefer to inline the
Edges right into the vector, rather than having to do another random
lookup into the edges table?
@JianFangAtRai JianFangAtRai marked this pull request as draft January 11, 2024 02:27
@JianFangAtRai JianFangAtRai self-assigned this Jan 11, 2024
@JianFangAtRai JianFangAtRai marked this pull request as ready for review January 11, 2024 02:28
@JianFangAtRai JianFangAtRai changed the base branch from master to jfang/master-base-for-streaming January 11, 2024 02:31
@JianFangAtRai JianFangAtRai changed the title Dummy PR to merge the streaming functionality into master Change to streaming out the heap snapshot data Jan 11, 2024
@JianFangAtRai JianFangAtRai merged commit aa48cb0 into jfang/master-base-for-streaming Jan 11, 2024
4 checks passed
@JianFangAtRai JianFangAtRai deleted the jfang/heapsnapshot-streaming-merge branch January 11, 2024 02:35
JianFangAtRai added a commit that referenced this pull request Jan 11, 2024
* Streaming the heap snapshot!

This should prevent the engine from OOMing while recording the snapshot!

Now we just need to sample the files, either online, before downloading,
or offline after downloading :)

If we're gonna do it offline, we'll want to gzip the files before
downloading them.

* Allow custom filename; use original API

* Support legacy heap snapshot interface. Add reassembly function.

* Add tests

* Apply suggestions from code review

* Update src/gc-heap-snapshot.cpp

* Change to always save the parts in the same directory

This way you can always recover from an OOM

* Fix bug in reassembler: from_node and to_node were in the wrong order

* Fix correctness mistake: The edges have to be reordered according to the
node order. That's the whole reason this is tricky.

But i'm not sure now whether the SoAs approach is actually an
optimization.... It seems like we should probably prefer to inline the
Edges right into the vector, rather than having to do another random
lookup into the edges table?

* Debugging messed up edge array idxs

* Disable log message

* Write the .nodes and .edges as binary data

* Remove unnecessary logging

* fix merge issues

* attempt to add back the orphan node checking logic

---------

Co-authored-by: Nathan Daly <[email protected]>
Co-authored-by: Nathan Daly <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants