Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear Inactive Objects at Checkpoint #7176

Merged
merged 1 commit into from
Nov 11, 2023

Conversation

kangyining
Copy link
Contributor

When a checkpoint is taken, the entire contents of the process is serialized, and someone can inspect the serialized state to capture some sensitive info.

Solution:
If a data is no longer referenced, the entire memory chunk is zeroed at snapshot time.

Implementation detail:
After the initial snapshot GC, 2 passes are done to fix-up, identify, and clear all unused free memory chunks (including both linked and unlinked multi-slot free entries). The first pass performs a fix-up to the current heap which converts dark matter to holes, while the second pass clears all free entries.

Note here this implementation only works for standard GC but not balanced GC and metronome.

Signed off by: [email protected]

@amicic
Copy link
Contributor

amicic commented Nov 9, 2023

Comparison with original commit

https://github.com/kangyining/omr/compare/a3f8c05f776722b0cba29ac100336a03a6698a8e..dc3a0e094352d107d9d21025978b8b3647bda724

shows the changes in ParallelGlobalGC.cpp (and some other non GC changes due different baseline), where locals MM_GCCode *gcCode and MM_GlobalGCStats *stats are now correctly used as pointers.

@kangyining kangyining changed the title Clear Inacive Objects at Checkpoint Clear Inactive Objects at Checkpoint Nov 9, 2023
/* If the delegate has isAllowUserHeapWalk set, or should prepare for a snapshot,
* fix the heap so that it can be walked
*/
MM_GCCode *gcCode = &env->_cycleState->_gcCode;
Copy link
Contributor

@amicic amicic Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can actually use a snapshot of gcCode (rather than a pointer to it) - we don't really change gcCode in a middle of a GC - it's just set by whoever triggered GC.
So you can declare it as:

const MM_GCCode gcCode

note that there is another user of it at around of line 571

/* If the delegate has isAllowUserHeapWalk set, or should prepare for a snapshot,
* fix the heap so that it can be walked
*/
const MM_GCCode gcCode = env->_cycleState->_gcCode;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more that you can use it at line 514 (hence move the declaration of gcCode at the start of the method)

@@ -478,7 +511,7 @@ MM_ParallelGlobalGC::mainThreadGarbageCollect(MM_EnvironmentBase *env, MM_Alloca
_collectionStatistics._tenureFragmentation |= MACRO_FRAGMENTATION;
}

mainThreadCompact(env, allocDescription, rebuildMarkBits);
mainThreadCompact(env, allocDescription, rebuildMarkBits || env->_cycleState->_gcCode.shouldClearHeap());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also add a comment here
while normally mark map is not required for walking the heap after a compact, the clearing callback asserts that everything that is not cleared is a marked object, so we need to rebuild a mark map for clearing

When a checkpoint is taken, the entire contents of the process is
serialized, and someone can inspect the serialized state to capture
some sensitive info.

Solution:
If a data is no longer referenced, the entire memory chunk
is zeroed at snapshot time.

Implementation detail:
After the initial snapshot GC, 2 passes are done to fix-up, identify,
and clear all unused free memory chunks (including both linked and
unlinked multi-slot free entries). The first pass performs a fix-up
to the current heap which converts dark matter to holes, while the
second pass clears all free entries.

Note here this implementation only works for standard GC but not
balanced GC and metronome.

Signed off by: [email protected]
@amicic
Copy link
Contributor

amicic commented Nov 10, 2023

@babsingh please proceed with final review

@amicic
Copy link
Contributor

amicic commented Nov 10, 2023

It's a resubmit of #7159, with a functional change how local MM_GlobalGCStats *stats was used, plus a couple of minor non-functional changes.

@babsingh
Copy link
Contributor

jenkins build all

@babsingh
Copy link
Contributor

babsingh commented Nov 11, 2023

Unrelated and known failures (#6905 and #6516) are seen in the PR builds.

@babsingh babsingh merged commit 576733c into eclipse:master Nov 11, 2023
15 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants