Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenJDK java/lang/runtime/ReferencedKeyTest java.lang.RuntimeException: missing key #17910

Closed
pshipton opened this issue Aug 3, 2023 · 24 comments

Comments

@pshipton
Copy link
Member

pshipton commented Aug 3, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/24/
jdk_lang_0, jdk_lang_j9_0
java/lang/runtime/ReferencedKeyTest.java

20:31:38  java.lang.RuntimeException: missing key
20:31:38  	at java.base/java.lang.runtime.ReferencedKeyTest.assertTrue(ReferencedKeyTest.java:52)
20:31:38  	at java.base/java.lang.runtime.ReferencedKeyTest.methods(ReferencedKeyTest.java:66)
20:31:38  	at java.base/java.lang.runtime.ReferencedKeyTest.mapTest(ReferencedKeyTest.java:62)
20:31:38  	at java.base/java.lang.runtime.ReferencedKeyTest.main(ReferencedKeyTest.java:46)
@pshipton
Copy link
Member Author

pshipton commented Aug 4, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/26/
jdk_lang_1
java/lang/runtime/ReferencedKeyTest.java

23:58:10  java.lang.RuntimeException: missing key
23:58:10  	at java.base/java.lang.runtime.ReferencedKeyTest.assertTrue(ReferencedKeyTest.java:52)
23:58:10  	at java.base/java.lang.runtime.ReferencedKeyTest.methods(ReferencedKeyTest.java:66)
23:58:10  	at java.base/java.lang.runtime.ReferencedKeyTest.mapTest(ReferencedKeyTest.java:62)
23:58:10  	at java.base/java.lang.runtime.ReferencedKeyTest.main(ReferencedKeyTest.java:44)

@pshipton
Copy link
Member Author

pshipton commented Aug 4, 2023

5 x 100x grinder (stops on failure) https://openj9-jenkins.osuosl.org/job/Grinder/2646/
Failed in every child. # iterations to failure: 21, 39, 83, 10, 20

again with -Xint https://openj9-jenkins.osuosl.org/job/Grinder/2650/
Failed in every child. #iterations to failure: 9, 21, 31, 21, 13

again on xmac with -Xint https://openj9-jenkins.osuosl.org/job/Grinder/2652/ - passed
other linux platforms with -Xint https://openj9-jenkins.osuosl.org/job/Grinder/2654/ - passed alinux, plinux, xlinux (ignoring infra failure)

@pshipton
Copy link
Member Author

pshipton commented Aug 9, 2023

@TobiAjila fyi

@pshipton
Copy link
Member Author

pshipton commented Aug 9, 2023

Try to get javacore, system core on failure.
-Xdump:java+system:events=throw,filter=java/lang/RuntimeException,request=exclusive+prepwalk+preempt
1 machine https://openj9-jenkins.osuosl.org/job/Grinder/2659/ - passed
3 machines https://openj9-jenkins.osuosl.org/job/Grinder/2660/ - reproduced the failure in https://openj9-jenkins.osuosl.org/job/Grinder_iteration_2/298/ iteration 89

Artifacts with javacore, core
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_2/298/openjdk_test_output.tar.gz

@tajila
Copy link
Contributor

tajila commented Aug 11, 2023

The RefenceKeyMap has twice as many entries as it should have.

> !j9object 0x000000002C6513C0
!J9Object 0x000000002C6513C0 {
	struct J9Class* clazz = !j9class 0x24E800 // java/lang/runtime/ReferencedKeyMap
	Object flags = 0x00000000;
	I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
	Z isSoft = 0x00000000 (offset = 12) (java/lang/runtime/ReferencedKeyMap)
	Ljava/util/Map; map = !fj9object 0x2c6513d8 (offset = 4) (java/lang/runtime/ReferencedKeyMap)
	Ljava/lang/ref/ReferenceQueue; stale = !fj9object 0x2c651400 (offset = 8) (java/lang/runtime/ReferencedKeyMap)
}
> !fj9object 0x2c6513d8
!J9Object 0x000000002C6513D8 {
	struct J9Class* clazz = !j9class 0xCF000 // java/util/HashMap
	Object flags = 0x00000000;
	I lockword = 0x00000000 (offset = 0) (java/lang/Object) <hidden>
	Ljava/util/Set; keySet = !fj9object 0x0 (offset = 4) (java/util/AbstractMap)
	Ljava/util/Collection; values = !fj9object 0x0 (offset = 8) (java/util/AbstractMap)
	[Ljava/util/HashMap$Node; table = !fj9object 0x2c650ac0 (offset = 12) (java/util/HashMap)
	Ljava/util/Set; entrySet = !fj9object 0x0 (offset = 16) (java/util/HashMap)
	I size = 0x00000034 (offset = 20) (java/util/HashMap)
	I modCount = 0x00000034 (offset = 24) (java/util/HashMap) //52 but should be 26
	I threshold = 0x00000060 (offset = 28) (java/util/HashMap)
	F loadFactor = 0x3F400000 (offset = 32) (java/util/HashMap)
}

The API relies on the ReferenceQueue; stale ref queue to clean up stale entries. The queue is empty but somehow there are still stale entries.

    public void removeStaleReferences() {
        while (true) {
            Object key = stale.poll();
            if (key == null) {
                break;
            }
            map.remove(key);
        }
    }

@tajila
Copy link
Contributor

tajila commented Aug 14, 2023

In a successful run 52 items must be in the refQueue (stale)

@tajila
Copy link
Contributor

tajila commented Aug 14, 2023

The test relies on

System.gc();
        sleep();

To force a GC and add objects to refqueues, which is not guaranteed on J9.

@tajila
Copy link
Contributor

tajila commented Aug 14, 2023

@dmitripivkine are objects added to associated reference queues on every GC?

@dmitripivkine
Copy link
Contributor

yes, but Local GC might address references from Nursery only.

@tajila
Copy link
Contributor

tajila commented Aug 15, 2023

Okay so we would need

System.gc();
System.gc();

to trigger a global GC?

@dmitripivkine
Copy link
Contributor

yes, I think so.

@pshipton
Copy link
Member Author

As this is apparently a test problem (limitation of OpenJ9?) rather than a functional problem, removing the blocker label.

tajila added a commit to tajila/openj9-openjdk-jdk21 that referenced this issue Aug 28, 2023
@tajila
Copy link
Contributor

tajila commented Aug 28, 2023

This seems to be a similar problem to #17723

I noticed

> !j9javavm 0x000003FF9801EB20 | grep processReference
	0x16e8: struct J9ThreadMonitor* processReferenceMonitor = !j9threadmonitor 0x000003FF98009CE8
	0x16f0: UDATA processReferenceActive = 0x0000000000000000 (0)

@tajila
Copy link
Contributor

tajila commented Aug 28, 2023

Probably explains why adding more GCs doesnt solve it

@tajila
Copy link
Contributor

tajila commented Aug 28, 2023

@dmitripivkine Actually the vm->processReferenceActive represents the number of references active being processed. It doesnt represent the references pending processing.

I think we need finalizeListManager->getReferenceCount() to understand this situation better

@tajila
Copy link
Contributor

tajila commented Aug 28, 2023

> !gc_finalizelistmanager 0x000003FF9807E410
GC_FinalizeListManager at 0x3ff9807e410 {
  Fields for MM_Base:
  Fields for MM_BaseVirtual:
	0x8: const U8* _typeId = !j9x 0x000003FF9DA422CC // "GC_FinalizeListManager"
  Fields for GC_FinalizeListManager:
	0x10: class MM_GCExtensions* _extensions = !mm_gcextensions 0x000003FF98036B80
	0x18: struct J9ThreadMonitor* _mutex = !j9threadmonitor 0x000003FF98055AC8
	0x20: struct J9Object* _systemFinalizableObjects = !j9object 0x0000000000000000
	0x28: UDATA _systemFinalizableObjectCount = 0x0000000000000000 (0)
	0x30: struct J9Object* _defaultFinalizableObjects = !j9object 0x0000000000000000
	0x38: UDATA _defaultFinalizableObjectCount = 0x0000000000000000 (0)
	0x40: struct J9Object* _referenceObjects = !j9object 0x000000002C651340 // jdk/internal/ref/CleanerImpl$PhantomCleanableRef
	0x48: UDATA _referenceObjectCount = 0x00000000000000A6 (166) //<--this is the real value
	0x50: struct J9ClassLoader* _classLoaders = !j9classloader 0x0000000000000000
	0x58: UDATA _classLoaderCount = 0x0000000000000000 (0)
}

@tajila
Copy link
Contributor

tajila commented Aug 28, 2023

After taking a brief look at static int J9THREAD_PROC FinalizeMainThread(void *javaVM) it seems like we only consume one job at a time. I think we need a way to force the finalizer thread to empty the queue (or a certain amount).

@dmitripivkine
Copy link
Contributor

After taking a brief look at static int J9THREAD_PROC FinalizeMainThread(void *javaVM) it seems like we only consume one job at a time. I think we need a way to force the finalizer thread to empty the queue (or a certain amount).

There is runFinalization() implementation expected to run Finalization non-stop until queue is empty:

runFinalization(J9VMThread *vmThread)

Is it kind of service you are looking?

tajila added a commit to tajila/openj9-openjdk-jdk21 that referenced this issue Aug 29, 2023
@tajila
Copy link
Contributor

tajila commented Aug 30, 2023

Is it kind of service you are looking?

Yes, it fixed the issue I described above. Ill see if it can be used here, #17723.

@tajila
Copy link
Contributor

tajila commented Sep 5, 2023

If we dont see this issue again in the next couple of days we can close it now that the System.gc() changes are in.

@pshipton
Copy link
Member Author

pshipton commented Sep 5, 2023

The System.gc() changes are #18044

@pshipton
Copy link
Member Author

pshipton commented Sep 5, 2023

Not seen lately, closing for now.

@pshipton pshipton closed this as completed Sep 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants