Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test flakes due to reusuable metric GC requirement #6211

Closed
jack-berg opened this issue Feb 8, 2024 · 3 comments · Fixed by #6221
Closed

Test flakes due to reusuable metric GC requirement #6211

jack-berg opened this issue Feb 8, 2024 · 3 comments · Fixed by #6221
Labels
Bug Something isn't working

Comments

@jack-berg
Copy link
Member

@asafm we've had a couple of test flakes around the the reusuable metric GC test. WDYT think about relaxing the GC reduction requirement?

InstrumentGarbageCollectionBenchmarkTest > normalizedAllocationRateTest() FAILED
java.lang.AssertionError: [Aggregation temporality = DELTA, testInstrumentType = DOUBLE_LAST_VALUE]
Expecting actual:
94.63418122969631
to be close to:
97.30000305175781
by less than 2.0 but difference was 2.6658218220615.
(a difference of exactly 2.0 being considered valid)
at io.opentelemetry.sdk.metrics.internal.state.InstrumentGarbageCollectionBenchmarkTest.lambda$normalizedAllocationRateTest$2(InstrumentGarbageCollectionBenchmarkTest.java:123)
at java.base/java.util.HashMap.forEach(HashMap.java:1337)
at io.opentelemetry.sdk.metrics.internal.state.InstrumentGarbageCollectionBenchmarkTest.lambda$normalizedAllocationRateTest$3(InstrumentGarbageCollectionBenchmarkTest.java:101)
at java.base/java.util.HashMap.forEach(HashMap.java:1337)
at io.opentelemetry.sdk.metrics.internal.state.InstrumentGarbageCollectionBenchmarkTest.normalizedAllocationRateTest(InstrumentGarbageCollectionBenchmarkTest.java:93)

@jack-berg jack-berg added the Bug Something isn't working label Feb 8, 2024
@jack-berg
Copy link
Member Author

I've had more trouble with these test flakes today. The last value aggregation seems to be particularly prone to non-deterministic behavior. Will need to look into the failures closer to determine if there's anything on the metrics SDK side we can to do to make the GC allocation more reliable, or whether we should relax the test constraints.

I also think we don't need to run these tests on every entry of the test matrix. Similar to how we only run code coverage on the ubuntu / java LTS entry, we should be able to only run these expensive perf tests on a single entry as well.

@asafm
Copy link
Contributor

asafm commented Feb 11, 2024

I'm starting to work on this now

asafm added a commit to streamnative/opentelemetry-java that referenced this issue Feb 11, 2024
- Adds more leniency towards tests has less options to "mess" them and lower the allocation rate and are more prone to have variable allocation rate.
- Constraints the InstrumentGarbageCollectionBenchmarkTest to run only in one specific environment out of the entire matrix
@asafm
Copy link
Contributor

asafm commented Feb 11, 2024

I couldn't recreate the issue locally after inspecting it and trying to run it with RepeatedTest.
I'm okay with increasing the offset for those 4 types of tests since the ability to refactor them by mistake is very constrained; hence, any change there will likely cause the allocation rate to go even lower than 93.7%.

I created #6221 to fix both issues mentioned: lower rate and make the test run only in one matrix option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants