Improve Otlp Delta Aggregation with support for max and Histogram. #3749

lenin-jaganathan · 2023-04-10T07:16:32Z

This PR is a follow-up to #3625, which introduced configurable delta aggregation temporality. It changes some of the behavior of the OTLP delta registry so that its meters adhere to the standards described in the OpenTelemetry metrics data model (https://opentelemetry.io/docs/reference/specification/metrics/data-model/#metric-points).

Changes introduced in this PR

Core:

AbstractTimer can accept a custom histogram implementation via its constructor. (Other registry implementations that have a custom Histogram implementation could use this to avoid duplicate code, but I have not included that because I want this PR to focus only on OTLP.)
Move basic DistributionStatisticConfig validations to the Builder. (Only basic checks, such as non-negative values and percentile ranges, are validated.)
Extract FixedBoundaryHistogram into a package-private class under distribution. (It was previously an inner class of TimeWindowFixedBoundaryHistogram.)
Introduce StepMax, which reports the max for the step.
Introduce StepBucketHistogram, which records the bucket values for the step and resets on the next step boundary.
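To illustrate the step semantics described for StepMax, here is a minimal sketch in plain Java. It is not the code from this PR: the class name `SimpleStepMax`, the injected millisecond clock, and the choice to report 0.0 when a whole step passes without recordings are all assumptions made for the example.

```java
import java.util.function.LongSupplier;

/** Hypothetical, simplified step max; not Micrometer's StepMax. */
final class SimpleStepMax {

    private final long stepMillis;

    // Assumption: time is injected as a millisecond supplier so it can be controlled in tests.
    private final LongSupplier clockMillis;

    private long lastStep;   // index of the step currently being recorded into
    private double current;  // max seen so far in the in-progress step
    private double previous; // max of the most recently completed step

    SimpleStepMax(long stepMillis, LongSupplier clockMillis) {
        this.stepMillis = stepMillis;
        this.clockMillis = clockMillis;
        this.lastStep = clockMillis.getAsLong() / stepMillis;
    }

    synchronized void record(double value) {
        rollIfNeeded();
        current = Math.max(current, value);
    }

    /** Returns the max of the last completed step, never the in-progress one. */
    synchronized double poll() {
        rollIfNeeded();
        return previous;
    }

    private void rollIfNeeded() {
        long step = clockMillis.getAsLong() / stepMillis;
        if (step > lastStep) {
            // If more than one step elapsed, the last completed step had no recordings.
            previous = (step - lastStep == 1) ? current : 0.0;
            current = 0.0;
            lastStep = step;
        }
    }
}
```

Values recorded during a step become visible through `poll()` only after the step boundary has passed, which is also why a test that polls within the first step sees 0.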

OTLP:

OtlpStepTimer and OtlpStepDistributionSummary now extend AbstractTimer so that they can have a custom max implementation.
Apply MeterRegistryCompatibilityKit to both DELTA and CUMULATIVE variants of OtlpMeterRegistry.
Abstract some of the common test cases of OtlpMeterRegistry into a base test class.

Known Issues

~~MeterRegistryCompatibilityKit tests have some known failures that I want to solve in this PR discussion,~~

~~There was a deprecated test in MeterRegistryCompatibilityKit that validates histogram counts. Since OTLP uses Step Histogram it will return 0 for the uncompleted step which fails.~~

TODO

Add unit tests for StepMax, StepBucketHistogram, and any other classes that lack unit testing.
Probably add a "Recordable" (or similar) interface that can be implemented by any class that supports record() and poll() operations. The objective is to have a high-level interface that defines an object supporting both recording and polling of data. It could be used to construct NOOP instances so that redundant null checks can be avoided, and to support custom recordings in the future (min is a valid statistic in OTLP).
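One possible shape for the proposed interface is sketched below. Nothing here exists in Micrometer: `Recordable`, its `noop` factory, and the `RunningMin` example are hypothetical names chosen only to show how a NOOP instance could replace null checks and how a min statistic could plug in.

```java
/** Hypothetical interface: anything that can record values and be polled. */
interface Recordable<T> {

    void record(double value);

    T poll();

    /** A NOOP instance that ignores recordings and always returns the given value. */
    static <T> Recordable<T> noop(T emptyValue) {
        return new Recordable<T>() {
            @Override
            public void record(double value) {
                // intentionally ignored
            }

            @Override
            public T poll() {
                return emptyValue;
            }
        };
    }
}

/** Hypothetical custom recording: the smallest value seen so far. */
final class RunningMin implements Recordable<Double> {

    private double min = Double.NaN; // NaN means "nothing recorded yet"

    @Override
    public synchronized void record(double value) {
        min = Double.isNaN(min) ? value : Math.min(min, value);
    }

    @Override
    public synchronized Double poll() {
        return min;
    }
}
```

A meter could then hold a `Recordable` field unconditionally and call `record()` on it without checking whether the statistic is enabled.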

Notes:

I would have loved to split the refactoring and the features into multiple commits for a better reviewing experience, but somehow I messed up the git history.

Closes gh-3772
Closes gh-3771

...ns/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpMeterRegistry.java

...mentations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpTimer.java

sonatype-lift · 2023-04-10T07:32:25Z

🛠 Lift Auto-fix

Some of the Lift findings in this PR can be automatically fixed. You can download and apply these changes in your local project directory of your branch to review the suggestions before committing.¹

# Download the patch
curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3749.diff -o lift-autofixes.diff

# Apply the patch with git
git apply lift-autofixes.diff

# Review the changes
git diff

Want it all in a single command? Open a terminal in your project's directory and copy and paste the following command:

curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3749.diff | git apply

Once you're satisfied, commit and push your changes in your project.

¹ You can preview the patch by opening the patch URL in the browser.

lenin-jaganathan · 2023-04-13T16:47:19Z

@shakuzen / @jonatan-ivanov Can you have a look at this? I would really appreciate any feedback. It would be good to get these changes in time for 1.11.0.

shakuzen · 2023-04-18T07:53:55Z

> CUMULATIVE aggregation does not support max, but the earlier version still paid the cost of recording max. This PR doesn't record max, so it will return 0.0 if queried, which makes the test fail.

This feels like a wider scope than necessary for this pull request. See some previous discussion in #3144. I think we should not change that behavior here. It's generally part of the contract of a timer/summary that max in some form be supported. That we don't export it currently is a somewhat tangential issue. I would probably opt for publishing our TimeWindowMax as a separate gauge in the case of cumulative temporality. A cumulative max as specified by OTLP is not generally useful, as far as I can tell. But let's tackle any such change separately so we don't block other things and get distracted.

shakuzen · 2023-04-18T08:09:07Z

> There was a deprecated test in MeterRegistryCompatibilityKit that validates histogram counts. Since OTLP uses Step Histogram it will return 0 for the uncompleted step which fails.

I'm not immediately sure what we should do about the tests, but the @Deprecated annotation was, I believe, a bad alternative to suppressing deprecation warnings for using deprecated APIs in the tests, rather than an indication the tests themselves were deprecated. We should probably rewrite the tests avoiding deprecated API if we can, but there's going to be a timing issue to get the test to pass with OTLP delta and our other Step registries.

shakuzen

Thanks for the pull request, as always. I'm leaving some initial thoughts. I'll do a more thorough review tomorrow.

micrometer-core/src/main/java/io/micrometer/core/instrument/step/StepMax.java

...rometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpDistributionSummary.java

shakuzen · 2023-04-19T06:38:48Z

StepMaxTest, OtlpCumulativeMeterRegistryCompatibilityTest, and OtlpDeltaMeterRegistryCompatibilityTest have test failures now. I guess we still need to update the TCK code for the last one, but the first two should be passing, right?

lenin-jaganathan · 2023-04-19T07:42:38Z

I fixed StepMaxTest. OtlpCumulativeMeterRegistryCompatibilityTest fails because max is unavailable on the meter when it is cumulative, which I am going to fix. But before that, I wanted to see which approach we take: separate CumulativeTimer and DeltaTimer classes, or a single abstract timer that behaves according to the aggregation temporality.

shakuzen · 2023-04-19T08:34:27Z

I updated the TCK code with a bit of a hack so it works with both time window and step histograms.

shakuzen · 2023-04-19T02:30:24Z

micrometer-core/src/main/java/io/micrometer/core/instrument/distribution/StepHistogram.java

+
+    @Override
+    protected CountAtBucket[] noValue() {
+        if (buckets == null)


buckets might be empty but it is never null. I wonder if it would be worth storing the zeroed CountAtBucket array in an instance field rather than creating it each time noValue() is called. It depends on how often noValue() will be called in practice.

In an ideal world, I don't expect noValue() to be called during the app's lifecycle except at start-up. That's why I decided against adding another long-lived (and mostly idle) object there. This could quickly become a concern when there are thousands of timers with ~50 buckets each.

Another option I considered is for noValue() to return an empty histogram, which is already a static variable, but that might not be good since the bucket information gets dropped in that case.

> buckets might be empty but it is never null

That's true, except that StepValue calls noValue() during object creation, at which point buckets is not yet initialized.

Thanks for pointing that out. This isn't ideal. I left // TODO comments in the tests since we can't check the buckets that we expect to be there until a step has passed. We can try to figure this out post-merge if it is worth fixing and we can come up with a solution.
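The initialization-order issue discussed in this thread can be reproduced with a small stand-alone sketch. The classes below are hypothetical stand-ins, not the PR's StepValue or StepHistogram; they assume only what the thread states, namely that the base class calls noValue() from its constructor before the subclass has assigned buckets.

```java
/** Hypothetical stand-in for a step value whose constructor calls noValue(). */
abstract class StepValueSketch<V> {

    private final V previous;

    StepValueSketch() {
        // Calls an overridable method before subclass fields are assigned.
        this.previous = noValue();
    }

    abstract V noValue();

    V poll() {
        return previous;
    }
}

/** Hypothetical stand-in for a step histogram holding per-bucket counts. */
final class BucketCountsSketch extends StepValueSketch<long[]> {

    private final double[] buckets;

    BucketCountsSketch(double[] buckets) {
        super(); // noValue() runs here, while this.buckets is still null
        this.buckets = buckets;
    }

    @Override
    long[] noValue() {
        if (buckets == null) {
            // Only reachable during construction: no bucket information yet.
            return new long[0];
        }
        // One zero count per configured bucket boundary.
        return new long[buckets.length];
    }
}
```

In this sketch the first `poll()` returns an array with no buckets at all, which mirrors why the tests cannot check the expected buckets until a step has passed.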


shakuzen

I added some tests for the StepHistogram. There are some things I will probably polish post-merge, but I think this is functionally in a good state. Thank you for all of the work on this.

shakuzen · 2023-04-28T14:35:02Z

Regarding this comment and the unresolved part of the comment thread, I've left it as something to consider outside of this pull request so we can get this merged and at least make progress to a better state overall.

lenin-jaganathan · 2023-04-28T18:11:11Z

@shakuzen Also, it is important to make OtlpStepTimer rotate count, total, max, and histogram when any of them is read. This should be fairly cheap, since the work happens only on rotation, and repeated calls within the same step have almost no effect.
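A rough sketch of the idea in the comment above, assuming a single shared rollover check: every read goes through the same check, so count, total, and max always refer to the same completed step. The class `StepTimerStateSketch` and its injected clock are hypothetical and not the OtlpStepTimer implementation. Histogram bucket counts are omitted for brevity and would be rotated in the same place.

```java
import java.util.function.LongSupplier;

/** Hypothetical holder that rotates all step statistics together. */
final class StepTimerStateSketch {

    private final long stepMillis;
    private final LongSupplier clockMillis; // assumption: injected clock for testability
    private long lastStep;

    private long currentCount, previousCount;
    private double currentTotal, previousTotal;
    private double currentMax, previousMax;

    StepTimerStateSketch(long stepMillis, LongSupplier clockMillis) {
        this.stepMillis = stepMillis;
        this.clockMillis = clockMillis;
        this.lastStep = clockMillis.getAsLong() / stepMillis;
    }

    synchronized void record(double amount) {
        rollIfNeeded();
        currentCount++;
        currentTotal += amount;
        currentMax = Math.max(currentMax, amount);
    }

    synchronized long count() { rollIfNeeded(); return previousCount; }

    synchronized double total() { rollIfNeeded(); return previousTotal; }

    synchronized double max() { rollIfNeeded(); return previousMax; }

    private void rollIfNeeded() {
        long step = clockMillis.getAsLong() / stepMillis;
        if (step == lastStep) {
            return; // cheap path: repeated reads within a step do no extra work
        }
        // Publish the completed step only if it is the one immediately before now.
        boolean adjacent = (step - lastStep == 1);
        previousCount = adjacent ? currentCount : 0L;
        previousTotal = adjacent ? currentTotal : 0.0;
        previousMax = adjacent ? currentMax : 0.0;
        currentCount = 0L;
        currentTotal = 0.0;
        currentMax = 0.0;
        lastStep = step;
    }
}
```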


lenin-jaganathan mentioned this pull request Apr 10, 2023

Add capability to have configurable aggregation temporality for OTLP Registry #3625

Merged

3 tasks

sonatype-lift bot reviewed Apr 10, 2023

...ns/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpMeterRegistry.java (outdated, resolved)

sonatype-lift bot reviewed Apr 10, 2023

...mentations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpTimer.java (outdated, resolved)

lenin-jaganathan force-pushed the improve_otlp_delta branch 2 times, most recently from ba901e5 to 3ede90f Compare April 11, 2023 04:48

Improve Otlp Delta Aggregation with support for max and Histogram.

ecc66dd

lenin-jaganathan force-pushed the improve_otlp_delta branch from 3ede90f to ecc66dd Compare April 12, 2023 06:22

Improve Otlp Delta Aggregation with support for max and Histogram.

27fa754

This was linked to issues Apr 17, 2023

OTLP delta histogram bucket counts are not aligned to the time window #3772

Closed

Max does not follow the specification for OTLP delta histogram #3771

Closed

lenin-jaganathan mentioned this pull request Apr 17, 2023

DeltaHistogram in SignalFx registry doesn't align with count and total #3774

Closed

shakuzen reviewed Apr 18, 2023


lenin-jaganathan force-pushed the improve_otlp_delta branch from 51c9da5 to 27fa754 Compare April 18, 2023 11:14

Move StepMax to Otlp package.

eeb91af

lenin-jaganathan force-pushed the improve_otlp_delta branch from 5aa1daa to eeb91af Compare April 19, 2023 07:41

Support step and time window histograms in TCK

d9d8666

shakuzen reviewed Apr 19, 2023


jonatan-ivanov requested changes Apr 19, 2023


lenin-jaganathan added 2 commits April 19, 2023 21:48

Bring back Step and Cumulative flavours of Timers and Distribution Summary

b446f05

StepBucketHistogram should take supportsAggregablePercentiles

6149f6d

sonatype-lift bot reviewed Apr 26, 2023

...ns/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpMeterRegistry.java (resolved)

sonatype-lift bot reviewed Apr 26, 2023

...ations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpStepTimer.java (resolved)

sonatype-lift bot reviewed Apr 26, 2023

...ations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpStepTimer.java (resolved)

shakuzen added 3 commits April 28, 2023 23:21

Add tests for StepHistogram

c94c5ba

Polish

6703fda

Fix formatting

56e67f7

shakuzen approved these changes Apr 28, 2023


shakuzen merged commit aaa6fc2 into micrometer-metrics:main Apr 28, 2023

lenin-jaganathan deleted the improve_otlp_delta branch April 29, 2023 06:35

lenin-jaganathan mentioned this pull request Apr 29, 2023

Improve StepBucketHistogram #3793

Merged

izeye added a commit to izeye/micrometer that referenced this pull request Jun 5, 2023

Polish "Improve Otlp Delta Aggregation with support for max and Histogram" See micrometer-metricsgh-3749

f8f9e7a

izeye mentioned this pull request Jun 5, 2023

Polish "Improve Otlp Delta Aggregation with support for max and Histogram" #3876

Merged

shakuzen pushed a commit that referenced this pull request Jun 5, 2023

Polish "Improve Otlp Delta Aggregation with support for max and Histogram" (#3876) See gh-3749

3e91e2f
