Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API] Add synchronous gauge #3029

Merged
merged 10 commits into from
Oct 30, 2024
Merged

Conversation

uuzay
Copy link
Contributor

@uuzay uuzay commented Aug 14, 2024

Fixes #2279

Changes

Added synchronous gauge implementation according to the specification.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

@uuzay uuzay requested a review from a team August 14, 2024 16:42
Copy link

codecov bot commented Aug 14, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 87.88%. Comparing base (497eaf4) to head (56bdf77).
Report is 151 commits behind head on main.

Files with missing lines Patch % Lines
sdk/src/metrics/state/metric_collector.cc 60.00% 2 Missing ⚠️
...etry/sdk/metrics/aggregation/default_aggregation.h 0.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3029      +/-   ##
==========================================
+ Coverage   87.12%   87.88%   +0.76%     
==========================================
  Files         200      195       -5     
  Lines        6109     6137      +28     
==========================================
+ Hits         5322     5393      +71     
+ Misses        787      744      -43     
Files with missing lines Coverage Δ
api/include/opentelemetry/metrics/meter.h 100.00% <ø> (ø)
api/include/opentelemetry/metrics/noop.h 41.18% <ø> (ø)
...i/include/opentelemetry/metrics/sync_instruments.h 100.00% <ø> (ø)
sdk/include/opentelemetry/sdk/metrics/meter.h 57.15% <ø> (ø)
...clude/opentelemetry/sdk/metrics/sync_instruments.h 100.00% <ø> (ø)
sdk/src/metrics/meter.cc 62.36% <ø> (+0.91%) ⬆️
sdk/src/metrics/sync_instruments.cc 65.84% <ø> (ø)
...etry/sdk/metrics/aggregation/default_aggregation.h 70.13% <0.00%> (ø)
sdk/src/metrics/state/metric_collector.cc 87.10% <60.00%> (-5.76%) ⬇️

... and 94 files with indirect coverage changes

Copy link
Member

@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feature.

This is not a full review yet,
only some preliminary comments.

Please fix:

  • declarations of new virtual methods using ABI version 2
  • build break issues found in CI

More extensive review to follow.

api/include/opentelemetry/metrics/meter.h Show resolved Hide resolved
@uuzay uuzay requested a review from marcalff August 20, 2024 13:16
Copy link
Member

@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feature.

The code structure looks good to me overall.

I defer to @lalitb to check the logic around aggregation temporality in particular,
as I am not too familiar with this area.

sdk/include/opentelemetry/sdk/metrics/instruments.h Outdated Show resolved Hide resolved
@lalitb
Copy link
Member

lalitb commented Aug 26, 2024

@uuzay Thanks for the PR. Had a quick glance to the code, this looks similar to #2341 (correct me if not). And I believe this won't work for delta temporality. To validate, can we update the PR to test for it?


INSTANTIATE_TEST_SUITE_P(WritableMetricStorageTestDouble,
WritableMetricStorageTestFixture,
::testing::Values(AggregationTemporality::kCumulative));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the test here to include the delta temporality.

@punya
Copy link
Member

punya commented Aug 30, 2024

@uuzay Thanks for the PR. Had a quick glance to the code, this looks similar to #2341 (correct me if not). And I believe this won't work for delta temporality. To validate, can we update the PR to test for it?

If I understand the relevant section of the spec correctly, Gauges (unlike UpDownCounters and Histograms) don't support a choice of temporality.

@uuzay
Copy link
Contributor Author

uuzay commented Sep 5, 2024

I was thinking the same @punya.

It also mentions:

Gauges do not provide an aggregation semantic, instead “last sample value” is used when performing operations like temporal alignment or adjusting resolution. (source)

if that's related.

@lalitb can you please clarify if delta temporality is actually needed?

@lalitb
Copy link
Member

lalitb commented Sep 5, 2024

@lalitb can you please clarify if delta temporality is actually needed?

Sorry i was not clear enough. I meant testing for temporality with last value aggregation. The current test is only doing for cumulative.

@lalitb
Copy link
Member

lalitb commented Oct 3, 2024

@uuzay - Do you think we can modify the PR to restrict configuring the delta temporality for Last Value as of now - Just thrown an error at the startup if this is enabled through View or otherwise. With that, we should be good to go through this PR.

Thanks for working on this, and sorry about the long wait on this.

@uuzay
Copy link
Contributor Author

uuzay commented Oct 8, 2024

Hi @lalitb,
Sorry for the delays, I was both on a holiday period and also had other important tasks to take care of.

I'm planning to make the necessary changes regarding the ABI macros this week.
For delta temporality, I think what you suggested makes sense for now.

@uuzay uuzay requested a review from a team as a code owner October 15, 2024 13:58
Copy link

netlify bot commented Oct 15, 2024

Deploy Preview for opentelemetry-cpp-api-docs canceled.

Name Link
🔨 Latest commit 56bdf77
🔍 Latest deploy log https://app.netlify.com/sites/opentelemetry-cpp-api-docs/deploys/67229beb935b030008bac3ad

@uuzay
Copy link
Contributor Author

uuzay commented Oct 15, 2024

Hi @lalitb,
I'm trying to figure out where would be an appropriate place to throw the error and would appreciate some pointers.

I was thinking if SyncMetricStorage or TemporalMetricStorage constructor would be a good place.
Another idea that comes to mind is modifying DeltaTemporalitySelector in otlp_metric_utils to send a warning log and select cumulative temporality instead but I'm not sure if this covers all of the cases.

Thanks in advance.

@lalitb
Copy link
Member

lalitb commented Oct 25, 2024

Hi @lalitb, I'm trying to figure out where would be an appropriate place to throw the error and would appreciate some pointers.

I was thinking if SyncMetricStorage or TemporalMetricStorage constructor would be a good place. Another idea that comes to mind is modifying DeltaTemporalitySelector in otlp_metric_utils to send a warning log and select cumulative temporality instead but I'm not sure if this covers all of the cases.

Thanks in advance.

@uuzay Yes that should work. Another easy place could be MetricCollector::GetAggregationTemporality. This part of code is called for both periodic reader and Prometheus, so should cover all exporters, and also in future when we support Delta for Prometheus.

@marcalff
Copy link
Member

@uuzay Please merge a recent main into this PR, to get the merge conflict resolved.
This should force the CI to build also, somehow it got stuck.
Thanks.

Copy link
Member

@lalitb lalitb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for working on this.

Copy link
Member

@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the feature.

@marcalff marcalff added the ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve) label Oct 30, 2024
/**
* Record a value
*
* @param value The measurement value. May be positive, negative or zero.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not blocking the merge. Seems T should be integer or float types? If so, could it be asserted explicitly like below?

static_assert(std::is_same_v<T, int64_t> || std::is_same_v<T, double>, "Gauge<T> can only be instantiated with int64_t or double");

@@ -19,6 +19,7 @@ enum class InstrumentType
kCounter,
kHistogram,
kUpDownCounter,
kGauge,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is inserting a new InstrumentType of kGauge a SDK breaking change and also break the ABI? Should it be included only for OPENTELEMETRY_ABI_VERSION_NO >= 2?

Copy link
Member

@lalitb lalitb Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't be ABI breaking for API, as InstrumentType is not part of API surface. And we don't guarantee SDK ABI compatibility. It could be breaking change for custom exporters which use the switch on this enum without handling default case, but this can be added in the changelog.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been discussed before:

#3029 (comment)

The code as written is fine:

  • no change in the API, this kGauge is part of the SDK
  • the SDK ABI is changed, and we do not guarantee SDK ABI compatibility

I don't think we should document that in the changelog for the SDK, by definition every SDK change can (and most of the time does) break the SDK ABI.

@ThomsonTan Please remove the change requested then, as this blocks the merge.

Copy link
Contributor

@ThomsonTan ThomsonTan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to make sure if the adding of kGauge is ABI breaking change.

@ThomsonTan
Copy link
Contributor

Apologize for blocking the PR. I think this is the exactly the issue I raised before.

The user application builds fine with either new SDK headers/old SDK libs, or old SDK headers/new SDK libs, either one will trigger a runtime break which is very hard to investigate like memory corruption.

It is true that the SDK doesn't guarantee ABI compatibility, but we don't need break this if not necessary. In this specific case, either wrap the new kGauge in the ABI version 2 macro, or we can just append it to the end of the enum list, so we don't trigger an potential break.

Even it is documented that SDK doesn't support ABI combability, it may be ignored by common users as this actually works in most of the time. A better solution will be that we check the SDK version and SDK lib version, and raise error (like block linking) if both versions mismatch.

How to you think?

@marcalff
Copy link
Member

The user application builds fine with either new SDK headers/old SDK libs, or old SDK headers/new SDK libs, either one will trigger a runtime break which is very hard to investigate like memory corruption.

The only way for this to happen is to link dynamically with a shared library/DLL for the SDK itself, using a library from a different version compared to header files used during the application build.

Any application compiling the with the SDK header files and linking statically with the matching SDK library will be just fine.

Wrapping kGauge with ifdef causes another list of issues as well, because user code may break as well if not using the same ifdef.

I think that we can move kGauge at the end of the enum, to minimize binary compatibility issues and in practice avoid breaks for this case alone.

Move kGauge to the end, for better ABI compatibility.
@marcalff
Copy link
Member

I think that we can move kGauge at the end of the enum, to minimize binary compatibility issues and in practice avoid breaks for this case alone.

Implemented as code review comment.

@ThomsonTan PTAL.

@ThomsonTan
Copy link
Contributor

The user application builds fine with either new SDK headers/old SDK libs, or old SDK headers/new SDK libs, either one will trigger a runtime break which is very hard to investigate like memory corruption.

The only way for this to happen is to link dynamically with a shared library/DLL for the SDK itself, using a library from a different version compared to header files used during the application build.

Any application compiling the with the SDK header files and linking statically with the matching SDK library will be just fine.

Wrapping kGauge with ifdef causes another list of issues as well, because user code may break as well if not using the same ifdef.

I think that we can move kGauge at the end of the enum, to minimize binary compatibility issues and in practice avoid breaks for this case alone.

Looks good to me then, thanks.

@marcalff marcalff changed the title Add synchronous gauge [SDK] Add synchronous gauge Oct 30, 2024
@marcalff marcalff changed the title [SDK] Add synchronous gauge [API] Add synchronous gauge Oct 30, 2024
@marcalff marcalff merged commit f30ab92 into open-telemetry:main Oct 30, 2024
55 of 56 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Synchronous Gauge instrument to the metrics API/SDK
5 participants