Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MetricProducer to allow sdk.MeterProviders to incorporate metric data from third-party sources #2722

Closed
wants to merge 13 commits into from

Conversation

dashpole
Copy link
Contributor

@dashpole dashpole commented Aug 10, 2022

Part of #1175

Related to #2730

Required for #2732

Changes

Add a MetricProducer interface to the metrics SDK. It is an optional argument when creating a MetricReader meant to support non-SDK sources of metric data.

Metric data from a MetricProducer is not subject to the MeterProviders or MetricReaders configuration for instruments. This means Views on MeterProviders, and default Aggregations and Aggregation Temporalities configured on MetricReaders are not applied to data from MetricProducers.

This is needed to support an OpenCensus metric bridge, which is proposed separately. It could also be used to support other bridges as well, such as a Prometheus bridge.

Discussion from the spec sig on 8/30:

@dashpole presented high level design options:

  1. Bridge incorporated into MeterProvider
  2. Bridge separate from MeterProvider, and works with a MetricReader
  3. Bridge separate from MeterProvider and MetricReader, works with MetricExporter

3 was eliminated because it won't work easily with pull-based exporters. The overall sentiment was to stick with option 1, because then resource and exporter configuration for the MeterProvider would be shared. See #2722 (comment) for more details.

@dashpole dashpole changed the title OpenCensus Metrics bridge using MetricProducer Add MetricProducer to allow MetricReaders to collect from third-party metric sources Aug 16, 2022
@dashpole dashpole force-pushed the metric_producer_bridge branch 2 times, most recently from 42b8ad0 to 170351b Compare August 16, 2022 19:23
@dashpole dashpole mentioned this pull request Aug 16, 2022
7 tasks
@dashpole dashpole marked this pull request as ready for review August 16, 2022 19:30
@dashpole dashpole requested review from a team August 16, 2022 19:30
Copy link
Member

@pirgeo pirgeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that OpenCensus is currently the only use case (but please correct me if I am wrong). I am not familiar with OpenCensus, but I know from Micrometer that there are definitely Metrics APIs that allow plugging a different SDK. I wonder if it is possible to wrap the OpenTelemetry instruments in the OpenCensus instruments so that their usage doesn't change, but they would use the OTel SDK nonetheless.

specification/metrics/sdk.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Show resolved Hide resolved
@dashpole
Copy link
Contributor Author

I wonder if it is possible to wrap the OpenTelemetry instruments in the OpenCensus instruments so that their usage doesn't change, but they would use the OTel SDK nonetheless.

Unfortunately, I don't think that is feasible for OpenCensus. The instrumentation API for OpenCensus is quite different from OTel's. You just call Record(number), and then decide what instrument to use in the OpenCensus View. The other challenge is that you can't substitute the SDK in OpenCensus with a different implementation (e.g. OTel).

Although I haven't prototyped it out, I believe this model (bridge implements MetricProducer) would also work well for a prometheus bridge. In Go, I can Gather metrics from a prometheus registry (~SDK), and could convert to an OpenTelemetry batch of metrics relatively easily.

@jmacd
Copy link
Contributor

jmacd commented Aug 17, 2022

As I read this PR, a MetricProducer could be bound to just one MetricExporter. Does the MetricProducer user lose access to multi-exporter support?

Thinking about how I'd like to see a bridge to OpenCensus (or Prometheus), I'd like to see a single SDK (with multiple Meters) co-exist with MetricProducers so that a single SDK configuration could include bridged-metrics AND there would be just one export call across the whole SDK+bridges, one gRPC connection, etc.

To me, this suggests integrating a different sort of Producer than the one specified here, which essentially looks like a replacement of the SDK. Instead, can we replace a Meter with a producer for a single Scope? In addition to bridging metrics from OpenCensus and Prometheus, this sort of back-door allows users to work around all the current limitations in OpenTelemetry API. For example, asynchronous histogram does not exist? Just create a MetricProducer and emit some histogram points (e.g., I might use a single-scope MetricProducer to output https://pkg.go.dev/runtime/metrics (issue).

@dashpole
Copy link
Contributor Author

As I read this PR, a MetricProducer could be bound to just one MetricExporter. Does the MetricProducer user lose access to multi-exporter support?

That wasn't my intention. A MetricReader can have a single MetricProducer, and a single MetricExporter. But, like a MeterProvider can have multiple MetricReaders, a MetricProducer can be registered with many MetricReaders, each with an exporter. Let me know if I can make that clearer.

I'll chew on the other feedback and see what I can come up with.

specification/metrics/sdk.md Outdated Show resolved Hide resolved
specification/metrics/sdk.md Outdated Show resolved Hide resolved
@dashpole dashpole requested a review from a team August 18, 2022 20:06
@dashpole
Copy link
Contributor Author

@jmacd re: #2722 (comment)

I think it should be possible to achieve your first ask. Instead of registering a MetricProducer with a MetricReader, we could pass a MetricProducer to a MeterProvider. The MeterProvider would add the bridge's metrics to metrics from OTel instruments' metrics, and provide a single batch of metrics to exporters. This would have the added benefit of simplifying the user experience, since they don't need to create multiple readers (one for the SDK, one for the bridge).

For the second section, I think the fundamental ask is to reduce the scope of the interface from a full batch of metrics (i.e. one ResourceMetrics in Go) to metrics for a single scope (i.e. []Metrics in Go). Doing that would also simplify the user experience, since the Resource used by the MeterProvider will be automatically used for bridged metrics, rather than needing to be supplied separately.

I've made those updates in the latest commit.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few things:

  • This PR title should be updated that this works on the sdk.MeterProvider not on the api, because we don't want to expose our data model implementation to the API layer.
  • For me it does not make too much sense to have MeterProvider support additional MetricProducers. I think I see MeterProvider as any other MetricProducer, so the MetricReader that we have should be able to read from any MetricProducer including our MeterProvider. Then the MetricProducer becomes the interface that any source (meter provider is one) can implement to connect with the export pipeline lead by the MetricReader.

If this was discussed, please document in the PR description for new reviewers to understand.

@dashpole dashpole changed the title Add MetricProducer to allow MeterProviders to incorporate metric data from third-party sources Add MetricProducer to allow sdk.MeterProviders to incorporate metric data from third-party sources Sep 19, 2022
@dashpole
Copy link
Contributor Author

Thanks @bogdandrutu. I updated the title. I also pasted the notes from the 8/30 discussion in the PR description where we discussed the high-level design.

@bogdandrutu
Copy link
Member

My vote is to support Option 2:

  1. Hard to explain that views are not there. We clearly separate that views are only for sdk.MeterProvider.
  2. exporter configuration for the MeterProvider would be shared.
    • This is true for the option 2 as well, since the MetricReader may be shared, isn't it?
  3. MeterProvider will implement the same interface as any other producer.
  4. Indeed user must configure Resource maybe twice, but that is easier in my opinion compare with explaining that the resource will be overwritten. (see "Resource MUST be provided by the MeterProvider, and any resource information provided by the bridge MUST be overridden.")

@dashpole
Copy link
Contributor Author

My vote is to support Option 2:

  1. Hard to explain that views are not there. We clearly separate that views are only for sdk.MeterProvider.

Agreed. That was one of the downsides discussed. But given that we already have to explain that aggregation/ aggregation temporality configuration doesn't apply, it seems reasonable.

  1. exporter configuration for the MeterProvider would be shared.
    • This is true for the option 2 as well, since the MetricReader may be shared, isn't it?

You are correct that you can share the metric readers still with option 2. It just means you have to pass the metric readers (and resource) as options to both the sdk and to bridges separately.

  1. MeterProvider will implement the same interface as any other producer.

This is still possible with the current design, but I considered it an implementation detail.

  1. Indeed user must configure Resource maybe twice, but that is easier in my opinion compare with explaining that the resource will be overwritten. (see "Resource MUST be provided by the MeterProvider, and any resource information provided by the bridge MUST be overridden.")

This may only apply to java (see #2722 (comment)), but I haven't checked other languages. The benefit of this is that you always end up grouping the metrics together under a single resource, which was important, if I'm remembering the discussion correctly.

Comment on lines +852 to +853
`InstrumentationScope()`, and with the `MeterProvider`'s configured resource if
it is possible for those to conflict.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "if it is possible for those to conflict." mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant it to mean: "If a batch of metric points from a MetricProducer can conflict with the MeterProvider's resource". This would be possible if a "batch of metric points" includes resource information. That wouldn't be the case in Go, but would be in Java given their current definition of "batch of metric points".

I could remove "if it is possible for those to conflict", or I could change it to "...configured resource. If a resource or scope is already present in a batch of metric points from a MetricProducer, that MUST be overridden with the MetricProducer's Instrumentation Scope and the MeterProvider's resource".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I see this was also discussed in #2722 (comment). Now that I understand the Java SDK has Resource in its data object, I understand this point, which helps explain @bogdandrutu's remarks in #2722 (comment).

@jmacd
Copy link
Contributor

jmacd commented Sep 22, 2022

It appears there is lingering debate over how to require that Resources are-or-are-not-definitely-the-same between the bridged metrics and those from other Meters (and the same concern for Scopes, practically speaking). Here's what is written in the Resource SDK spec

When used with distributed tracing, a resource can be associated 
with the [TracerProvider](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#tracerprovider) 
when the TracerProvider is created. That association cannot be changed later. 
When associated with a TracerProvider, all Spans produced by any Tracer 
from the provider MUST be associated with this Resource.

Analogous to distributed tracing, when used with metrics,
a resource can be associated with a `MeterProvider`.
When associated with a [`MeterProvider`](../metrics/api.md#meterprovider),
all metrics produced by any `Meter` from the provider will be
associated with this `Resource`.

It's that "analogous" MUST we are defending here. It appears the stable Java SDK has a loophole, and now we have wrinkles in the specification because of it. I recommend @dashpole remove all the wrinkle about Resources "if it is possible for those to conflict", leave the specification stating that each producer has an instrumentation scope, and let the Java SDK document its loophole (i.e., it should document that users are required to use the same resource twice or else be out-of-spec).

I expect that to make the Producer interface the same as the internal representation of a Meter has; I expect that means the Producer will be invoked to produce a batch of Metric objects just as one Meter would do; therefore I expect the MetricProducer to produce the same data objects that any other Meter would, because the Scope and Resource are already determined at that level in the data hierarchy.

@bogdandrutu can you clarify if whether you have any reservations not covered above?

@jmacd
Copy link
Contributor

jmacd commented Sep 22, 2022

MetricProducer to produce the same data objects that any other Meter would, because the Scope and Resource are already determined at that level in the data hierarchy.

p.s., I'm not opposed to allowing MetricProducers to return more than one scope. A MetricProducer returning multiple Scopes equals multiple single-Scope MetricProducers, IMO, what's important to me is that a MetricProducer doesn't require the MetricReader to sort data points by Scope to successfully export the data.

@jmacd
Copy link
Contributor

jmacd commented Sep 26, 2022

Would @ahayworth @pirgeo @MadVikingGod and @bogdandrutu in particular please read through the above dialog and see if you agree to approve the PR in its current form? The four of you have commented without approving.

Recall that this is absolutely a barrier to finishing the OTel-Go OpenCensus bridge, which is at least partlyt responsible for stalling development on Collector observability improvements. Thanks.

@ahayworth
Copy link
Contributor

Would @ahayworth [...] please read through the above dialog and see if you agree to approve the PR in its current form? The four of you have commented without approving.

I have no strong feelings on the PR; I happened to notice a typo previously (which was the only thing in my comment). ❤️

Comment on lines +89 to +93
If a meter is created which produces an
[`InstrumentationScope`](../glossary.md#instrumentation-scope), which matches
the InstrumentationScope of a [MetricProducer](#metricproducer), or if multiple
[MetricProducers](#metricproducer) have the same InstrumentationScope the SDK
SHOULD emit a warning.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like 2 separate warnings/errors.

  1. If an instrument is created, after applying views, such that it's scope conflicts with a MetricProducer. If we create a meter that conflicts but never create an instrument that does, because of scope rename or not creating any instruments, is there an warning?
  2. When registering MetricProducers if they were to collide then there should be an error/warning. This should be a new norm forming sentence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is two separate statements, but it doesn't say anything about per-instrument warnings. I believe we should emit warnings for the conflicting scopes, but allow the instruments to be correctly registered into each. IOW the presence of MetricProducers could produce scope warnings, not instrument warnings. The only way to get instrument warnings should be to have conflicts within a scope (IMO).

@MadVikingGod
Copy link
Contributor

I'm ok with the current PR, as I think it offers us a way to short-term migrate from OC to otel without the re-instrument everything step. I think this will be implementable as written.

I still have a preference for approach 2, for two reasons.

  1. I think it will be simpler to implement in go by creating a wrapped reader.
  2. I think it makes it easier to explain what is happening in the following contrived example. If you have 2x readers, and 1x MetricProducer that produces delta metrics, neither reader would get the entire stream of data from the producer. As it is written now the MetricProducer will be shared by the MeterProvider among all Readers. In approach 2 the user would have to explicitly reuse the producer between 2 or more readers.

Because of 2 I would also recommend that we have some warning on any producers we create that can make delta metrics. They shouldn't be used with multiple readers.

@jmacd
Copy link
Contributor

jmacd commented Sep 26, 2022

I would also recommend that we have some warning on any producers we create that can make delta metrics.

I would be happy to add that MetricProducers MUST NOT use delta temporality, "for they are expected to be stateless" with respect to the MetricReader.

@jmacd
Copy link
Contributor

jmacd commented Sep 26, 2022

@dashpole I had been confused (and I suspect others have bene) by the option-numbering scheme mentioned in the PR description, especially considering the numbered-list appearing in (and later quoted) #2722 (comment).

@MadVikingGod's points are well taken in arguing for the bridge to be implemented as a wrapped MetricReader. Implementing the bridge inside the Reader is both simpler and allows the correct use of Delta Temporality. The downside of that approach is that warnings will not be realized until the first time they are read. This is completely fine, in my opinion.

@dashpole I recommend closing this PR, revising the contents of this PR, and opening a new one where we do not list numbered alternatives.

@dashpole dashpole closed this Sep 26, 2022
@dashpole
Copy link
Contributor Author

I would be happy to add that MetricProducers MUST NOT use delta temporality, "for they are expected to be stateless" with respect to the MetricReader.

It seems like metric reader could still be stateless even with a stateless bridge/reader/exporter if both the bridge and exporter prefer delta. If there was a client that natively produced delta temporality metrics, I don't think we'd want to disallow bridging that to compatible exporters, right?

@dashpole
Copy link
Contributor Author

I opened #2838, which removes "if it is possible for those to conflict" language, splits the conflicting-scope warning sentence into two, and removes the numbered options from the description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants