Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate collector's metrics from OC to OpenTelemetry #816

Closed
Tracked by #7454
pavolloffay opened this issue Apr 9, 2020 · 24 comments · Fixed by #9102
Closed
Tracked by #7454

Migrate collector's metrics from OC to OpenTelemetry #816

pavolloffay opened this issue Apr 9, 2020 · 24 comments · Fixed by #9102
Assignees
Labels
help wanted Good issue for contributors to OpenTelemetry Service to pick up release:required-for-ga Must be resolved before GA release

Comments

@pavolloffay
Copy link
Member

Currently, the collector uses OpenCensus API for internal observability

.

These metrics should be probably migrated to OpenTelemetry API.

Jaeger storage components use abstract metrics API and so we need to export these metrics in Jaeger OTEL collector e.g.

https://github.com/jaegertracing/jaeger/blob/bf870d9e2bffaf4758045919446be75ac9fc9783/cmd/opentelemetry-collector/app/exporter/cassandra/exporter.go#L31.

Are there plans to migrate to OpenTelemetry metrics? In Jaeger we want to avoid first creating OC adapter and then OTEL once this repo migrates. I can submit a PR to port the metrics to OTEL API.

@yurishkuro yurishkuro changed the title Migrate collenctor's metrics fro OC to OpenTelemetry Migrate collector's metrics from OC to OpenTelemetry Apr 9, 2020
@yurishkuro
Copy link
Member

This would also provide the needed dogfooding for OTel SDK.

@bogdandrutu
Copy link
Member

@yurishkuro completely agree, we are waiting for 2 things:

  1. The ability to use labels from the correlation-context.
  2. The ability to configure Views - allow us to configure which labels from the correlation-context to use.

If we don't have these two things we need to manually propagate/retrieve these values every time we do a report. Which may be possible if we do have couple of helper functions. So probably the main missing feature is to support changing the default aggregations for some instruments like for distributions we don't want just minmaxsumcount we want a histogram.

  • @jmacd to help correct me if we can already do the histogram thing.

@yurishkuro
Copy link
Member

Why do you need context labels in the collector? Sounds a bit like nice to have, not a showstopper.

@bogdandrutu
Copy link
Member

I think I also tried to explain myself that we have the solution to use helper functions, but let me describe the design a bit:

  • In the pipeline we want to record at every step metric broken down by the "receiver" so we add receiver to the context to propagate it in the pipeline instead of having it as an explicit argument. So we can have helper functions that just retrieve that from the context and use it in the metrics as labels.

And as I explained is not a showstopper, the current blocker is the ability to configure the aggregation for the distribution to be histogram vs minmaxsumcount, and probably a nice to have is the ability to select the labels for the aggregator (again not a showstopper).

@bogdandrutu
Copy link
Member

@pavolloffay to answer your question, we do want to use opentelemetry metrics. probably tracing as well :)

@pavolloffay
Copy link
Member Author

thanks @bogdandrutu. I can open a PR to port the metrics.

Is there any issue in opentelemetry-go for:

So probably the main missing feature is to support changing the default aggregations for some instruments like for distributions we don't want just minmaxsumcount we want a histogram.

@bogdandrutu
Copy link
Member

I think the missing thing is this otep open-telemetry/oteps#89

@tigrannajaryan tigrannajaryan added the help wanted Good issue for contributors to OpenTelemetry Service to pick up label Apr 22, 2020
@mtwo mtwo added the release:required-for-ga Must be resolved before GA release label Jul 29, 2020
@mtwo mtwo added this to the GA 1.0 milestone Jul 29, 2020
@tigrannajaryan
Copy link
Member

@bogdandrutu I think I remember some recent work that you did on this and changed telemetry to use Otel API instead of OpenCensus. Is that correct? If so I can close this.

@tigrannajaryan tigrannajaryan modified the milestones: GA 1.0, Backlog Oct 2, 2020
@nilebox
Copy link
Member

nilebox commented Oct 6, 2020

Collector is still using OpenCensus in its self-observability pipeline:

"go.opencensus.io/stats/view"

There is #1833 switching traces to Otel API (not merged yet), but that still doesn't include metrics.

@mxiamxia
Copy link
Member

mxiamxia commented May 11, 2021

The scope to resolve this issue is vague. Are we planing to only remove the OC metric reference in public API for unblocking Tracing GA? Because it's hard to replace all OC metrics in all those metrics components in core and contrib repos before tracing GA timeframe.

Need help to elaborate the exit criteria/scope on this issue? And we can contribute for the changes :)

@alolita
Copy link
Member

alolita commented May 12, 2021

@bogdandrutu since you mentioned that this issue is not a showstopper can it be moved to a post-GA phase 3 backlog?

@alolita
Copy link
Member

alolita commented May 18, 2021

@bogdandrutu: We will evaluate all the core components for any public API dependence on OC metrics references and respond on issue.

@alolita alolita added the discussion-needed Community discussion needed label May 18, 2021
@mxiamxia
Copy link
Member

Discussed it in SIG with @tigrannajaryan and @bogdandrutu , this issue is currently blocked by open-telemetry/oteps#89.

@bogdandrutu
Copy link
Member

Block by open-telemetry/opentelemetry-go#2204

@yurishkuro
Copy link
Member

@bogdandrutu open-telemetry/opentelemetry-go#2204 is about OpenCensus bridge, why is it a concern? I think the path would be to rewrite instrumentation to natively talk to OTEL-Go API, not to continue going through OC API.

@fatsheep9146
Copy link
Contributor

@bogdandrutu open-telemetry/opentelemetry-go#2204 is about OpenCensus bridge, why is it a concern? I think the path would be to rewrite instrumentation to natively talk to OTEL-Go API, not to continue going through OC API.

@yurishkuro

Since @dashpole provided a mature propose that the metrics migration would use OpenCensus bridge for OpenTelemetry, like trace.
#3816 (review)

As an alternative, I would propose that we use both opencensus and opentelemetry client libraries at the same time while we migrate. We can use the OpenCensus bridge for OpenTelemetry to make both opencensus and opentelemetry clients serve metrics on the same prometheus endpoint.
We could roll this out with the following steps:

  • Switch from the OpenCensus prometheus exporter to a "wrapped" OpenTelemetry prometheus exporter. Verify that self obs metrics on the prometheus endpoint remain the same.
  • Add the OpenTelemetry API/SDK and use the same prometheus exporter that OC uses.
  • Migrate each metric from OC to OTel individually, and verify that the metric doesn't change.

But this propose is blocked by issue #2204, the prometheus exporter does not satisfy the requirement for OpenCensus metric bridge for OpenTelemetry, which is described in https://pkg.go.dev/go.opentelemetry.io/otel/bridge/[email protected]#readme-metrics

@jpkrohling
Copy link
Member

We discussed this a couple of SIG calls ago, and @jaronoff97 was set to check what's our current state by working on a couple of components first and reporting back.

@jaronoff97
Copy link
Contributor

@jpkrohling Our team is currently looking in to this. @paivagustavo has been working on what a migration may look like and has already confirmed that we can provide the same prometheus metrics (and histogram buckets) with the latest Go SDK release. I'll let Gustavo provide some more details.

@paivagustavo
Copy link
Member

I've been working on testing the latest released otel go sdk and verifying that it would met the requirements here. As part of this I've instrumented the existing receiver metrics from obsreport.Receiver(#6222) and the loadbalancingexporter from contrib, more specifically testing that we could use different histogram buckets for specific metrics. Using the views api, we are able to accomplish that.

About the bridge, as it is in review at the moment (open-telemetry/opentelemetry-go#3207), I would suggest us to start to double instrumenting the collector with otel go in the meantime. If the bridge is merged and released before we finish migrating the current metrics, we can enable it and start removing oc metrics that have already being instrumented with otel-go.

hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this issue Apr 27, 2023
…ry#816)

Bumps [docker](https://github.com/docker/docker-py) from 5.0.2 to 5.0.3.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](docker/docker-py@5.0.2...5.0.3)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@codeboten codeboten self-assigned this Dec 1, 2023
codeboten pushed a commit that referenced this issue Jan 16, 2024
This marks the flag as stable. Leaving this as a draft until v0.92.0 is
released.

Closes #8962
Fixes #816

---------

Signed-off-by: Alex Boten <[email protected]>
dmitryax pushed a commit to open-telemetry/opentelemetry-collector-contrib that referenced this issue Jan 29, 2024
**Description:** 
This enables `goleak` checks for all packages whose only failure is due
to [a known
issue](census-instrumentation/opencensus-go#1191)
within the `opencensus-go` package. I believe it's safe to ignore for
two main reasons:
1. All references that I'm aware of are indirect, the contrib repo isn't
directly using the stat's worker functionality. The goroutine is started
in the dependency's `init()` so it's not something that we're causing.
2. We're actively moving away from the opencensus dependency, and hope
to remove it entirely soon. Relevant issue:
open-telemetry/opentelemetry-collector#816.
Once this work is complete the ignore will no longer be necessary and we
can safely remove it.

The only changes in this PR are test related. The same file is being
copied into different packages, and then we run `make gotidy`. No
functionality has been changed.

**Link to tracking Issue:** #30438
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this issue Feb 1, 2024
…try#30457)

**Description:** 
This enables `goleak` checks for all packages whose only failure is due
to [a known
issue](census-instrumentation/opencensus-go#1191)
within the `opencensus-go` package. I believe it's safe to ignore for
two main reasons:
1. All references that I'm aware of are indirect, the contrib repo isn't
directly using the stat's worker functionality. The goroutine is started
in the dependency's `init()` so it's not something that we're causing.
2. We're actively moving away from the opencensus dependency, and hope
to remove it entirely soon. Relevant issue:
open-telemetry/opentelemetry-collector#816.
Once this work is complete the ignore will no longer be necessary and we
can safely remove it.

The only changes in this PR are test related. The same file is being
copied into different packages, and then we run `make gotidy`. No
functionality has been changed.

**Link to tracking Issue:** open-telemetry#30438
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Good issue for contributors to OpenTelemetry Service to pick up release:required-for-ga Must be resolved before GA release
Projects
None yet
Development

Successfully merging a pull request may close this issue.