Add gauge for max exporter queue capacity #6513

Open
swar8080 opened this issue Jun 10, 2024 · 5 comments
Labels
Feature Request Suggest an idea for this project

Comments

@swar8080

Is your feature request related to a problem? Please describe.
Hello, I'm looking to monitor when the exporter queue size gets close to its maximum capacity. Currently I'm hardcoding the max queue size in a query like queueSize / 2048. We may increase that limit for some applications in the future, so it'd be nice to avoid hardcoding the size limit and use a query like queueSize / maxQueueSize instead.
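
The ratio described above can be sketched in Java as follows. This is only an illustration of the arithmetic, not SDK code: the class QueueUtilization, its method names, and the 0.8 alert threshold are hypothetical and chosen for the example. Only the 2048 limit comes from the query above.

```java
// Illustrative sketch: these names are hypothetical and not part of the OpenTelemetry SDK.
final class QueueUtilization {
    private QueueUtilization() {}

    /** Fraction of the export queue in use, e.g. 0.5 when the queue is half full. */
    static double ratio(long queueSize, long maxQueueSize) {
        if (maxQueueSize <= 0) {
            throw new IllegalArgumentException("maxQueueSize must be positive");
        }
        return (double) queueSize / maxQueueSize;
    }

    /** True once utilization reaches the given threshold (a value between 0.0 and 1.0). */
    static boolean shouldAlert(long queueSize, long maxQueueSize, double threshold) {
        return ratio(queueSize, maxQueueSize) >= threshold;
    }

    public static void main(String[] args) {
        // 2048 is the limit hardcoded in the query above; 0.8 is an assumed threshold.
        System.out.println(ratio(1024, 2048));
        System.out.println(shouldAlert(1900, 2048, 0.8));
    }
}
```

With a reported maximum, the second argument would come from a metric rather than a constant, so one alert rule could cover applications with different limits.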

Describe the solution you'd like
Add a gauge that reports the configured maximum export queue size.

Describe alternatives you've considered
Keeping the configured maximum in sync between applications and alerts.

Additional context
I'm monitoring this on behalf of all application teams, so it'd be nice to have a single alert configuration that works for all applications.

swar8080 added the Feature Request label on Jun 10, 2024
@jkwatson
Contributor

Since this is a constant for the lifetime of a VM, could you simply create the gauge yourself at the moment when you're configuring the SDK?
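
As a rough sketch of the first half of that suggestion, the value such a gauge would report could be resolved once at startup. This assumes the batch span processor is configured through the standard OTEL_BSP_MAX_QUEUE_SIZE environment variable or the Java autoconfigure property otel.bsp.max.queue.size, with the default of 2048; it would not apply if the limit is set programmatically. The class MaxQueueSizeResolver is hypothetical, falling back to the default on invalid input is a simplification made for this example, and registering the gauge itself through the OpenTelemetry metrics API is not shown.

```java
import java.util.Map;
import java.util.Properties;

// Illustrative sketch: MaxQueueSizeResolver is a hypothetical helper, not an SDK class.
final class MaxQueueSizeResolver {
    // Default batch span processor queue size, matching the 2048 used in the issue.
    static final int DEFAULT_MAX_QUEUE_SIZE = 2048;

    private MaxQueueSizeResolver() {}

    /** Resolves the configured limit; system properties take priority over environment variables. */
    static int resolve(Properties systemProps, Map<String, String> env) {
        String raw = systemProps.getProperty("otel.bsp.max.queue.size");
        if (raw == null) {
            raw = env.get("OTEL_BSP_MAX_QUEUE_SIZE");
        }
        if (raw == null) {
            return DEFAULT_MAX_QUEUE_SIZE;
        }
        try {
            int parsed = Integer.parseInt(raw.trim());
            // Simplification for this sketch: non-positive values fall back to the default.
            return parsed > 0 ? parsed : DEFAULT_MAX_QUEUE_SIZE;
        } catch (NumberFormatException e) {
            // Simplification for this sketch: unparsable values fall back to the default.
            return DEFAULT_MAX_QUEUE_SIZE;
        }
    }

    public static void main(String[] args) {
        // This constant value is what a "max queue size" gauge would report.
        System.out.println(resolve(System.getProperties(), System.getenv()));
    }
}
```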

@swar8080
Author

We could, but it would be nice if the teams I support didn't have to make code changes. Implementing this within the SDK means we can use the OpenTelemetry Operator and the Java agent to roll this out to all our applications automatically. I'm assuming we'd just have to update the version of the Java agent that the operator injects.

@jkwatson
Contributor

🤔 I don't know whether the metrics we currently emit are official or still experimental. Does the specification require a set of metrics to be exported by the batch span processor? I wrote the original metrics several years ago, and I know there was a desire to make them "official" via the specification, but I've lost track of whether that has been done yet.

swar8080 changed the title from "Add gauge for max exporter queue" to "Add gauge for max exporter queue capacity" on Jun 10, 2024
@trask
Member

trask commented Jun 11, 2024

Hi @swar8080, there's some ongoing work on this, see open-telemetry/semantic-conventions#598.

Assuming that proposal moves forward (it's currently stalled, but I expect it to be picked back up at some point), I think you could instead alert on the metric otelcol.processor.items where otel.outcome=queue_full, if I understand your requirement correctly?

@swar8080
Author

Thanks @trask, I'll keep an eye on the SDK metric convention.

Ideally we could alert when the export queue is close to filling up, so that we can hopefully increase the resources or the limit before drops happen.

Maybe we could also do this with an agent extension, now that the OpenTelemetry Operator allows injecting extensions without any code changes for service owners.

3 participants