Support percentiles for aggregated metrics #1226

Closed
andyvig opened this issue Sep 30, 2019 · 17 comments

@andyvig

andyvig commented Sep 30, 2019

Per this StackOverflow answer, it’s not possible to do percentiles on aggregated metrics sent through AppInsights.
https://stackoverflow.com/questions/58124268/how-to-do-percentiles-on-custom-metrics-in-azure-appinsights

The request is to support this in some form, since it seems like a significant miss relative to other platforms like Prometheus.
Is there any workaround other than sending telemetry for every metric measurement (since that won’t scale at all)?

I would love not to have to set up Prometheus/Grafana infrastructure to support this. Thanks!

@cijothomas cijothomas added this to the Future milestone Sep 30, 2019
@cijothomas
Contributor

@vgorbenko Is this something on the Metrics roadmap?

@andyvig
Author

andyvig commented Oct 14, 2019

Any indication of how close this might be?
For high-volume scenarios it makes AppInsights unusable for metrics (since simple averages won't cut it for production monitoring). If there's a solution AppInsights provides here that I'm missing please let me know (our plan is to track aggregate metrics for billions of events/day).
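
To illustrate why averages alone fall short here, the following minimal Python sketch compares two made-up latency samples. The values and the `percentile` helper are illustrative assumptions, not AppInsights data or APIs. Both samples have the same count and mean, yet their 99th percentiles differ by an order of magnitude.

```python
import statistics


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


# Two hypothetical latency samples in milliseconds.
steady = [100] * 100               # every request takes 100 ms
spiky = [80] * 98 + [1080] * 2     # most are fast, 2% are very slow

for name, sample in (("steady", steady), ("spiky", spiky)):
    print(name, "mean:", statistics.mean(sample), "p99:", percentile(sample, 99))
```

An aggregate that carries only count, sum, min, and max cannot tell these two cases apart at the 99th percentile, which is the gap this issue is about.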

@cijothomas
Contributor

@andyvig This is not planned for 2019. I will check and report back on the plan for the next semester (2020).
I also know that it's possible for you to write a custom aggregator and plug it into the rest of the metrics pipeline if you want to do percentiles. It's not documented, but if you want to take a look, here's where to start looking:
https://github.com/microsoft/ApplicationInsights-dotnet/blob/develop/src/Microsoft.ApplicationInsights/Metrics/Extensibility/MetricSeriesAggregatorBase.cs

@andyvig
Author

andyvig commented Oct 14, 2019

Thanks @cijothomas, how would we then query that on the Log Analytics side? Does the percentile function support aggregate data?
I'm looking for something similar to this operation in Prometheus:
"To calculate the 90th percentile of request durations over the last 10m"
histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))
From https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile
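
For readers unfamiliar with that function, here is a rough Python sketch of the estimation it performs on cumulative buckets: find the bucket containing the target rank, then interpolate linearly inside it. The function name mirrors the PromQL one, but the signature, the bucket layout, and the sample counts are assumptions made for illustration. The sketch takes raw counts directly and skips the `rate(...)` step.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound and ending with (math.inf, total_count).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("buckets must end with a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if idx == len(buckets) - 1:
        # The rank falls in the +Inf bucket, so report the highest finite bound.
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    upper, count = buckets[idx]
    lower, prev_count = (0.0, 0) if idx == 0 else buckets[idx - 1]
    # Assume observations are spread uniformly within the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


# Hypothetical request durations in seconds: (upper bound, cumulative count).
example_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
print(histogram_quantile(0.9, example_buckets))
```

Because only bucket counts are stored, histograms from many instances can be summed before estimating a quantile, which is what makes this approach workable for pre-aggregated data.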

@cijothomas
Contributor

I don't think there is any native support, as the schema doesn't have anything for storing percentiles:
https://github.com/microsoft/ApplicationInsights-dotnet/blob/develop/src/Microsoft.ApplicationInsights/Extensibility/Implementation/External/DataPoint_types.cs

You'd need to store quantiles as customProps and write custom queries to get them, as Analytics won't understand customProps.

@SergeyKanzhelev Even if one authors their own aggregator, is there any way to store quantiles (.1, .5, .9, etc.) in the schema?
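
The custom-properties workaround mentioned above could look roughly like the sketch below. It is written in plain Python to stay language-neutral and is not the SDK's aggregator API: `QuantileWindowAggregator`, its methods, and the `p10`/`p50`/`p90` property names are all hypothetical. It buffers raw values for one aggregation window, then emits a single record with the usual aggregates plus quantiles as string-valued properties.

```python
class QuantileWindowAggregator:
    """Hypothetical per-window aggregator; not part of any SDK."""

    def __init__(self, name, percentiles=(10, 50, 90)):
        self.name = name
        self.percentiles = percentiles  # integer percentiles to report
        self._values = []

    def track_value(self, value):
        self._values.append(float(value))

    def flush(self):
        """Return one summary record for the window and reset the buffer."""
        values, self._values = sorted(self._values), []
        if not values:
            return None
        properties = {}
        for pct in self.percentiles:
            # Nearest-rank percentile using an integer ceiling.
            rank = max(1, -(-pct * len(values) // 100))
            # Stored as strings, so a query has to cast them before use.
            properties[f"p{pct}"] = str(values[rank - 1])
        return {
            "name": self.name,
            "count": len(values),
            "sum": sum(values),
            "min": values[0],
            "max": values[-1],
            "properties": properties,
        }


aggregator = QuantileWindowAggregator("latency_ms")
for latency in range(1, 11):
    aggregator.track_value(latency)
record = aggregator.flush()
print(record["properties"])
```

One caveat with this design: quantiles computed per instance and per window cannot be combined into a correct global quantile afterwards, so the approach only helps when per-instance percentiles are acceptable.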

@RicardoNiepel

Any news / roadmap item / documentation / customer guidance on

  • publishing metrics as histograms to AppInsights
  • with the goal of using percentiles in Queries/Views/Alerts

to make AppInsights a good fit for SLOs?

@cijothomas
Contributor

No work is planned to add support for this in ApplicationInsights SDK.

Metrics support in OpenTelemetry is coming by the end of 2021 (Nov 2021) - open-telemetry/opentelemetry-dotnet#1501.
After the OpenTelemetry part is shipped, there will be a supported way to export metrics to ApplicationInsights, but there are no solid dates for this. There is also no solid date for supporting percentiles/histograms in ApplicationInsights.

@reyang reyang removed this from the Future milestone Mar 9, 2021
@github-actions

github-actions bot commented Jan 4, 2022

This issue is stale because it has been open 300 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Jan 4, 2022
@andyvig
Author

andyvig commented Jan 4, 2022

@cijothomas Checking in here...
Is this still a "maybe sometime in the future but all dates unknown" situation, or is there any more definition around if/when this might be supported? Thanks.

@cijothomas
Contributor

No firm dates that I can share (the feature requires not just SDK support, but also backend, UI, and other changes). From the SDK side, this will likely come via the OpenTelemetry route, and not from this repo.

@github-actions github-actions bot removed the stale label Jan 5, 2022
@github-actions

github-actions bot commented Nov 1, 2022

This issue is stale because it has been open 300 days with no activity. Remove stale label or this will be closed in 7 days. Commenting will instruct the bot to automatically remove the label.

@github-actions github-actions bot added the stale label Nov 1, 2022
@RicardoNiepel

We still don't have a clear statement on if and how this will come.

If this feature is not coming in the Azure Monitor / AppInsights backend and various SDKs, there should be some published guidance on how these technologies could be used if someone wants to follow SRE best practices:

@mattmccleary
Member

mattmccleary commented Nov 2, 2022

Hi Ricardo, Azure Managed Prometheus (Preview) was announced last month and is available with Azure Managed Grafana integration. This is compatible with Prom Client.

https://learn.microsoft.com/azure/azure-monitor/essentials/prometheus-metrics-overview

Additionally we are working on supporting percentiles via the OpenTelemetry histogram API. Unfortunately this work requires some major changes in how our backend works and thus any release is likely 6+ months out.

CC: @vishiy

@github-actions github-actions bot removed the stale label Nov 3, 2022
@RicardoNiepel

Thanks a lot for the clarification and the details around workarounds and other possibilities.

@github-actions

This issue is stale because it has been open 300 days with no activity. Remove stale label or this will be closed in 7 days. Commenting will instruct the bot to automatically remove the label.

@github-actions github-actions bot added the stale label Aug 31, 2023
@github-actions github-actions bot closed this as not planned Sep 8, 2023
@RicardoNiepel

@mattmccleary can you provide an update on this? Thx

@dennis-yemelyanov

@mattmccleary any update on percentile tracking support?

I'm trying to use 'Azure.Monitor.OpenTelemetry.Exporter' to collect and report on our application latency. It looks like it currently tracks things like the max value, but that's not very useful for practical purposes, since the max value can be influenced by a lot of external factors and doesn't necessarily provide an accurate view of how the app is doing. Ideally we want to track the 99th percentile of this latency value, but I can't figure out how to do that or whether it's supported at all.
