prometheusreceiver: possible incorrect metric name for cumulative metrics #13117

newly12 · 2022-08-09T08:37:50Z

Describe the bug
given metrics like the following, exported metrics will only be envoy_cluster_upstream_rq

# TYPE envoy_cluster_upstream_rq counter
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="local_envoy_admin"} 8130
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="local_service83"} 8265
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="sds_server_uds"} 1
envoy_cluster_upstream_rq{envoy_response_code="503",envoy_cluster_name="sds_server_uds"} 2

# TYPE envoy_cluster_upstream_rq_total counter
envoy_cluster_upstream_rq_total{envoy_cluster_name="local_envoy_admin"} 8131
envoy_cluster_upstream_rq_total{envoy_cluster_name="local_service80"} 0
envoy_cluster_upstream_rq_total{envoy_cluster_name="local_service81"} 0
envoy_cluster_upstream_rq_total{envoy_cluster_name="local_service82"} 0
envoy_cluster_upstream_rq_total{envoy_cluster_name="local_service83"} 8265
envoy_cluster_upstream_rq_total{envoy_cluster_name="sds_server_uds"} 1

when envoy_cluster_upstream_rq appears first, the family name will be envoy_cluster_upstream_rq, normalizeMetricName returns the same metric name, later when envoy_cluster_upstream_rq_total comes, after normalized, it will be envoy_cluster_upstream_rq, thus they will be grouped into the same metric family

Steps to reproduce
test prometheus receiver with above metrics

What did you expect to see?
4 envoy_cluster_upstream_rq metrics and 6 envoy_cluster_upstream_rq_total metrics

What did you see instead?
10 envoy_cluster_upstream_rq metrics.

What version did you use?
Version: (e.g., v0.4.0, 1eb551b, etc)

What config did you use?
Config: (e.g. the yaml config file)

Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

Additional context
Add any other context about the problem here.

The text was updated successfully, but these errors were encountered:

newly12 · 2022-08-10T11:45:55Z

by looking at the code, I understand the need to trim necessary suffix so to group metrics together for histogram & summary, not quite sure why sum/counter type is needed?

github-actions · 2022-08-11T08:13:58Z

Pinging code owners: @Aneurysm9. See Adding Labels via Comments if you do not have permissions to add labels yourself.

bogdandrutu · 2022-09-07T05:21:07Z

@dashpole is this a correct OpenMetrics input? Or a Prometheus? Based on https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#counter-1 we should have something like:

# TYPE envoy_cluster_upstream_rq counter
envoy_cluster_upstream_rq_total{envoy_response_code="200",envoy_cluster_name="local_envoy_admin"} 8130

dashpole · 2022-09-07T15:18:52Z

@bogdandrutu what you pasted is valid for OpenMetrics, but not valid for prometheus. I'm not aware of a way for us to know which format (OM vs prom) the scrape was in, so we have to choose which to handle correctly:

OpenMetrics (only one valid option):

# TYPE envoy_cluster_upstream_rq counter
envoy_cluster_upstream_rq_total{envoy_response_code="200",envoy_cluster_name="local_envoy_admin"} 8130

Vs Prometheus:

# TYPE envoy_cluster_upstream_rq counter
envoy_cluster_upstream_rq{envoy_response_code="200",envoy_cluster_name="local_envoy_admin"} 8130
# TYPE envoy_cluster_upstream_rq_total counter
envoy_cluster_upstream_rq_total{envoy_response_code="200",envoy_cluster_name="local_envoy_admin"} 8130

Note that # TYPE envoy_cluster_upstream_rq counter is valid in prometheus, but is strongly discouraged, as it doesn't abide by Prometheus naming conventions (counters should end in _total). Since we had to choose between supporting # TYPE envoy_cluster_upstream_rq counter in Prometheus and supporting the OpenMetrics naming case, we chose to support the OM naming convention.

bogdandrutu · 2022-09-08T03:20:40Z

Thanks for the explanation, is anyone in the Prometheus ecosystem that can tell us their opinion?

dashpole · 2022-09-08T15:06:57Z

I'll raise it for discussion at the prometheus working group

dashpole · 2022-09-28T15:14:51Z

Discussed at the prometheus WG today:

We can ask for OM vs Prometheus to be passed to appenders. Maybe in the append context? (tag RichiH, codesome)

newly12 added the bug Something isn't working label Aug 9, 2022

newly12 mentioned this issue Aug 11, 2022

[prometheusreceiver] Fix metrics being grouped into the same metrics family incorrectly #13213

Merged

mx-psi added the exporter/prometheus label Aug 11, 2022

dmitryax added the priority:p2 Medium label Aug 12, 2022

codeboten closed this as completed in #13213 Oct 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

prometheusreceiver: possible incorrect metric name for cumulative metrics #13117

prometheusreceiver: possible incorrect metric name for cumulative metrics #13117

newly12 commented Aug 9, 2022 •

edited

Loading

newly12 commented Aug 10, 2022 •

edited

Loading

github-actions bot commented Aug 11, 2022

bogdandrutu commented Sep 7, 2022

dashpole commented Sep 7, 2022

bogdandrutu commented Sep 8, 2022

dashpole commented Sep 8, 2022

dashpole commented Sep 28, 2022

prometheusreceiver: possible incorrect metric name for cumulative metrics #13117

prometheusreceiver: possible incorrect metric name for cumulative metrics #13117

Comments

newly12 commented Aug 9, 2022 • edited Loading

newly12 commented Aug 10, 2022 • edited Loading

github-actions bot commented Aug 11, 2022

bogdandrutu commented Sep 7, 2022

dashpole commented Sep 7, 2022

bogdandrutu commented Sep 8, 2022

dashpole commented Sep 8, 2022

dashpole commented Sep 28, 2022

newly12 commented Aug 9, 2022 •

edited

Loading

newly12 commented Aug 10, 2022 •

edited

Loading