Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Rename metrics name to comply with feature flag semconv #712

Closed
Kavindu-Dodan opened this issue Jun 19, 2023 · 15 comments · Fixed by #730
Closed

[FEATURE] Rename metrics name to comply with feature flag semconv #712

Kavindu-Dodan opened this issue Jun 19, 2023 · 15 comments · Fixed by #730
Assignees
Labels
enhancement New feature or request

Comments

@Kavindu-Dodan
Copy link
Contributor

Kavindu-Dodan commented Jun 19, 2023

Requirements

Background

flagd comes with out of the box metrics support [1]. It currently expose the following metric names,

  • http_request_duration_seconds
  • http_response_size_bytes
  • impressions
  • reasons

While these metrics serve a good purpose, the names do not comply with semantic conventions. For comparison, see how metrics are named for the metrics hook [2] (ex:- feature_flag.evaluation_requests_total, feature_flag.evaluation_success_total)

Proposal (updated 27/06/2023)

This issue focuses on renaming flagd metrics to comply with semantic conventions.

Common rules

  • Instrumentation Scope: flagd
  • Version: current flagd version
  • Feature flag specific metric name must start with feature_flag prefix so that monitoring systems can aggregate feature flag related metrics
  • Feature flag specific metric name must contain flagd phrase so that it is clear the metric belongs to flagd system

With this background, consider the following proposals,

Old New Unit Refer
http_request_duration_seconds http.server.duration S [3]
http_response_size_bytes http.server.response.size By [4]
impressions feature_flag.flagd.impression {impression} [5]
reasons feature_flag.flagd.evaluation.reason {reason} [5]

[1] - https://github.com/open-feature/flagd/blob/main/core/pkg/telemetry/metrics.go
[2] - https://github.com/open-feature/go-sdk-contrib/blob/main/hooks/open-telemetry/README.md#metric-hook
[3] - https://github.com/open-telemetry/semantic-conventions/blob/bef5a68c2f580e48334bfb2a118345808b2e2125/specification/metrics/semantic_conventions/http-metrics.md#metric-httpserverduration
[4] - https://github.com/open-telemetry/semantic-conventions/blob/bef5a68c2f580e48334bfb2a118345808b2e2125/specification/metrics/semantic_conventions/http-metrics.md#metric-httpserverrequestsize
[5] - https://github.com/open-telemetry/semantic-conventions/tree/bef5a68c2f580e48334bfb2a118345808b2e2125/specification/metrics/semantic_conventions#pluralization

@Kavindu-Dodan Kavindu-Dodan added enhancement New feature or request Needs Triage This issue needs to be investigated by a maintainer labels Jun 19, 2023
@Kavindu-Dodan
Copy link
Contributor Author

@thisthat @toddbaert @beeme1mr appreciate feedback on this proposal

@thisthat
Copy link
Member

Great summary @Kavindu-Dodan - the proposal makes a lot of sense. However, OTel suggests adding count to the namespace for pluralization, e.g., feature_flag.flagd.impression.count. This would require also to adapt the hooks tho 🤔

@pirgeo
Copy link

pirgeo commented Jun 20, 2023

Hi! Here are a few thoughts, mainly looking at the spec you linked above:

  • This is not a "hard" requirement, but personal taste: I think I prefer featureflag. over feature_flag., but I couldn't find any guidance in the Otel spec. All the other top-level namespaces are a single word (or even abbreviations, like db.).
  • For HTTP, the semantic conventions use the dot instead of the underscore, e.g., http.server.response.size. Maybe moving to feature_flag.flagd.http.response.size would be more in line with existing semantic conventions.
  • Here is a PR that intends to only add .count for UpDownCounters. Not sure if the counters in question are UpDownCounters or "just up" Counters, but it might be worth following that guidance there. The fact that the type of the metric is "Count" already lets us determine that this metric is a counter, we don't necessarily need to pack it into the name. The whole point of this discussion was Prometheus compatibility, if you add .count out of the box you might end up with _count_total for the Prometheus exporter, which is... not ideal.

@beeme1mr
Copy link
Member

Thanks @pirgeo! I would prefer to use feature_flag as a prefix to match what has been defined in the semantic convention for traces but I agree with the other points.

@pirgeo
Copy link

pirgeo commented Jun 21, 2023

Sounds good! I neither found any guidance on the spec nor do I have a strong opinion on this one.

@joaopgrassi
Copy link

Apart from what @pirgeo already mentioned, here are my suggestions:

About the units: It's a bit hidden but there's some extra guidance here: https://github.com/open-telemetry/semantic-conventions/blob/main/specification/metrics/semantic_conventions/README.md#instrument-units

  • feature_flag.flagd.http_request_time: Should be s
  • feature_flag.flagd.impression_count and feature_flag.flagd.evaluation_reason_count: I think those are under the "default" unit 1:

Instruments for utilization metrics (that measure the fraction out of a total) are dimensionless and SHOULD use the default unit 1 (the unity).

If you want to capture more meaning, you could use {impression} and {reason} as stated in the spec:

Instruments that measure an integer count of something SHOULD only use annotations with curly braces to give additional meaning without the leading default unit (1). For example, use {packet}, {error}, {fault}, etc.

feature_flag.flagd.http_request_time

Is there a reason why the metric changed to "request_time"? I think the old name actually aligned better with the current HTTP semconv, such as http.server.duration.

@Kavindu-Dodan
Copy link
Contributor Author

Kavindu-Dodan commented Jun 21, 2023

@thisthat @pirgeo @beeme1mr & @joaopgrassi for the feedback Thank you very much for taking the time to give your opinions.

The fact that the type of the metric is "Count" already lets us determine that this metric is a counter, we don't necessarily need to pack it into the name

Thank you for this input

Is there a reason why the metric changed to "request_time"? I think the old name actually aligned better with the current HTTP semconv, such as http.server.duration.

Should be an oversight and the original description is meant to be changed.

If you want to capture more meaning, you could use {impression} and {reason} as stated in the spec:

Good point. I assume this allows us to remove the count suffix


To summarise, can we agree on the following updated proposal?

A new addition to the common rule,

  • Adopt dot(.) notation instead of underscore (_)

And updated metrics chart,

Old New Unit
http_request_duration_seconds feature_flag.flagd.http.request.duration s
http_response_size_bytes feature_flag.flagd.http.response.size By
impressions feature_flag.flagd.impression {impression}
reasons feature_flag.flagd.evaluation.reason {reason}

Let me know your thoughts and if we agree, I will update the issue description and continue with the implementation.

@beeme1mr
Copy link
Member

Are the metrics feature_flag.flagd.http.request.duration and feature_flag.flagd.http.response.size only captured during flag evaluation? If so, then this looks good to me.

@pirgeo
Copy link

pirgeo commented Jun 22, 2023

Just for clarification: Do feature_flag.flagd.http.request.duration and feature_flag.flagd.http.response.size refer to the requests that flagd sends out in order to get feature flag settings?

Will the feature_flag.flagd.evaluation.reason have an attribute that states the actual reason? Like: feature_flag.flagd.evaluation.reason with an attribute reason=user-changed-flag or something? In that case, I wonder if dropping the reason from the metric name would be a good idea, since you are basically counting evaluations, and can then split by reason. With how it is currently, you are counting reasons. This might be intended, but I know too little about the underlying data. If the reason is something that is highly changeable and for a user to set, keeping it as is would be good. If there is a predefined set of valid reasons, going with the dimension might be a better idea (but again: take it with a pinch of salt, my feature flag knowledge is shallow).

Finally: What are the Otel datatypes of these? I assume histograms for the first two, and counts for the last two? Both of these should only ever go up, I assume.

@joaopgrassi
Copy link

joaopgrassi commented Jun 22, 2023

feature_flag.flagd.impression -> I think this can now be pluralized feature_flag.flagd.impressions, since you have the unit {impression} From the spec:

Metric names SHOULD NOT be pluralized, unless the value being recorded represents discrete instances of a countable quantity. Generally, the name SHOULD be pluralized only if the unit of the metric in question is a non-unit (like {fault} or {operation}).

Same for the other (depending on how you address @pirgeo comment above)

About the request.duration and response.size I'm not sure if we need to redefine them under the feature_flag namespace. Couldn't Flagd simply adopt the HTTP semantic conventions for these, since they are basically the same? That's what HTTP client instrumentations for example do.

@Kavindu-Dodan
Copy link
Contributor Author

Just for clarification: Do feature_flag.flagd.http.request.duration and feature_flag.flagd.http.response.size refer to the requests that flagd sends out in order to get feature flag settings?

No, this correlates to the flag evaluation time of flagd & the size of the payload of the evaluation

And regarding reason vs impression, I think they have value on their own. impression represents successful falg evaluations where reason represent anything (both success or error) situations. Besides, OpenFeature define a set of reasons to correlate with this metric [1].

What are the Otel datatypes of these? I assume histograms for the first two, and counts for the last two? Both of these should only ever go up, I assume.

Yes, this is correct. http dureation & response size use histograms and reason & impression use counters

feature_flag.flagd.impression -> I think this can now be pluralized feature_flag.flagd.impressions, since you have the unit {impression} From the spec:

Thank you, I will update my proposal

About the request.duration and response.size I'm not sure if we need to redefine them under the feature_flag namespace. Couldn't Flagd simply adopt the HTTP semantic conventions for these, since they are basically the same? That's what HTTP client instrumentations for example do.

Good suggesion, are you proposing to use flagd namespace and use something like flagd.http.server.duration ? If this is semconv compliant, I think we can choose this naming convension.

[1] - https://github.com/open-feature/spec/blob/main/specification/types.md#resolution-details

@joaopgrassi
Copy link

joaopgrassi commented Jun 23, 2023

Good suggesion, are you proposing to use flagd namespace and use something like flagd.http.server.duration ? If this is semconv compliant, I think we can choose this naming convension.

Ah no, I'm proposing that Flagd re-uses the HTTP metrics defined in https://github.com/open-telemetry/semantic-conventions/blob/main/specification/metrics/semantic_conventions/http-metrics.md#semantic-conventions-for-http-metrics.

Not sure if that would be client or server but the goal is to use these metrics for all HTTP communications. So, flagd would emit http.server.duration and http.server.response.size. That is what all other http servers will be reporting in their own implementations. For ex, web frameworks (Spring, ASP.NET, Gin, Laravel, etc) would report these.

Essentially then you end up with only flagd specifc metrics such as impressions and reasons

@Kavindu-Dodan
Copy link
Contributor Author

Not sure if that would be client or server but the goal is to use these metrics for all HTTP communications. So, flagd would emit http.server.duration and http.server.response.size

Wouldn't this create metric name conflicts at OTel backend (ex:- Prometheus) 🤔 Or do they differentiate metrics based on service naming and other attributes ?

@joaopgrassi
Copy link

joaopgrassi commented Jun 26, 2023

When converting from OTLP to Prometheus, the spec offers guidance that should help in this case. Instrumentation scope and version become labels while resource attributes (e.g., service.name) become this target_info metric

So it should work just as fine - they will split by say instrumentation scope = flagd and get the metrics they are after. @pirgeo maybe you have more input here, in case I missed something?

For Dynatrace for example, instrumentation scope+version and a few resource attributes (like service.name) are also added automatically to all ingested metrics as dimensions.

@Kavindu-Dodan
Copy link
Contributor Author

Kavindu-Dodan commented Jun 27, 2023

@joaopgrassi thank you for the explanation. I think my doubt is clear and I learned a few more OTel details thanks to this discussion.

For the reference of others, Instrumentation Scope [1] is already set to flagd and this is done at the meter setup phase [2]. We can further add the version as part of this improvement.


With this agreement and understanding, let's agree on the following metric naming & unit changes,

  • Instrumentation Scope: flagd
  • Version: current flagd version
Old New Unit
http_request_duration_seconds http.server.duration s
http_response_size_bytes http.server.response.size By
impressions feature_flag.flagd.impression {impression}
reasons feature_flag.flagd.evaluation.reason {reason}

[1] - https://opentelemetry.io/docs/specs/otel/glossary/#instrumentation-scope
[2] - https://github.com/open-feature/flagd/blob/main/core/pkg/telemetry/metrics.go#L126

@Kavindu-Dodan Kavindu-Dodan self-assigned this Jun 29, 2023
@Kavindu-Dodan Kavindu-Dodan removed the Needs Triage This issue needs to be investigated by a maintainer label Jun 29, 2023
Kavindu-Dodan added a commit that referenced this issue Jul 5, 2023
## This PR

Fixes #712 

Refer the following table for metrics renames and their units,


 |  Old | New  | Unit |
|---|---|---|
| http_request_duration_seconds  |  http.server.duration | s |
| http_response_size_bytes | http.server.response.size | By |
| http_requests_inflight |http.server.active_requests | {request} |
| impressions | feature_flag.flagd.impression| {impression} | 
| reasons | feature_flag.flagd.evaluation.reason| {reason} |

### Validation

This change was validated against an OTel setup. Consider following
screen captures,

![Screenshot 2023-07-04 at 1 16 02
PM](https://github.com/open-feature/flagd/assets/8186721/5aba3103-e3fc-42a7-a396-c78999d22b05)

![Screenshot 2023-07-04 at 1 15 41
PM](https://github.com/open-feature/flagd/assets/8186721/0ea196f8-fcb4-4e8c-86eb-c6774de3e2f3)

![Screenshot 2023-07-04 at 1 14 53
PM](https://github.com/open-feature/flagd/assets/8186721/c285d78d-525a-4e40-bfe2-895a85d96cd6)

![Screenshot 2023-07-04 at 1 14 40
PM](https://github.com/open-feature/flagd/assets/8186721/e826ed8c-6056-4fd4-b206-d76592d8d484)

![Screenshot 2023-07-04 at 1 14 18
PM](https://github.com/open-feature/flagd/assets/8186721/30c959af-197b-405c-b303-c4dd59807311)

Signed-off-by: Kavindu Dodanduwa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants