Skip to content
This repository has been archived by the owner on Oct 3, 2023. It is now read-only.

Allow any unicode characters in the TagValue #71

Merged

Conversation

semistrict
Copy link
Contributor

Prometheus permits any characters:
https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels

In gRPC propagation, the tags are binary encoded as UTF-8 strings,
so there should be no problem there. For HTTP, we can use mime
encoding.

@semistrict
Copy link
Contributor Author

I feel that we are being unreasonably US-centric in restricting tag values to latin characters. It is very easy to imagine useful tags that contain non-latin character, for example the names of Chinese metro areas. We should not force people to latinize all these just to use OpenCensus.

@sebright
Copy link
Contributor

/cc @dinooliva

@semistrict semistrict force-pushed the tag-value-chars branch 2 times, most recently from 2935694 to d59e999 Compare March 19, 2018 16:53
@bogdandrutu
Copy link
Contributor

I would like to hear what others think about this. I feel this should be fine.

@bogdandrutu
Copy link
Contributor

Fixes #53.

Copy link
Contributor

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@odeke-em
Copy link
Member

Might need a rebase.

@bogdandrutu
Copy link
Contributor

@Ramonza I would like to know if the set of characters that you propose here are accepted by few of the major backends that we try to support:

  • Prometheus - which does
  • SignalFx - ???
  • Stackdriver - ???

Copy link

@isturdy isturdy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not find a conclusive answer for Stackdriver labels (it just says "A variable-length string" in the proto), but filters explicitly allow UTF-8 strings. Empirically, writing a UTF-8 label value and reading it back works correctly.

@sebright
Copy link
Contributor

/cc @g-easy

@songy23
Copy link
Contributor

songy23 commented Mar 26, 2018

@bogdandrutu
FYI Stackdriver Monitoring allows label values in unicode for TimeSeries:
sd-1
sd-2

However, those label values are not selectable on their UI:
sd-3

Prometheus permits any characters:
https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels

In gRPC propagation, the tags are binary encoded as UTF-8 strings,
so there should be no problem there. For HTTP, we can use mime
encoding.
@bogdandrutu
Copy link
Contributor

I am fine with this as long as we document in all the languages that some backends may have different restrictions and users needs to be aware of this limitation.

semistrict pushed a commit to semistrict/opencensus-go that referenced this pull request Mar 26, 2018
semistrict pushed a commit to semistrict/opencensus-go that referenced this pull request Mar 26, 2018
semistrict pushed a commit to semistrict/opencensus-go that referenced this pull request Mar 26, 2018
@semistrict semistrict merged commit 213bba8 into census-instrumentation:master Mar 26, 2018
@dinooliva
Copy link

Pragmatically, we didn't allow this because of compatibility issues internal to google and because it allows fast encoding/decoding to languages that don't natively use UTF8 (e.g. Java).

More generally, we've had discussions about making TagValue an uninterpreted sequence of bytes, which probably makes more sense. Making this change requires some rollout effort so better to design the solution and do it once.

@semistrict
Copy link
Contributor Author

@dinooliva I think it would make for very strange APIs for tag values to be uninterpreted bytes. perhaps we should just remove the statement about length limit and instead of it being an error allow the user to truncate the values at the point of serializing them (when it really matters)? cc @bogdandrutu

semistrict pushed a commit to semistrict/opencensus-go that referenced this pull request Mar 27, 2018
semistrict pushed a commit to census-instrumentation/opencensus-go that referenced this pull request Mar 27, 2018
songy23 added a commit to songy23/opencensus-specs that referenced this pull request Apr 6, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants