OpenMetrics SparseHistogram/NativeHistograms #237

Open
hdost opened this issue Jan 8, 2022 · 5 comments

Comments

@hdost

hdost commented Jan 8, 2022

I know that sparse histograms are being worked on in the Prometheus project, and that OpenTelemetry has implemented them as a prototype in at least the Java library (open-telemetry/opentelemetry-java#3550). Are there any thoughts on adding them to OpenMetrics? Would that be a v1.1?
There's great value, to be sure. I'm just not sure whether they would simply be transported as histograms with different labels.

@fstab

fstab commented Jan 8, 2022

I have an experimental implementation of Prometheus sparse histograms on my TODO list (I actually started playing with it on my local machine). However, this is currently an experimental feature in Prometheus and not part of the OpenMetrics standard. It does not make sense to make this an official feature of the Java client library when it is not yet an official feature of the Prometheus server.

Oh, sorry, I confused the GitHub repos and thought you were asking about adding sparse histograms to client_java :)

@brian-brazil
Collaborator

Anything like that would be after 1.0.0, though clients would still be required to provide the same data in equivalent 1.0.0 format.

@beorn7
Collaborator

beorn7 commented Feb 15, 2022

Let me piggyback on this issue to document the current state of the Prometheus Sparse Histogram effort and the requirements I see for OpenMetrics.

First of all: While the Prometheus Sparse Histograms are still in flux (still working on the PoC), the remaining exploration work is mostly on the querying side (PromQL handling, JSON representation for the API). On the exposition side, things have stabilized enough so that OpenMetrics can start working on how to integrate Sparse Histograms. No hard guarantees, of course; small details might still change, but the big picture is pretty clear by now.

Prometheus, for its PoC, simply piggybacked on the old protobuf format, since it was easy to extend, and the plumbing was still in place in client_golang. There is no representation for Sparse Histograms in the Prometheus text format (and obviously none in OpenMetrics). (You can request the content type application/vnd.google.protobuf;proto=io.prometheus.client.MetricFamily;encoding=text, though, to get a text rendering of the protobufs.)
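As a rough illustration of the content negotiation mentioned above, the following Python sketch builds (but does not send) a scrape request that asks for the text rendering of the protobufs. Only the content type string comes from the comment above; the target URL and the helper name `build_scrape_request` are assumptions made up for this example.

```python
from urllib.request import Request

# Content type quoted in the comment above: a text rendering of the old
# Prometheus protobuf format.
PROTO_TEXT = (
    "application/vnd.google.protobuf;"
    "proto=io.prometheus.client.MetricFamily;"
    "encoding=text"
)


def build_scrape_request(url: str) -> Request:
    """Build a GET request asking the target for the protobuf text rendering."""
    return Request(url, headers={"Accept": PROTO_TEXT})


# Hypothetical target URL (an assumption, not taken from the thread).
req = build_scrape_request("http://localhost:8080/metrics")
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual scrape. Whether the target honors this `Accept` header depends on the client library it is instrumented with.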

Obviously, this was a pragmatic choice to quickly get to a working PoC. There are no plans to resurrect the old Prometheus protobuf format. Protobuf itself, however, lends itself quite nicely to the encoding of the data-heavy Sparse Histograms. So it should be discussed whether protobuf should and can be leveraged in a similar way in a future OpenMetrics version supporting Sparse Histograms.

Broadly, I see the following issues that need to be addressed:

  • Does the PoC protobuf representation of Sparse Histograms fit into the current OM protobuf spec or are there more involved changes required?
  • Should Sparse Histograms be fully representable in the text version of OM at all? (Alternatively, the text format could feature a simplified version of a Histogram, possibly even in the same way as the old Histograms.)
  • If yes, should the text representation focus on human readability or encoding/decoding/bandwidth efficiency?
  • Sparse Histograms are expected to get extended with other bucket schemas in the future (e.g. linear buckets, custom buckets, log-linear buckets). Those future schemas would have a different schema code (outside of the range that is currently valid, i.e. between -4 and 8) and possibly additional fields (e.g. to define custom bucket boundaries). OM needs some form of compatible extensibility so that it does not require a new major version each time a new bucket schema is added.
  • How to deal with exemplars? (Providing one per bucket will often be way too many due to the high resolution of the Sparse Histograms.)
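To make the schema codes in the list above a bit more concrete, here is a minimal Python sketch. It assumes the exponential bucketing of the Prometheus PoC, where bucket boundaries are powers of `2 ** (2 ** -schema)`, so each increment of the schema code doubles the resolution. The function names are made up for this example, and codes outside -4..8 are rejected, which is the point where a future schema would need different handling.

```python
import math

# Range of currently valid schema codes named in the list above.
MIN_SCHEMA, MAX_SCHEMA = -4, 8


def check_schema(schema: int) -> None:
    """Reject schema codes outside the currently valid exponential range."""
    if not MIN_SCHEMA <= schema <= MAX_SCHEMA:
        raise ValueError(f"unsupported bucket schema code: {schema}")


def upper_bound(schema: int, index: int) -> float:
    """Upper boundary of bucket `index`, computed as 2 ** (index * 2 ** -schema)."""
    check_schema(schema)
    return 2.0 ** (index * 2.0 ** -schema)


def bucket_index(schema: int, value: float) -> int:
    """Index of the bucket holding a positive `value` (upper bound inclusive)."""
    check_schema(schema)
    if value <= 0:
        raise ValueError("this sketch only handles positive observations")
    # Floating-point rounding can misplace values that sit very close to a
    # bucket boundary; a real implementation needs more care here.
    return math.ceil(math.log2(value) * 2.0 ** schema)
```

A decoder written this way fails loudly on an unknown schema code rather than misreading the buckets, which is one possible reading of "compatible extensibility", not a proposal from the thread.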

For more detailed considerations, see the following two sections in the Sparse Histogram design doc:

@beorn7
Collaborator

beorn7 commented May 20, 2022

As an additional note: Two things aren't represented yet in the experimental protobuf extension but will need to be in the exposition format eventually:

  1. Gauge Histogram – OpenMetrics already has that for conventional histograms, so that will be a natural fit.
  2. “Float Histogram”, i.e. a histogram with floating point values for the buckets and the count of observations.

The latter might appear unneeded, but K8s recently ran into a use case (counting time in the buckets, which they turned into an integer problem by counting nanoseconds, but it would be nicer to count seconds as floats). Float histograms also arise inside PromQL when applying rate to a sparse histogram. Such a float histogram can be exposed again in federation.
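A simplified Python sketch of why taking a rate produces float buckets: dividing integer bucket increases by a time window generally yields non-integer values. The function name and the dict-based bucket layout are assumptions for illustration. This is not how PromQL actually computes a rate, since it ignores counter resets and extrapolation.

```python
from typing import Dict


def bucket_rate(
    earlier: Dict[int, int], later: Dict[int, int], seconds: float
) -> Dict[int, float]:
    """Per-second increase of every populated bucket between two samples."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {
        index: (count - earlier.get(index, 0)) / seconds
        for index, count in later.items()
    }


# Two hypothetical samples of sparse integer bucket counts, taken 4 s apart.
rates = bucket_rate({1: 10, 2: 4}, {1: 16, 2: 4, 3: 3}, seconds=4.0)
print(rates)
```

The result maps bucket indices to floats, so it cannot be carried in an exposition format that only allows integer bucket counts.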

@beorn7
Collaborator

beorn7 commented Jun 29, 2022

#247 is a concrete idea for an experiment to play with a makeshift text representation of a Native Histogram AKA Sparse Histogram. I created a separate issue for it to not let the specialized discussion pollute what we have here.

hdost changed the title from "OpenMetrics SparseHistogram/ExponentialHistogram" to "OpenMetrics SparseHistogram/NativeHistograms" on Jun 30, 2022