Standalone OpenMetrics Text Parser #169

Open
rakyll opened this issue Nov 23, 2020 · 10 comments

@rakyll

rakyll commented Nov 23, 2020

OpenMetrics, when adopted widely, will be the export format of various endpoints that need to be auto-discovered and scraped. Currently, there isn't an official parser library for OpenMetrics other than the ongoing work in Prometheus. Prometheus provides discovery and scraping libraries to enable non-Prometheus programs to discover and scrape Prometheus endpoints. But these libraries report data points as deltas and make the consumer build a state machine and aggregate the deltas in order to report the collected series. This sometimes makes it hard to write new tools to discover, parse and ingest metrics. To avoid some of these problems, I propose that we build a standalone OpenMetrics parser so that adoption is not limited.

Challenges

A standalone parser would be useful in the following cases:

  • When metrics need to be scraped by intermediate tools. There are a variety of tools that want to rely on user metrics for better decisions. For example, load balancers can rely on custom user metrics. Autoscalers can dynamically scale the number of replicas based on user metrics.
  • In large clusters, aggregating collected metrics before reporting them is a common approach. Collecting OpenMetrics metrics, aggregating them in a custom intermediate component and exposing them to a metric collection backend is a path we should enable.
  • When running very tiny workloads or working in limited compute environments, running collection backends is not always an option. It'd be good to be able to parse the metrics in lightweight intermediate components that temporarily store, aggregate and export them when running a metrics collection backend is not possible.
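
The aggregation case in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `aggregate` function, the `SeriesKey` shape, and the `replica` label are hypothetical names invented here, and the input is assumed to be already-parsed samples keyed by metric name and label pairs.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical series key: metric name plus a tuple of (label, value) pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def aggregate(
    scrapes: Iterable[Dict[SeriesKey, float]], drop_label: str
) -> Dict[SeriesKey, float]:
    """Sum values across scrapes after dropping one label (e.g. the replica)."""
    totals: Dict[SeriesKey, float] = defaultdict(float)
    for scrape in scrapes:
        for (name, labels), value in scrape.items():
            # Series that differ only in the dropped label collapse into one.
            kept = tuple(pair for pair in labels if pair[0] != drop_label)
            totals[(name, kept)] += value
    return dict(totals)


# Two replicas exposing the same counter, distinguished by a "replica" label.
replica_a = {("http_requests_total", (("code", "200"), ("replica", "a"))): 10.0}
replica_b = {("http_requests_total", (("code", "200"), ("replica", "b"))): 5.0}
combined = aggregate([replica_a, replica_b], drop_label="replica")
print(combined)
```

Summing is only meaningful for some metric types (counters, for instance); an intermediate component would need per-type rules, which this sketch leaves out.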

Proposal

Let’s create a standalone parser library that will parse the text format to Protocol Buffers. Consumers of OpenMetrics can rely on the parser library if proto exposition is not available. If metrics are exposed in protos, scrapers should prefer to fetch them in proto instead of parsing the text format.
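
To give a feel for what the text-parsing half of such a library involves, here is a minimal sketch. It is not the proposed library: it emits plain Python tuples rather than Protocol Buffers, and `Sample`, `parse_samples`, and both regular expressions are hypothetical names for illustration. It assumes label values contain no `}`, does not unescape label values, and ignores timestamps and exemplars.

```python
import re
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    name: str
    labels: Dict[str, str]
    value: float


# Simplified grammar (an assumption): name, optional {labels}, then a value.
SAMPLE_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{([^}]*)\})?\s+(\S+)")
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str) -> List[Sample]:
    """Parse sample lines, skipping metadata lines such as # TYPE and # HELP."""
    samples = []
    for line in text.splitlines():
        if line == "# EOF":
            break  # nothing after the terminator is part of the exposition
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            raise ValueError(f"malformed sample line: {line!r}")
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append(Sample(name, labels, float(value)))
    return samples


text = (
    "# TYPE http_requests counter\n"
    "# HELP http_requests Total requests.\n"
    'http_requests_total{code="200",method="get"} 1027\n'
    'http_requests_total{code="500",method="get"} 3\n'
    "# EOF\n"
)
for sample in parse_samples(text):
    print(sample)
```

Each line is handled on its own, so the sketch keeps no state between calls. A real parser would also need to attach samples to their metric families and validate the input.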

Alternatives Considered

  • Building a standalone parser library based on components from the Prometheus source code. But maintainability-wise, this is not the right approach and it may require forks of unexported components. Forking the Prometheus project should be discouraged.
  • Making protobuf a mandatory export format, but this would limit the adoption of OpenMetrics.
@RichiH
Member

RichiH commented Nov 23, 2020

Datapoint/FYI: Making protobuf mandatory has been discussed, in particular for large-scale deployments, and discarded for precisely that reason. Anyone with the scaling needs will be able to implement proto, most likely on both ends.

That being said, a standard parser makes sense.

I am unsure if this should live in prometheus/ or openobservability/ and can see good reasons for either. As long as it's done in consensus with the Prometheus team, it's just a name in a URL anyway.

@juliusv
Member

juliusv commented Nov 23, 2020

My 2c from the peanut gallery: Sounds good, and I'm for putting it into the OM org vs. the Prometheus org. If you compare it with https://github.com/open-telemetry, OT also has reference client library implementations for various languages, and a similar thing could make sense for OM as well. Then OM could eventually become a home for OM parsers (and serializers) for all kinds of languages.

@SuperQ
Member

SuperQ commented Nov 23, 2020

Prometheus provides discovery and scraping libraries to enable non-Prometheus programs to discover and scrape Prometheus endpoints. But these libraries report data points as deltas and make the consumer build a state machine and aggregate the deltas in order to report the collected series.

Can you be more specific about what libraries you're talking about here? Prometheus specifically does not use or store deltas, and discourages the use of deltas in favor of raw data. The Prometheus design is very much centered around collecting and storing raw data without interpretation.

@rakyll
Author

rakyll commented Nov 23, 2020

@SuperQ, I was referring to the scraping library from Prometheus. It reports data points via storage.Appendable, see https://godoc.org/github.com/prometheus/prometheus/scrape#NewManager. The non-Prometheus scrapers end up implementing a state machine to handle the deltas.

@SuperQ
Member

SuperQ commented Nov 23, 2020

I'm pretty sure that appends samples, not deltas. In OpenMetrics terms this would be a MetricPoint. This is a raw value, not a delta.

@juliusv, can you answer this question?
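
To make the sample-versus-delta distinction concrete, here is a small sketch. It assumes a consumer receives raw cumulative counter values from successive scrapes and derives the per-interval increases itself; `deltas_from_samples` is a hypothetical helper, and the reset handling is a common convention assumed here rather than anything taken from the Prometheus code.

```python
from typing import List


def deltas_from_samples(samples: List[float]) -> List[float]:
    """Derive per-interval increases from raw cumulative counter samples.

    A drop in value is treated as a counter reset, so the new raw value is
    counted as the increase for that interval (an assumed convention).
    """
    deltas = []
    for previous, current in zip(samples, samples[1:]):
        deltas.append(current if current < previous else current - previous)
    return deltas


# Raw values as scraped: the counter only grows, except for the reset to 20.
raw = [100.0, 130.0, 130.0, 20.0, 45.0]
print(deltas_from_samples(raw))
```

The scrape itself only hands over the values in `raw`; any delta is something a consumer computes afterwards.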

@juliusv
Member

juliusv commented Nov 23, 2020

The scraping layer in https://github.com/prometheus/prometheus/blob/master/scrape/scrape.go does a lot of stuff in addition to the core parser to perform Prometheus-specific actions such as staleness tracking, health metrics reporting, and caching of time series IDs for faster subsequent appends. As far as I know, there is no sample value delta tracking anywhere in there, but maybe the time series ID tracking or a similar thing was meant by that. In any case, the scrape layer keeps a lot of state between scrapes to accomplish those things.

However, the core OM parser (not the full scraper) is at https://github.com/prometheus/prometheus/blob/master/pkg/textparse/openmetricsparse.go and AFAIK doesn't require any such state tracking at all. So I hope it would be relatively easily reusable outside of Prometheus. Still, it can make sense to move it to this org IMO. Or fork it here, in case we need a slightly different implementation in Prometheus.

@brian-brazil
Contributor

brian-brazil commented Nov 23, 2020

Currently, there isn't an official parser library for OpenMetrics
That being said, a standard parser makes sense.

There is already a standard standalone parser for OpenMetrics. https://github.com/prometheus/client_python is the reference implementation for both parsing and exposition, and is already used outside of Prometheus.

It reports data points via storage.Appendable, see https://godoc.org/github.com/prometheus/prometheus/scrape#NewManager.

That's an internal library of Prometheus, and not suitable for general usage as it is very tightly bound to the Prometheus TSDB for performance reasons. I would not recommend using it as a base for a general OpenMetrics parser; among other things, it doesn't do the validation and data modelling that an official OpenMetrics parser should.
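
As a rough illustration of the kind of validation meant here, the sketch below checks two rules from the OpenMetrics text format: the exposition must end with `# EOF`, and a sample belonging to a counter family must carry a suffix such as `_total` rather than the bare family name. `validate_exposition` is a hypothetical function, and a real validator would cover far more than these two rules.

```python
from typing import List


def validate_exposition(text: str) -> List[str]:
    """Return the problems found; an empty list means both checks passed."""
    problems = []
    # A single trailing newline after "# EOF" is allowed.
    body = text[:-1] if text.endswith("\n") else text
    lines = body.split("\n")
    if lines[-1] != "# EOF":
        problems.append("missing '# EOF' terminator")

    counter_families = set()
    for number, line in enumerate(lines, start=1):
        fields = line.split(" ")
        if fields[:2] == ["#", "TYPE"] and len(fields) == 4:
            if fields[3] == "counter":
                counter_families.add(fields[2])
        elif line and not line.startswith("#"):
            sample_name = line.split("{")[0].split(" ")[0]
            # A bare family name means the required suffix is missing.
            if sample_name in counter_families:
                problems.append(
                    f"line {number}: counter sample {sample_name!r} "
                    "lacks the '_total' suffix"
                )
    return problems


valid = "# TYPE http_requests counter\nhttp_requests_total 5\n# EOF\n"
invalid = "# TYPE http_requests counter\nhttp_requests 5\n"
print(validate_exposition(valid))
print(validate_exposition(invalid))
```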

@rakyll
Author

rakyll commented Nov 23, 2020

does a lot of stuff in addition to the core parser to perform Prometheus-specific actions such as staleness tracking, health metrics reporting, and caching of time series IDs for faster subsequent appends. As far as I know, there is no sample value delta tracking anywhere in there, but maybe the time series ID tracking or a similar thing was meant by that. In any case, the scrape layer keeps a lot of state between scrapes to accomplish those things.

My choice of vocabulary was wrong when calling them deltas. This is the situation I was describing.

@brian-brazil
Contributor

My choice of vocabulary was wrong when calling them deltas. This is the situation I was describing.

Yeah, there's a lot of specialised logic in there.

For Go I'm currently expecting that we'll end up with a full parser and exposition in https://github.com/prometheus/common/tree/master/expfmt, which is where the official Prometheus text format ones live - Prometheus itself stopped using the official Prometheus text format parser in 2.0.0. Prometheus will want a full parser for promtool.

Java exposition is next on my todo list, but I'd be happy to review a PR to add a full parser into common. We'll probably want to share the validation between the proto and text formats.

@SuperQ
Copy link
Member

SuperQ commented Nov 23, 2020

@rakyll Thanks for the clarifications. I agree, it's a good idea to get some "easy to import" libraries to make interoperability easier.
