Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Cortex and Prometheus Remote Write exporter design #1464

Merged
merged 7 commits into from
Aug 11, 2020

Conversation

huyan0
Copy link
Member

@huyan0 huyan0 commented Jul 31, 2020

Design Proposal: Collector Cortex/Prometheus remote write Exporter

This document outlines our design proposal of the Prometheus remote write Exporter requested in #1150.

In the document, we cover the following areas:

  • Provide context on Cortex and Prometheus remote write API
  • Outline assumptions we made for this project (we are working with OTLP metrics defined in proto PR #145
  • Exporting cumulative values only for monotonic types, histogram, and summary
  • How metrics will be converted and send to the backend by the exporter
  • Discuss the factory component and configuration of the exporter, supplemented by a sample configuration
  • Cover thread-safety, shutdown behavior, and testing plan

We have referenced design principles outlined in CONTRIBUTING.md and the design of the OTLP and Prometheus exporters.

cc: @huyan0 @danielbang907 @alolita

Request review: @open-telemetry/collector-approvers @jmacd @bogdandrutu

@codecov
Copy link

codecov bot commented Jul 31, 2020

Codecov Report

Merging #1464 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1464   +/-   ##
=======================================
  Coverage   90.90%   90.90%           
=======================================
  Files         239      239           
  Lines       16692    16692           
=======================================
  Hits        15174    15174           
  Misses       1086     1086           
  Partials      432      432           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f954863...7f8529a. Read the comment docs.

Copy link
Contributor

@annanay25 annanay25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved

When adding a quantile of the summary data point to the map, the string signature should also contain a `quantile ` label that indicates the quantile value. This label should also be exported. Any summary metric that is not cumulative should be dropped.

### **2.3 Exporting Metrics**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easier to vendor in Prometheus remote write package here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, using the client from remote write makes sense. We are also trying to allow users to specify any header or pass in a http.Client to send requests. I will take a look at whether this is possible with remote write package.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't sadly, but we should see if we can make the current client accept a custom http.Client.

https://github.com/prometheus/prometheus/blob/master/storage/remote/client.go#L125-L133

We can try adding a SetHTTPClient() method to the Prometheus client object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that change would be very helpful for our usage.

Copy link
Member

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a Prometheus and Cortex maintainer, I love this proposal! But we would be open to many more integrations and services if we rename this from Cortex to Prometheus Remote Write :)

See: https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage

exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved
**Assumptions:**
Because of the gaps mentioned above, this project will convert from the current OTLP metrics and work under the assumption one of the above solutions will be implemented, and all incoming monotonic scalars/histogram/summary metrics should be cumulative or otherwise dropped. More details on the behavior of the exporter is in section 2.2.

## **2. Cortex Exporter**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this from Cortex exporter to "Prometheus Remote Write exporter"? Cortex just uses the Prom RW APIs and by renaming we'll be clearer in intent that we support Prom RW and not just Cortex.

Copy link
Member Author

@huyan0 huyan0 Jul 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we do want to emphasize the support for Cortex, but we definitely can indicate its compatibility with other remote write backends in our document. Thanks for the suggestion. Also, a question on remote write: does it send out sorted label set and how do we treat duplicate labels?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We send a list of labels. And I think its upto the upstream implementation to deal with the discrepancy. In general, in Prometheus, you can never have a label name that has 2 values, so the upstream will likely reject it with a 4xx. I am fairly certain that the labels need not be sorted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply. Another related question: for Cortex to work, should our exporter meet the requirement that metric of the same type and name must have the same labels? If so, how could we maintain it in the case of multiple exporters?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTel gives some specification about duplicate labels. They should be eliminated before the exporter sees them. I'm not sure if the SDK specification will say they're sorted, but in the OTel Go SDK they are sorted as an efficient means of deduplication.


When adding a quantile of the summary data point to the map, the string signature should also contain a `quantile ` label that indicates the quantile value. This label should also be exported. Any summary metric that is not cumulative should be dropped.

### **2.3 Exporting Metrics**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't sadly, but we should see if we can make the current client accept a custom http.Client.

https://github.com/prometheus/prometheus/blob/master/storage/remote/client.go#L125-L133

We can try adding a SetHTTPClient() method to the Prometheus client object.

exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved
@huyan0 huyan0 changed the title Add Cortex exporter design Add Prometheus Remote Write exporter design Jul 31, 2020
@huyan0 huyan0 changed the title Add Prometheus Remote Write exporter design Add Cortex and Prometheus Remote Write exporter design Jul 31, 2020
**Assumptions:**
Because of the gaps mentioned above, this project will convert from the current OTLP metrics and work under the assumption one of the above solutions will be implemented, and all incoming monotonic scalars/histogram/summary metrics should be cumulative or otherwise dropped. More details on the behavior of the exporter is in section 2.2.

## **2. Cortex Exporter**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTel gives some specification about duplicate labels. They should be eliminated before the exporter sees them. I'm not sure if the SDK specification will say they're sorted, but in the OTel Go SDK they are sorted as an efficient means of deduplication.

@jmacd
Copy link
Contributor

jmacd commented Aug 4, 2020

@huyan0 Thanks! I appreciate the detailed explanation of how you can carry this out with the current state of the specification and OTLP support. 😀

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall design looks promising, some suggestions to make things a bit more generic. Having the first version only supporting cumulative is fine.

exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved
exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved
exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved
exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved
exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved
exporter/cortexexporter/DESIGN.md Outdated Show resolved Hide resolved

The following diagram shows an example of Prometheus remote write API usage, with Cortex,n open source, horizontally scalable, highly available, multi-tenant, long term storage, as a remote storage backend.

![Cortex Archietecture](https://raw.githubusercontent.com/open-o11y/opentelemetry-collector/design-doc/exporter/cortexexporter/cortex.png)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use relative path (./cortex.png) same for the other png. May consider to move the pngs in a sub-directory.

@bogdandrutu
Copy link
Member

@bogdandrutu
Copy link
Member

Ping :)

@huyan0
Copy link
Member Author

huyan0 commented Aug 11, 2020

Ping :)

Thanks! I have renamed the folder and used relative links for images :-)

@bogdandrutu bogdandrutu merged commit b5b185a into open-telemetry:master Aug 11, 2020
MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this pull request Nov 11, 2021
* Prepare for releasing v0.16.0

* Prepare CHANGELOG for v0.16.0 release

Signed-off-by: Anthony J Mirabella <[email protected]>
hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this pull request Apr 27, 2023
* Upgrade to golang 1.18.1

* Update changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants