Add native histogram support for histogram metrics #9971

Merged Aug 23, 2024 (2 commits)

Conversation

rabenhorst
Contributor

This PR exposes the native histogram configuration for histogram metrics as flags. If the flags are not set, behavior does not change. Native histograms are scraped only when both Prometheus and the client support them and have them enabled; otherwise, classic histograms are scraped.
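
To illustrate the opt-in behavior described above, here is a minimal, standard-library-only sketch. It is not the code from this PR: the flag names `--bucket-factor` and `--max-buckets`, their defaults, and the helper names are assumptions made for illustration. It shows two things: leaving the factor at its default keeps classic-only behavior, and a growth factor greater than 1 maps to a native histogram resolution ("schema") n, where bucket boundaries grow by 2^(2^-n) and n is the largest value in [-4, 8] whose growth does not exceed the requested factor.

```go
package main

import (
	"flag"
	"fmt"
	"math"
)

// histogramConfig holds the illustrative settings.
// Field and flag names are hypothetical, not taken from the PR.
type histogramConfig struct {
	BucketFactor float64
	MaxBuckets   uint
}

// nativeEnabled reports whether native histograms would be exposed.
// A factor of 1 or less leaves the existing classic behavior untouched.
func (c histogramConfig) nativeEnabled() bool { return c.BucketFactor > 1 }

// parseConfig reads the assumed flags from args.
func parseConfig(args []string) (histogramConfig, error) {
	var cfg histogramConfig
	fs := flag.NewFlagSet("histogram-sketch", flag.ContinueOnError)
	fs.Float64Var(&cfg.BucketFactor, "bucket-factor", 0, "growth factor between native buckets; must be > 1 to enable")
	fs.UintVar(&cfg.MaxBuckets, "max-buckets", 100, "upper bound on the number of native buckets")
	err := fs.Parse(args)
	return cfg, err
}

// schemaFor returns the largest schema n in [-4, 8] such that
// 2^(2^-n) does not exceed the requested growth factor.
func schemaFor(factor float64) (int, error) {
	if factor <= 1 {
		return 0, fmt.Errorf("bucket factor %v must be greater than 1", factor)
	}
	f := math.Floor(math.Log2(math.Log2(factor)))
	switch {
	case f <= -8:
		return 8, nil // finest supported resolution
	case f >= 4:
		return -4, nil // coarsest supported resolution
	default:
		return -int(f), nil
	}
}

func main() {
	for _, args := range [][]string{nil, {"--bucket-factor=1.1", "--max-buckets=160"}} {
		cfg, err := parseConfig(args)
		if err != nil {
			fmt.Println("invalid flags:", err)
			continue
		}
		if !cfg.nativeEnabled() {
			fmt.Println("flags unset: classic histograms only")
			continue
		}
		schema, err := schemaFor(cfg.BucketFactor)
		if err != nil {
			fmt.Println("invalid factor:", err)
			continue
		}
		fmt.Printf("native histograms enabled: schema=%d, max buckets=%d\n", schema, cfg.MaxBuckets)
	}
}
```

With a factor of 1.1 the sketch selects schema 3, because 2^(1/8) ≈ 1.09 is the coarsest power-of-two root that stays at or below 1.1. Whether a scrape actually returns native histograms still depends on the Prometheus side having them enabled.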

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • CVE Report (Scanner found CVE and adding report)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation only

Which issue/s this PR fixes

How Has This Been Tested?

Ran ingress-nginx in a local Kubernetes cluster using make dev-env and scraped it with Prometheus with native histograms enabled and disabled.

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I've read the CONTRIBUTION guide
  • I have added unit and/or e2e tests to cover my changes.
  • All new and existing tests passed.

@linux-foundation-easycla

linux-foundation-easycla bot commented May 17, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: rikatz / name: Ricardo Katz (ca29f06)
  • ✅ login: rabenhorst / name: Sebastian Rabenhorst (fcbcc7b)

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 17, 2023
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

Welcome @rabenhorst!

It looks like this is your first PR to kubernetes/ingress-nginx 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/ingress-nginx has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @rabenhorst. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 17, 2023
@k8s-ci-robot k8s-ci-robot requested a review from cpanato May 17, 2023 08:36
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels May 17, 2023
@rabenhorst
Contributor Author

/auto-cc @ElvinEfendi

@github-actions

github-actions bot commented Jul 2, 2023

This is stale, but we won't close it automatically. Just bear in mind that the maintainers may be busy with other tasks and will get to your issue ASAP. If you have any questions or want to request prioritization, please reach out in #ingress-nginx-dev on Kubernetes Slack.

@github-actions github-actions bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jul 2, 2023
@k8s-triage-robot

The lifecycle/frozen label can not be applied to PRs.

This bot removes lifecycle/frozen from PRs because:

  • Commenting /lifecycle frozen on a PR has not worked since March 2021
  • PRs that remain open for >150 days are unlikely to be easily rebased

You can:

  • Rebase this PR and attempt to get it merged
  • Close this PR with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/remove-lifecycle frozen

@k8s-ci-robot k8s-ci-robot removed the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jul 2, 2023
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 22, 2023
@Cheshirez

Given the huge metrics cardinality from ingress controllers, exposing native histograms would help a lot.

@longwuyuan
Contributor

@Cheshirez For readers who know nothing about histograms and variants such as native or classic, this PR is hard to wrap one's head around.

What is required is screenshots, data, and an explanation, all posted here together for everyone to read and understand. That would likely prompt thought about what the improvement is and which group of users it is relevant to. It also requires information on the impact on maintenance and support. That kind of insight is missing here. It requires rare Prometheus-maintainer-level experts, who are hard to come by, to spend their time here.

Any help is appreciated.

@tiithansen

@rabenhorst Any plans to continue with this PR?

@rabenhorst
Contributor Author

> @rabenhorst Any plans to continue with this PR?

Sorry, I was on holiday. Yes, I can work on this. I'm not a Prometheus maintainer, but I added (partial) native histogram support to Thanos and know my way around Prometheus.

@longwuyuan I will rebase and update the PR description accordingly in the coming days.

@rikatz
Contributor

rikatz commented Aug 22, 2024

@rabenhorst are you still willing to work on this?
@longwuyuan should we proceed once this is rebased?

@longwuyuan
Contributor

@rikatz, if it is rebased and passing tests, then it's OK to accept this, because the observability flag being added here is optional.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2024

netlify bot commented Aug 23, 2024

Deploy Preview for kubernetes-ingress-nginx canceled.

Name Link
🔨 Latest commit ca29f06
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-ingress-nginx/deploys/66c8a5bb405f4700085e5759

@longwuyuan
Contributor

Hi Sebastian, thanks for the contribution. I hope you will paste in some screenshots of the histograms resulting from your changes. It helps give perspective. Also, could you kindly squash the commits?

@rabenhorst
Contributor Author

rabenhorst commented Aug 23, 2024

> Hi Sebastian, thanks for the contribution. Hoping you will copy paste some screenshots of histograms resulting from your changes. Helps get perspective. Also requesting you squash commits kindly.

I will. There will also be a talk about our migration to native histograms at PromCon, https://promcon.io/2024-berlin/talks/shopifys-journey-from-conventional-to-native-histograms/, if you are interested in the wider topic of native histograms.

Will squash all commits now.

Is the failing lint expected?

Signed-off-by: Sebastian Rabenhorst <[email protected]>

Fixed flag

Added cli docs

Fxi cli args doc

Fxi cli args doc

Fxi cli args doc

Fxi cli args doc

Revert "Fxi cli args doc"

This reverts commit b0dd2de6e032cefbdb06cead85b0b9be16abcd22.

Revert "Fxi cli args doc"

This reverts commit 6afc0f7a1c9bfba4ee75dbb99be1b9fbba4df156.

Revert "Fxi cli args doc"

This reverts commit 1f6c408e1220745bd1279b1aeaf854c748adf754.

Revert "Fxi cli args doc"

This reverts commit 68beccbd50be56b4f8aa461b6c3685f927e02413.

Fix cli args docs

Fix cli args docs

Fix cli args docs
@longwuyuan
Contributor

That is awesome. Will try to sync. I hope you will add a brief description to the docs of how native histograms differ and what their benefits are. Thanks again for making this contribution an optional config.

@longwuyuan
Contributor

I am on phone. Will get on computer and comment on lint.

@longwuyuan
Contributor

@rikatz linter caught 2 deprecations. Needs your comment

Error: internal/task/queue.go:39:8: SA1019: workqueue.RateLimitingInterface is deprecated: Use TypedRateLimitingInterface instead. (staticcheck)
  	queue workqueue.RateLimitingInterface

@tao12345666333 PTAL

@rikatz
Contributor

rikatz commented Aug 23, 2024

Yeah, these linter comments are new and unrelated. I am fixing them in another PR. Let me see if I can expedite those fixes and then merge this one.

@rikatz
Contributor

rikatz commented Aug 23, 2024

#11853 fixes the linter; then we can rebase and re-run this test.

@rikatz
Contributor

rikatz commented Aug 23, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 23, 2024
@rikatz
Contributor

rikatz commented Aug 23, 2024

/lgtm
/approve
Thank you

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 23, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rabenhorst, rikatz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 23, 2024
@k8s-ci-robot k8s-ci-robot merged commit ffee96c into kubernetes:main Aug 23, 2024
27 checks passed
This was referenced Aug 26, 2024
Labels
  • `approved`: Indicates a PR has been approved by an approver from all required OWNERS files.
  • `area/docs`
  • `cncf-cla: yes`: Indicates the PR's author has signed the CNCF CLA.
  • `lgtm`: "Looks good to me", indicates that a PR is ready to be merged.
  • `needs-kind`: Indicates a PR lacks a `kind/foo` label and requires one.
  • `needs-priority`
  • `needs-triage`: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  • `ok-to-test`: Indicates a non-member PR verified by an org member that is safe to test.
  • `size/L`: Denotes a PR that changes 100-499 lines, ignoring generated files.
7 participants