fix: Adding PodMonitor resources #2104

jtyr · 2022-06-16T15:36:07Z

This PR is adding PodMonitor resources into the Helm chart so Gatekeeper can be automatically monitored by Prometheus Operator or similar Prometheus-compatible monitoring system like VictoriaMetrics. Gatekeeper metrics are required by the OPA Gatekeeper Grafana Dashboard (mentioned in #2070) used with opa-scodeboard, the Prometheus exporter for OPA Gatekeeper.

codecov-commenter · 2022-06-16T15:46:51Z

Codecov Report

Merging #2104 (2679bf9) into master (d3e62d0) will increase coverage by 0.03%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2104      +/-   ##
==========================================
+ Coverage   54.35%   54.38%   +0.03%     
==========================================
  Files         111      111              
  Lines        9478     9478              
==========================================
+ Hits         5152     5155       +3     
+ Misses       3935     3933       -2     
+ Partials      391      390       -1

Flag	Coverage Δ
unittests	`54.38% <ø> (+0.03%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
...onstrainttemplate/constrainttemplate_controller.go	`54.91% <0.00%> (-0.24%)`	⬇️
pkg/readiness/list.go	`91.17% <0.00%> (+11.76%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d3e62d0...2679bf9. Read the comment docs.

sozercan · 2022-06-16T16:59:55Z

@jtyr thanks for the PR! since these are 3rd party project integrations, https://github.com/open-policy-agent/contrib might be a more suitable place. Please see #659 for more context.

jtyr · 2022-06-17T09:51:18Z

@sozercan Thank you for you comment. I understand that you don't want to take responsibility for components that you don't develop. But Helm charts exist to provide user-friendly way of using your product. That should include support for industry-leading integrations. Gatekeeper exposes Prometheus metrics and Prometheus Operator is used everywhere where people are serious about monitoring and automation. By not supporting Prometheus Operator in your Helm chart, you are showing your power users that you don't care about them and force them to add the support over and over again by themselves.

praveenjindal62 · 2022-06-21T10:30:58Z

@jtyr Why not use ServiceMonitor instead of PodMonitor ? Many popular charts I have seen are exposing metrics using ServiceMonitor and I guess this make more sense to look for service which is already exposing ports on Pods then to create another object which does some amount work that Service already doing (i.e. keep looking for new pods and create endpoints). Although PodMonitors make more sense where ServiceMonitor are not at all right options, for ex - if only some pods of a deployment or sts or daemonset need to be used to expose metrics. But that is not the case with gatekeeper. Also, I know that current helm charts not exposing metrics port on pod through service but those can be modified easily to do same. Let me know your thoughts?

@sozercan I completely agree with @jtyr comment. Helm charts are exposing metrics through a separate port but there is no provision to expose those metrics to Prometheus out of the box. I would also vote to add ServiceMonitor/PodMonitor to these charts irrespective of these being handled by third party which would increase the overall usability and developer experience as well. Third party objects can be added as disabled by default so users are not impacted.

jtyr · 2022-06-21T15:39:22Z

@praveenjindal62 I have chosen to use PodMonitor as there is no Service for the Audit Controller and the Service for the Controller Manager doesn't expose the metrics port. At the end, ServiceMonitor and PodMonitor provides the same result. So unless there is a reason for the non-existing Service to be created and the existing Service to be modified with the metrics port, there is no reason to choose ServiceMonitor over PodMonitor.

ritazh · 2022-06-22T06:42:33Z

Would creating another helm chart that references the Gatekeeper chart as a dependency help in this case? https://helm.sh/docs/helm/helm_dependency/ That way there's a separation of concerns while users benefit from the added features.

jtyr · 2022-06-22T11:34:38Z

@ritazh That's exactly how we are doing it now because the PodMonitor resources are not part of the gatekeeper chart and that's exactly the reason why I have created this PR so we don't have to maintain an extra chart only to be able to monitor Gatekeeper. I can imagine plenty of users are doing the same thing and by accepting this PR, you will make all of them very happy.

sozercan · 2022-06-27T23:42:16Z

@jtyr since this can be a separate chart with a gk chart dependency, would maintaining this chart in https://github.com/open-policy-agent/contrib work? reasoning is these resources will change over time, and since there are no tests, it will eventually be broken since core maintainers are not familiar with these resources to be able to maintain these in the long term.

jtyr · 2022-06-28T09:55:05Z

@sozercan Creating additional Helm chart would work but it brings the burden of updating the dependency version with every GK release. It would be much easier to have the support for PodMonitor built into the gatekeeper chart. The installation of PodMonitor resources is disabled by default so even if there would be an an issue with resource definition, most people won't notice it. On top of that, any breaking changes to the PodMonitor resource must include change of the apiVersion of the CRD. Those changes are advertised by a deprecation message for several Kubernetes release cycles before they are removed permanently. That gives plenty of time to rise a PR to reflect this change here.

mrueg · 2022-06-28T11:16:54Z

From my personal experience these servicemonitor/podmonitor objects are pretty static and there's not a lot of change to be expected soon. They are also quite common in clusters with prometheus-operator enabled.
If it helps the conversation, they could stay disabled by default and the Readme could mention that they are "alpha" or "community" supported resources.

maxsmythe · 2022-06-30T01:50:17Z

Ultimately this PR touches on a governance principle that is necessary to keep the project maintainable. As a rule, Gatekeeper does not introduce third-party infrastructure dependencies since doing so risks increasing the maintenance burden of the project over time, may lead to incompatibilities with other users' infrastructure, and opens up thorny questions of which projects "deserve" inclusion and which don't, a question we don't have the overhead to handle.

brings the burden of updating the dependency version with every GK release

is an excellent example of the complexity we're trying to avoid. Prometheus configurations are very stable, but there are other solutions, such as the OpenTelemetry collector. Should that also be supported? If not, why are Prometheus users more privileged? OpenTelemetry is a CNCF project, should that count for something? Now we'd need to track compatibility for 2 projects. We'd also probably need to bundle in cert-manager, so that's at least 3 projects, maybe more.

To provide a truly streamlined experience, we'd also need to support multiple permutations of how these projects can be used. Frankly we don't have the capacity, and since there are already solutions for this problem, a better investment of development time for our users would be to spend it working on bugs and/or features specific to the project.

You mention we serve Prometheus metrics, which is a fair point, but:

If we didn't serve metrics, users would be unable to use Prometheus with the project. Lack of OOTB config does not prevent usage.
We serve metrics via OpenCensus, which allows us to export metrics via several protocols
There is a difference between providing opportunities for interoperability between third-party projects and being opinionated as to how those third-party projects should be configured/deployed
We take a similar stance with cert-manager where we aim to be compatible, but not prescriptive in how it is deployed

We are definitely open to making interoperability easier where possible. This PR, which would have made it easier to use cert-manager with the Helm chart (which we couldn't merge due to the lack of DCO and the author no longer responding), is a good example of a neutral compromise -- it makes Gatekeeper more configurable in ways that cert-manager requires without assuming cert-manager.

I'm wondering if a similar approach could be employed here? Is there a need to make it easy to deploy additional resources alongside the chart?

jtyr · 2022-07-05T08:40:45Z

I understand your reasons, @maxsmythe. I think that the solution could be to make the support for all those third-party resources (Porometheus, CertManager, ...) disabled by default and document that they are community driven so if something doesn't work, the community must fix it. Then if something stops working, the community will create a PR to fix it, so you can focus on your GK development only. I believe this would be better solution than providing no support at all and force many people to come up with their own solution.

sozercan · 2022-07-05T21:01:03Z

@jtyr is there a reason you wouldn't want this to be part of https://github.com/open-policy-agent/contrib which are community-supported integrations? We can link to known community-supported integrations from GK docs so folks can find these.

jtyr · 2022-07-07T10:28:02Z

I have explained reasons for that in one of my comments above. Basically there are two main reasons:

The contrib repo doesn't work like Helm registry. We could make charts directory there and create GitHub Workflows to release charts from that directory using this action.
Even if the contrib repo would work like Helm registry, the Helm chart that would wrap the GK Helm chart (have it as the dependency) would have to be updated with every time new GK Helm chart is released to keep the dependency version up-to-date. That's why it would be more practical to have thePodMonitor resources directly in the GK Helm chart.

Signed-off-by: Jiri Tyr <[email protected]>

thomasmckay · 2022-07-07T11:55:29Z

2. Even if the `contrib` repo would work like Helm registry, the Helm chart that would wrap the GK Helm chart (have it as the dependency) would have to be updated with every time new GK Helm chart is released to keep the dependency version up-to-date. That's why it would be more practical to have the`PodMonitor` resources directly in the GK Helm chart.

This does not seem like a great burden and simply puts the responsibility to do this onto the community users. A GK release checkbox could be to open an issue against contrib to update the version.

stale · 2022-09-05T13:59:26Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

JBOClara · 2022-09-20T12:15:50Z

Could we reopen ?

jtyr force-pushed the jtyr-podmon branch from f5b2be8 to 2679bf9 Compare June 17, 2022 09:40

jtyr force-pushed the jtyr-podmon branch 2 times, most recently from b088352 to 5b91be4 Compare June 27, 2022 15:38

fix: Adding PodMonitor resources

6993349

Signed-off-by: Jiri Tyr <[email protected]>

jtyr force-pushed the jtyr-podmon branch from 5b91be4 to 6993349 Compare July 7, 2022 10:28

stale bot added the stale label Sep 5, 2022

stale bot closed this Sep 20, 2022

LeoQuote mentioned this pull request May 8, 2023

add gatekeeper chart douban/charts#48

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Adding PodMonitor resources #2104

fix: Adding PodMonitor resources #2104

jtyr commented Jun 16, 2022

codecov-commenter commented Jun 16, 2022 •

edited

Loading

sozercan commented Jun 16, 2022

jtyr commented Jun 17, 2022

praveenjindal62 commented Jun 21, 2022

jtyr commented Jun 21, 2022

ritazh commented Jun 22, 2022

jtyr commented Jun 22, 2022

sozercan commented Jun 27, 2022

jtyr commented Jun 28, 2022

mrueg commented Jun 28, 2022

maxsmythe commented Jun 30, 2022 •

edited

Loading

jtyr commented Jul 5, 2022

sozercan commented Jul 5, 2022

jtyr commented Jul 7, 2022

thomasmckay commented Jul 7, 2022

stale bot commented Sep 5, 2022

JBOClara commented Sep 20, 2022

fix: Adding PodMonitor resources #2104

fix: Adding PodMonitor resources #2104

Conversation

jtyr commented Jun 16, 2022

codecov-commenter commented Jun 16, 2022 • edited Loading

Codecov Report

sozercan commented Jun 16, 2022

jtyr commented Jun 17, 2022

praveenjindal62 commented Jun 21, 2022

jtyr commented Jun 21, 2022

ritazh commented Jun 22, 2022

jtyr commented Jun 22, 2022

sozercan commented Jun 27, 2022

jtyr commented Jun 28, 2022

mrueg commented Jun 28, 2022

maxsmythe commented Jun 30, 2022 • edited Loading

jtyr commented Jul 5, 2022

sozercan commented Jul 5, 2022

jtyr commented Jul 7, 2022

thomasmckay commented Jul 7, 2022

stale bot commented Sep 5, 2022

JBOClara commented Sep 20, 2022

codecov-commenter commented Jun 16, 2022 •

edited

Loading

maxsmythe commented Jun 30, 2022 •

edited

Loading