Istio's Prow should use the new upstream monitoring and alerting stack. #1774
Comments
@cjwagner I'd like to help with this. According to Travis, this is mostly a copy and paste to see if it works more or less out of the box, but I had a few questions:
I can also move over the Istio-specific stuff from https://github.com/kubernetes/test-infra/blob/master/velodrome/config.yaml into a new
Sure, that would be appropriate.
You will need to generate a new password. I'm not certain how Istio handles secrets, but if it is the same as the k8s project, then you'll need to share the credentials with oncall (or perhaps have them generate them in this case) so they can load them into the cluster and back them up in a secret store.
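For the password step, a minimal sketch of one way to generate a fresh random credential and encode it for a Kubernetes Secret's `data` field. The function names are hypothetical, and nothing here reflects how Istio or k/t-i oncall actually manage secrets; it only illustrates generating the value with a cryptographically secure source.

```python
import base64
import secrets


def generate_password(num_bytes: int = 32) -> str:
    """Return a URL-safe random password drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


def to_secret_data(password: str) -> str:
    """Base64-encode a value, the form Kubernetes expects under a Secret's `data` field."""
    return base64.b64encode(password.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Hypothetical usage: hand the encoded value to whoever loads cluster secrets.
    password = generate_password()
    print(to_secret_data(password))
```

The plaintext value is what would be backed up in the secret store; the base64 form is only an encoding, not encryption.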
(Assuming you are talking about
Yes, you should update the ingress to work with your own domain instead of
Istio uses CODEOWNERS instead of OWNERS so switch to CODEOWNERS. Please don't include me as an owner of this, I don't intend to maintain this instance.
There is some k/t-i specific configuration. In particular, some of the alerts are only appropriate for, or are customized for, k/t-i. For example, we explicitly ignore Tide errors from kubeflow since they have been misusing Tide for a long time and the errors are accepted. https://github.com/kubernetes/test-infra/blob/af68b02cf15c086db2b33997784b93a096a7c208/prow/cluster/monitoring/mixins/prometheus/tide_alerts.libsonnet#L36
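To illustrate the kind of per-instance customization involved, here is a rough sketch of building a Prometheus selector that excludes certain orgs from an alert expression. The metric name `example_tide_errors` and the `org` label are placeholders chosen for illustration, not the names used in the linked libsonnet file; only the PromQL matcher syntax (`!=` and `!~`) is standard. An Istio instance would presumably pass an empty list or its own exclusions.

```python
import re
from typing import Sequence

# GitHub org names are limited to alphanumerics and hyphens, so names that
# pass this check need no regex escaping inside a PromQL matcher.
_ORG_NAME = re.compile(r"^[A-Za-z0-9-]+$")


def tide_error_selector(metric: str, ignored_orgs: Sequence[str]) -> str:
    """Return a PromQL selector for `metric` that skips the given orgs."""
    for org in ignored_orgs:
        if not _ORG_NAME.match(org):
            raise ValueError(f"unexpected org name: {org!r}")
    if not ignored_orgs:
        return metric
    if len(ignored_orgs) == 1:
        # Single exclusion: a plain negative equality matcher.
        return f'{metric}{{org!="{ignored_orgs[0]}"}}'
    # Several exclusions: a negative regex matcher over the alternatives.
    pattern = "|".join(ignored_orgs)
    return f'{metric}{{org!~"{pattern}"}}'
```

For instance, `tide_error_selector("example_tide_errors", ["kubeflow"])` yields `example_tide_errors{org!="kubeflow"}`.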
Yes, if the metrics and dashboards currently included in the monitoring stack are insufficient for your needs, it would probably be best to work on adding new dashboards as a follow-up. Please consider whether any metrics or dashboards would be generally useful to Prow (#1638 sounds like a yes) and, if so, contribute the changes upstream to k/t-i so that all Prow instances benefit.
This should be deployed as part of the deploy-prow job via a new make target (see test-infra/prow/config/jobs/test-infra.yaml, lines 15 to 31 at 2dfc3ef).
I would recommend avoiding touching velodrome at all unless you are migrating things off velodrome and onto the new monitoring stack. Hope this helps!
@scottilee thanks for taking this on!
https://github.com/kubernetes/test-infra/tree/master/prow/cluster/monitoring#monitoring
We'll get lots of nice preconfigured graphs and alerts for free, e.g. https://monitoring.prow.k8s.io
I think we can pretty easily move over anything from https://velodrome.istio.io that is still used too so that we can turn down velodrome and just have a single monitoring dashboard.
cc @clarketm @fejta @geeknoid @howardjohn