Setup monitoring components for infra clusters #235
Comments
Great idea!
/area test-infra
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
Stale issues rot after 30d of inactivity. /lifecycle rotten Send feedback to tektoncd/plumbing.
Rotten issues close after 30d of inactivity. /close Send feedback to tektoncd/plumbing.
@tekton-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-lifecycle rotten
/remove-lifecycle rotten
@vdemeester: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
/remove-lifecycle stale
I'm going to tentatively assign this to myself, since I'm looking into tektoncd/pipeline#540. In theory I'll at least look into setting up some monitoring for performance testing, maybe! /assign
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
Stale issues rot after 30d of inactivity. /lifecycle rotten Send feedback to tektoncd/plumbing.
Rotten issues close after 30d of inactivity. /close Send feedback to tektoncd/plumbing.
@tekton-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/area roadmap
Expected Behavior
We should monitor the status of the various CI/CD services and be able to display metrics about them using Grafana, following the example of
https://monitoring.prow.k8s.io/d/8P7-1J8Wz/boskos-server-dashboard?orgId=1 and https://github.com/kubernetes/test-infra/tree/201c7788b244ab2fc3efae7249fb939223ef6e1e/prow/cluster/monitoring
Things that we need to monitor are:
prow cluster
dogfooding cluster
We should display metrics from services where available:
We'll need Prometheus and Grafana deployed somewhere.
We may be able to use one instance across clusters, at least for Grafana.
We might want Alertmanager too, so we could alert the build cop on Slack when something is broken.
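To make the pieces above a bit more concrete, here is a rough Python sketch of the data involved. It is not a proposed implementation. It renders probe results as a gauge in the Prometheus text exposition format and builds the kind of message that could be sent to the build cop on Slack. The metric name `service_up`, the `ProbeResult` type, and the example service names are all assumptions made up for illustration. In a real setup Prometheus would scrape the services directly and Alertmanager would handle the Slack routing.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Outcome of one health probe (hypothetical structure for this sketch)."""
    cluster: str   # e.g. "prow" or "dogfooding"
    service: str   # hypothetical service name
    healthy: bool


def to_exposition(results):
    """Render probe results as a gauge in the Prometheus text exposition format.

    Label values are not escaped here; this sketch assumes plain names.
    """
    lines = [
        "# HELP service_up Whether the last probe succeeded (1) or failed (0).",
        "# TYPE service_up gauge",
    ]
    for r in results:
        value = 1 if r.healthy else 0
        lines.append(
            f'service_up{{cluster="{r.cluster}",service="{r.service}"}} {value}'
        )
    return "\n".join(lines) + "\n"


def slack_alert_text(results):
    """Return an alert message for the build cop, or None if all is healthy."""
    broken = [r for r in results if not r.healthy]
    if not broken:
        return None
    names = ", ".join(f"{r.cluster}/{r.service}" for r in broken)
    return f"Monitoring: {len(broken)} service(s) unhealthy: {names}"


if __name__ == "__main__":
    # Made-up example data, not real probe output.
    sample = [
        ProbeResult("prow", "example-frontend", True),
        ProbeResult("dogfooding", "example-controller", False),
    ]
    print(to_exposition(sample), end="")
    print(slack_alert_text(sample))
```

Keeping the cluster as a label rather than part of the metric name is one way a single Grafana instance could show both clusters on the same dashboard.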
Actual Behavior
We don't have any monitoring in place.