-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Annotation convention #54
Comments
@devopsjonas The fields Regarding The field |
Oh I see. So our use case is a bit different from yours. Yours is rendering in Ceph Manager UI, is that correct? This is how it currently looks: Would you guys be ok if added |
At the moment these alerts are being used in OCP UI. I would suggest these fields to be added as and when you downstream this. May be having fields upstream with blank values is fine. @cloudbehl @anmolsachan @umangachapagain sugestions? |
@shtripat what is OCP UI? could you link me to github repo or project repo? Also does this project plan to follow Prometheus monitoring mixin structure? Structure:
Could we add |
@devopsjonas OCP is OpenShift Container Platform. Yes, you can add Grafana Dashboards and Prometheus rules. The only point these should be generic enough so community can use it. I think we can take care for that in reviews. |
@cloudbehl could you link me to that OCP UI code? And how to compile & run it? I would like to try my PR'd alerts against it, just to see how they are rendered. That said, we shouldn't be doing annotations that only OCP dashboard understands. Atleast in IMO human readable alerts should follow convention described in https://blog.pvincent.io/2017/12/prometheus-blog-series-part-5-alerting-rules/#provide-context-to-facilitate-resolution or something similiar. |
The OCP UI can found here:
The idea for the repo is have a generic code that can be used by community for the ceph alert and dashboard. Right now we are testing it with kubernetes and OpenShift(OCP) but It is not specific to the OCP use case I can say. I agree to a point that description and message are kind of same and can be worked upon. But our idea behind the message and description is that message will have short description and the description will have a detailed message of what happened where and what you can look up to correct the problem.
Ack! I think |
Ok, so mixin needs to be generic enough. But then arguments you wrote in #54 (comment) doesn't apply, as all that you write is that you need them in OCP UI. This projects needs to enable you guys to build OCP UI, but don't adapt to it's conventions. Right now our model used in https://github.com/devopyio/ceph-monitoring-mixin is a bit different, as we ran these alerts on top of Prometheus, Alertmanager, Email and Zabbix. I really think we should meet somewhere in the middle with our annotation & label convention. We can drop considering We didn't come with our proposed annotation convention, we just blatenly copied it from the internet as we think it's really good. Here are the fields we propose to change: Message and DescriptionBut we should really rethink In your case I think it's true too: Example:
In this case In a lot of cases it just adds https://github.com/ceph/ceph-mixins/blob/master/alerts/node.libsonnet#L18 IMO in 99% cases we could merge these 2 together into single severity and severity_levelHow is Can we just use a Here are some examples:
https://github.com/ceph/ceph-mixins/blob/master/alerts/utilization.libsonnet#L30 storage_typeWhat does https://github.com/ceph/ceph-mixins/blob/master/config.libsonnet#L19 Is that for So for now we will change our PRs to follow your convention. But please 🙏 reconsider annotations. I know open source is hard sometimes and we think you guys did an amazing job of doing these alerts. |
The intention is very clear to use
This was specific requirement from UI to mark critical alerts as
This was introduced for easy filtering of storage specific alerts from the full list of alerts from prometheus. In absence of this annotation, what other label/annotation could be used to filter the storage alerts? |
I think we are duplicating info in our current annotations:
For example
message
anddescription
has almost the same information.Maybe instead we just do
description
?also what is
storage_type
field, why does it matter during solving the alert?I think if we would end up with something that looks like https://blog.pvincent.io/2017/12/prometheus-blog-series-part-5-alerting-rules/#provide-context-to-facilitate-resolution
would be best.
His example
Also I think
serverity
should be a label not an annotation. Alertmanager can only use labels for alert routing https://prometheus.io/docs/alerting/configuration/#routeSo if it's an annotation then you can't send warning alerts to email, critical alerts to opsgenie/pagerduty
The text was updated successfully, but these errors were encountered: