From 84e41804ede9d710974cabe25cc486164252a56d Mon Sep 17 00:00:00 2001 From: Pedro Santos Date: Fri, 8 Jul 2022 14:46:02 +0000 Subject: [PATCH 1/6] Update helm chart so that prometheus endpoint is now configured under 'metrics' variable. Add support for prometheus-operator custom resources: prometheusrules and servicemonitor. Chart labels updated to reflect helm recommended labelset. --- CHANGELOG.md | 4 +- chart/elastalert2/README.md | 12 +- chart/elastalert2/templates/_labels.tpl | 18 +++ chart/elastalert2/templates/_names.tpl | 60 +++++++ chart/elastalert2/templates/_tplvalues.tpl | 13 ++ chart/elastalert2/templates/config.yaml | 11 +- chart/elastalert2/templates/deployment.yaml | 32 ++-- .../templates/podsecuritypolicy.yaml | 9 +- .../elastalert2/templates/prometheusrule.yaml | 19 +++ chart/elastalert2/templates/role.yaml | 9 +- chart/elastalert2/templates/rolebinding.yaml | 9 +- chart/elastalert2/templates/rules.yaml | 9 +- chart/elastalert2/templates/service.yaml | 43 +++++ .../elastalert2/templates/serviceaccount.yaml | 9 +- .../elastalert2/templates/servicemonitor.yaml | 45 ++++++ chart/elastalert2/templates/smtp-auth.yaml | 6 +- chart/elastalert2/values.yaml | 147 +++++++++++++++++- 17 files changed, 401 insertions(+), 54 deletions(-) create mode 100644 chart/elastalert2/templates/_labels.tpl create mode 100644 chart/elastalert2/templates/_names.tpl create mode 100644 chart/elastalert2/templates/_tplvalues.tpl create mode 100644 chart/elastalert2/templates/prometheusrule.yaml create mode 100644 chart/elastalert2/templates/service.yaml create mode 100644 chart/elastalert2/templates/servicemonitor.yaml diff --git a/CHANGELOG.md b/CHANGELOG.md index 8c4ebc6f..782ba690 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,8 +3,10 @@ ## Breaking changes - When using HTTP POST 2, it is no longer necessary to pre-escape strings (should they contain control chars) from events in elastic search which are replaced by the jinja2 template. 
+- [Kubernetes] [Breaking] reconfigure metrics to follow prometheus operator nomenclature. `metrics` value, now control the addition of metrics endpoint (command argument), the creation of a service to expose the metrics endpoint and the (optional) creation of prometheus-operator objects: serviceMonitor and prometheurRules to match implementations of other charts. The labels of the chart have been modified, so you'll need to uninstall and reinstall the chart for the upgrade to work. - @PedroMSantosD + ## New features -- None +- [Kubernetes] Chart is now able to create a service for the metrics, and optional prometheus-operator custom resources serviceMonitor and prometheusRule. ## Other changes - Upgrade pylint 2.13.8 to 2.14.3, Upgrade sphinx 4.5.0 to 5.0.2 - [#891](https://github.com/jertel/elastalert2/pull/891) - @nsano-rururu diff --git a/chart/elastalert2/README.md b/chart/elastalert2/README.md index 186e3b8c..8fc3ee5d 100644 --- a/chart/elastalert2/README.md +++ b/chart/elastalert2/README.md @@ -96,5 +96,13 @@ The command removes all the Kubernetes components associated with the chart and | `tolerations` | Tolerations for deployment | [] | | `smtp_auth.username` | Optional SMTP mail server username. If the value is not empty, the smtp_auth secret will be created automatically. | `NULL` | | `smtp_auth.password` | Optional SMTP mail server passwpord. This must be specified if the above field, `smtp_auth.username` is also specified. | `NULL` | -| `prometheusPort` | Optional TCP port to be used to expose prometheus metrics. if set: (1) it will pass the start parameter --prometheus_port to the command, (2) it will expose said TCP port on the POD and (3) It will add the pod annotation: prometheus.io/port: value to POD, for prometheus pod service discovery to pick the metrics | `NULL` | -| `prometheusScrapeAnnotations` | Optional Dict with the flags used by prometheus SD to know the scrape path and to keep the scrapted metrics. 
Note that this values are only rendered if prometheusPort is set | prometheusScrapeAnnotations: {prometheus.io/scrape: "true" prometheus.io/path: "/"} | +| `metrics` | enable elastalert prometheus endpoint, add prometheus.io annotations to pod and create a service pointing to the port for prometheus to scrape the metrics | `false` | +| `metrics.prometheusPort` | if "metrics" is set to true, CP port pod will expose prometheus metrics on. | `8080` | +| `metrics.prometheusPortName` | name of the port where metrics are exposed | `http-alt` | +| `metrics.prometheusScrapeAnnotations` | if metrics are enabled, annotations to add to the pod for prometheus configuration. prometheus.io/port is also added uring the prometheusPort and prometheusPortName values | {prometheus.io/scrape: "true" prometheus.io/path: "/"} | +| `metrics.serviceMonitor.enabled` | If metrics are enabled, create a servicemonitor custom resource for prometheus-operator to detect and monitor the service with the merics endpoint | `false` | +| `metrics.serviceMonitor.labels` | labels to add to the serviceMonitor object for prometheus-operator to detect and append it to your prometheus configuration, when deployed on a different namespas as the prometheus operator | `{}` | +| `metrics.serviceMonitor.metricRelabelings` | list of prometheus metric relabeling configs to aply to scrape. Example@ drop python_gc metrics or alter pod name | `[]` | +| `metrics.prometheusRule.enabled` | If metrics are enabled, create a prometheusRule custom resource for prometheus-operator to customise scrape configuration | `false` | +| `metrics.prometheusRule.additionalLabels` | labels to add to the prometheusRule object for prometheus-operator to detect and append it to your prometheus configuration, when deployed on a different namespas as the prometheus operator | `{}` | +| `metrics.prometheusRule.rules` | group of rules to add to the prometheus configuration, example Alerting rules for pod down, or for file descriptors. 
Define as multiline Yaml string | `[]` | \ No newline at end of file diff --git a/chart/elastalert2/templates/_labels.tpl b/chart/elastalert2/templates/_labels.tpl new file mode 100644 index 00000000..252066c7 --- /dev/null +++ b/chart/elastalert2/templates/_labels.tpl @@ -0,0 +1,18 @@ +{{/* vim: set filetype=mustache: */}} +{{/* +Kubernetes standard labels +*/}} +{{- define "common.labels.standard" -}} +app.kubernetes.io/name: {{ include "common.names.name" . }} +helm.sh/chart: {{ include "common.names.chart" . }} +app.kubernetes.io/instance: {{ .Release.Name }} +app.kubernetes.io/managed-by: {{ .Release.Service }} +{{- end -}} + +{{/* +Labels to use on deploy.spec.selector.matchLabels and svc.spec.selector +*/}} +{{- define "common.labels.matchLabels" -}} +app.kubernetes.io/name: {{ include "common.names.name" . }} +app.kubernetes.io/instance: {{ .Release.Name }} +{{- end -}} diff --git a/chart/elastalert2/templates/_names.tpl b/chart/elastalert2/templates/_names.tpl new file mode 100644 index 00000000..c6e0202f --- /dev/null +++ b/chart/elastalert2/templates/_names.tpl @@ -0,0 +1,60 @@ +{{/* vim: set filetype=mustache: */}} +{{/* +Expand the name of the chart. +*/}} +{{- define "common.names.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{/* +Create chart name and version as used by the chart label. +*/}} +{{- define "common.names.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{/* +Create a default fully qualified app name. +We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). +If release name contains chart name it will be used as a full name. 
+*/}} +{{- define "common.names.fullname" -}} +{{- if .Values.fullnameOverride -}} +{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}} +{{- else -}} +{{- $name := default .Chart.Name .Values.nameOverride -}} +{{- if contains $name .Release.Name -}} +{{- .Release.Name | trunc 63 | trimSuffix "-" -}} +{{- else -}} +{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}} +{{- end -}} +{{- end -}} +{{- end -}} + +{{- define "common.names.servicename" -}} +{{- $name := include "common.names.fullname" . | trunc 53 -}} +{{- printf "%s-%s" $name "metrics" -}} +{{- end -}} + +{{- define "common.names.configname" -}} +{{- $name := include "common.names.fullname" . | trunc 53 -}} +{{- printf "%s-%s" $name "config" -}} +{{- end -}} + +{{/* +Allow the release namespace to be overridden for multi-namespace deployments in combined charts. +*/}} +{{- define "common.names.namespace" -}} +{{- if .Values.namespaceOverride -}} +{{- .Values.namespaceOverride -}} +{{- else -}} +{{- .Release.Namespace -}} +{{- end -}} +{{- end -}} + +{{/* +Create a fully qualified app name adding the installation's namespace. +*/}} +{{- define "common.names.fullname.namespace" -}} +{{- printf "%s-%s" (include "common.names.fullname" .) (include "common.names.namespace" .) | trunc 63 | trimSuffix "-" -}} +{{- end -}} diff --git a/chart/elastalert2/templates/_tplvalues.tpl b/chart/elastalert2/templates/_tplvalues.tpl new file mode 100644 index 00000000..2db16685 --- /dev/null +++ b/chart/elastalert2/templates/_tplvalues.tpl @@ -0,0 +1,13 @@ +{{/* vim: set filetype=mustache: */}} +{{/* +Renders a value that contains template. 
+Usage: +{{ include "common.tplvalues.render" ( dict "value" .Values.path.to.the.Value "context" $) }} +*/}} +{{- define "common.tplvalues.render" -}} + {{- if typeIs "string" .value }} + {{- tpl .value .context }} + {{- else }} + {{- tpl (.value | toYaml) .context }} + {{- end }} +{{- end -}} diff --git a/chart/elastalert2/templates/config.yaml b/chart/elastalert2/templates/config.yaml index da2749df..ea17e975 100644 --- a/chart/elastalert2/templates/config.yaml +++ b/chart/elastalert2/templates/config.yaml @@ -2,12 +2,11 @@ apiVersion: v1 kind: ConfigMap metadata: - name: {{ template "elastalert.fullname" . }}-config - labels: - app: {{ template "elastalert.name" . }} - chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }} - release: {{ .Release.Name }} - heritage: {{ .Release.Service }} + name: {{ template "common.names.configname" . }} + labels: {{- include "common.labels.standard" . | nindent 4 }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} data: elastalert_config: |- --- diff --git a/chart/elastalert2/templates/deployment.yaml b/chart/elastalert2/templates/deployment.yaml index 0e1c4a5f..e0fa2866 100644 --- a/chart/elastalert2/templates/deployment.yaml +++ b/chart/elastalert2/templates/deployment.yaml @@ -3,6 +3,9 @@ kind: Deployment metadata: name: {{ template "elastalert.fullname" . }} labels: + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} app: {{ template "elastalert.name" . }} chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" release: {{ .Release.Name }} @@ -10,8 +13,7 @@ metadata: spec: selector: matchLabels: - app: {{ template "elastalert.name" . }} - release: {{ .Release.Name }} + {{- include "common.labels.matchLabels" . 
| nindent 6 }} replicas: {{ .Values.replicaCount }} revisionHistoryLimit: {{ .Values.revisionHistoryLimit }} template: @@ -19,17 +21,18 @@ spec: annotations: checksum/config: {{ include (print $.Template.BasePath "/config.yaml") . | sha256sum }} checksum/rules: {{ include (print $.Template.BasePath "/rules.yaml") . | sha256sum }} -{{- if .Values.prometheusPort }} -{{ toYaml .Values.prometheusScrapeAnnotations | indent 8 }} - prometheus.io/port: {{ .Values.prometheusPort | quote}} +{{- if .Values.metrics.enabled }} +{{ toYaml .Values.metrics.prometheusScrapeAnnotations | indent 8 }} + prometheus.io/port: {{ .Values.metrics.prometheusPort | quote}} {{- end }} {{- if .Values.podAnnotations }} {{ toYaml .Values.podAnnotations | indent 8 }} {{- end }} - labels: - name: {{ template "elastalert.fullname" . }}-elastalert - app: {{ template "elastalert.name" . }} - release: {{ .Release.Name }} + labels: {{- include "common.labels.standard" . | nindent 8 }} + app.kubernetes.io/component: {{ .Values.appKubernetesIoComponent }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} spec: {{- if .Values.image.pullSecret }} imagePullSecrets: @@ -44,10 +47,11 @@ spec: - name: elastalert image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}" imagePullPolicy: {{ .Values.image.pullPolicy }} -{{- if .Values.prometheusPort }} +{{- if .Values.metrics.enabled }} ports: - - containerPort: {{ .Values.prometheusPort }} + - containerPort: {{ .Values.metrics.prometheusPort }} protocol: TCP + name: {{ .Values.metrics.prometheusPortName }} {{- end }} {{- if .Values.securityContext }} securityContext: @@ -58,13 +62,13 @@ spec: {{ toYaml .Values.command | indent 10 }} {{- end }} -{{- if or .Values.args .Values.prometheusPort }} +{{- if or .Values.args .Values.metrics.enabled }} args: {{- if .Values.args }} {{ toYaml .Values.args | indent 10 }} {{- end }} - {{- if .Values.prometheusPort }} - 
{{- $enableportlist := list "--prometheus_port" (.Values.prometheusPort | toString) }} + {{- if .Values.metrics.enabled }} + {{- $enableportlist := list "--prometheus_port" (.Values.metrics.prometheusPort | toString) }} {{ toYaml $enableportlist | indent 10 }} {{- end }} {{- end }} diff --git a/chart/elastalert2/templates/podsecuritypolicy.yaml b/chart/elastalert2/templates/podsecuritypolicy.yaml index e3777203..5f4d29fc 100644 --- a/chart/elastalert2/templates/podsecuritypolicy.yaml +++ b/chart/elastalert2/templates/podsecuritypolicy.yaml @@ -3,11 +3,10 @@ apiVersion: policy/v1beta1 kind: PodSecurityPolicy metadata: name: {{ template "elastalert.fullname" . }} - labels: - app: {{ template "elastalert.name" . }} - chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" - release: {{ .Release.Name }} - heritage: {{ .Release.Service }} + labels: {{- include "common.labels.standard" . | nindent 4 }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} spec: # Prevents running in privileged mode privileged: false diff --git a/chart/elastalert2/templates/prometheusrule.yaml b/chart/elastalert2/templates/prometheusrule.yaml new file mode 100644 index 00000000..ba61e113 --- /dev/null +++ b/chart/elastalert2/templates/prometheusrule.yaml @@ -0,0 +1,19 @@ +{{- if and .Values.metrics.enabled .Values.metrics.prometheusRule.enabled }} +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: {{ template "common.names.fullname" . }} + namespace: {{ default .Release.Namespace .Values.metrics.prometheusRule.namespace | quote }} + labels: {{- include "common.labels.standard" . 
| nindent 4 }} + {{- if .Values.metrics.prometheusRule.additionalLabels }} + {{- include "common.tplvalues.render" (dict "value" .Values.metrics.prometheusRule.additionalLabels "context" $) | nindent 4 }} + {{- end }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} + {{- if .Values.commonAnnotations }} + annotations: {{- include "common.tplvalues.render" ( dict "value" .Values.commonAnnotations "context" $ ) | nindent 4 }} + {{- end }} +spec: + {{- include "common.tplvalues.render" ( dict "value" .Values.metrics.prometheusRule.rules "context" $ ) | nindent 2 }} +{{- end }} diff --git a/chart/elastalert2/templates/role.yaml b/chart/elastalert2/templates/role.yaml index 93b9cadd..25b36df2 100644 --- a/chart/elastalert2/templates/role.yaml +++ b/chart/elastalert2/templates/role.yaml @@ -3,11 +3,10 @@ apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: {{ template "elastalert.fullname" . }} - labels: - app: {{ template "elastalert.name" . }} - chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" - release: {{ .Release.Name }} - heritage: {{ .Release.Service }} + labels: {{- include "common.labels.standard" . | nindent 4 }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} rules: - apiGroups: - policy diff --git a/chart/elastalert2/templates/rolebinding.yaml b/chart/elastalert2/templates/rolebinding.yaml index 67a69d1f..92a39e7a 100644 --- a/chart/elastalert2/templates/rolebinding.yaml +++ b/chart/elastalert2/templates/rolebinding.yaml @@ -3,11 +3,10 @@ apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: {{ template "elastalert.fullname" . }} - labels: - app: {{ template "elastalert.name" . 
}} - chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" - release: {{ .Release.Name }} - heritage: {{ .Release.Service }} + labels: {{- include "common.labels.standard" . | nindent 4 }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} roleRef: apiGroup: rbac.authorization.k8s.io kind: Role diff --git a/chart/elastalert2/templates/rules.yaml b/chart/elastalert2/templates/rules.yaml index 1e4afd45..b00399fa 100644 --- a/chart/elastalert2/templates/rules.yaml +++ b/chart/elastalert2/templates/rules.yaml @@ -2,11 +2,10 @@ apiVersion: v1 kind: ConfigMap metadata: name: {{ template "elastalert.fullname" . }}-rules - labels: - app: {{ template "elastalert.name" . }} - chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }} - release: {{ .Release.Name }} - heritage: {{ .Release.Service }} + labels: {{- include "common.labels.standard" . | nindent 4 }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} data: {{- range $key, $value := .Values.rules }} {{ $key | indent 2}}: |- diff --git a/chart/elastalert2/templates/service.yaml b/chart/elastalert2/templates/service.yaml new file mode 100644 index 00000000..98bce2ba --- /dev/null +++ b/chart/elastalert2/templates/service.yaml @@ -0,0 +1,43 @@ +{{- if .Values.metrics.enabled -}} +apiVersion: v1 +kind: Service +metadata: + name: {{ include "common.names.servicename" . | quote }} + labels: {{- include "common.labels.standard" . 
| nindent 4 }} + app.kubernetes.io/component: {{ .Values.appKubernetesIoComponent}} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} + annotations: + {{- if .Values.commonAnnotations }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonAnnotations "context" $ ) | nindent 4 }} + {{- end }} +spec: + type: {{ .Values.metrics.service.type }} + {{- if and .Values.metrics.service.clusterIP (eq .Values.metrics.service.type "ClusterIP") }} + clusterIP: {{ .Values.metrics.service.clusterIP }} + {{- end }} + {{- if ne .Values.metrics.service.type "ClusterIP" }} + externalTrafficPolicy: {{ .Values.metrics.service.externalTrafficPolicy }} + {{- end }} + {{- if and .Values.metrics.service.loadBalancerIP (eq .Values.metrics.service.type "LoadBalancer") }} + loadBalancerIP: {{ .Values.metrics.service.loadBalancerIP }} + {{- end }} + {{- if and (eq .Values.metrics.service.type "LoadBalancer") .Values.metrics.service.loadBalancerSourceRanges }} + loadBalancerSourceRanges: {{- toYaml .Values.metrics.service.loadBalancerSourceRanges | nindent 4 }} + {{- end }} + ports: + - port: {{ .Values.metrics.prometheusPort }} + targetPort: {{ .Values.metrics.prometheusPort }} + protocol: TCP + name: {{ .Values.metrics.prometheusPortName }} + {{- if and (or (eq .Values.metrics.service.type "NodePort") (eq .Values.metrics.service.type "LoadBalancer")) .Values.metrics.service.nodePorts }} + nodePort: {{ .Values.metrics.service.nodePorts }} + {{- else if eq .Values.metrics.service.type "ClusterIP" }} + nodePort: null + {{- end }} + + selector: + {{- include "common.labels.matchLabels" . 
| nindent 4 }} + app.kubernetes.io/component: {{ .Values.appKubernetesIoComponent }} +{{- end }} diff --git a/chart/elastalert2/templates/serviceaccount.yaml b/chart/elastalert2/templates/serviceaccount.yaml index dc1e08c5..e380f51c 100644 --- a/chart/elastalert2/templates/serviceaccount.yaml +++ b/chart/elastalert2/templates/serviceaccount.yaml @@ -3,11 +3,10 @@ apiVersion: v1 kind: ServiceAccount metadata: name: {{ include "elastalert.serviceAccountName" . }} - labels: - app: {{ template "elastalert.name" . }} - chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" - release: {{ .Release.Name }} - heritage: {{ .Release.Service }} + labels: {{- include "common.labels.standard" . | nindent 4 }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} {{- with .Values.serviceAccount.annotations }} annotations: {{- toYaml . | nindent 4 }} diff --git a/chart/elastalert2/templates/servicemonitor.yaml b/chart/elastalert2/templates/servicemonitor.yaml new file mode 100644 index 00000000..387fedfe --- /dev/null +++ b/chart/elastalert2/templates/servicemonitor.yaml @@ -0,0 +1,45 @@ +{{- if and .Values.metrics.enabled .Values.metrics.serviceMonitor.enabled }} +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ include "common.names.servicename" . | quote }} + namespace: {{ default .Release.Namespace .Values.metrics.serviceMonitor.namespace | quote }} + labels: {{- include "common.labels.standard" . 
| nindent 4 }} + app.kubernetes.io/component: {{ .Values.appKubernetesIoComponent}} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} + {{- if .Values.metrics.serviceMonitor.labels }} + {{- toYaml .Values.metrics.serviceMonitor.labels | nindent 4 }} + {{- end }} + {{- if .Values.commonAnnotations }} + annotations: {{- include "common.tplvalues.render" ( dict "value" .Values.commonAnnotations "context" $ ) | nindent 4 }} + {{- end }} +spec: + {{- if .Values.metrics.serviceMonitor.jobLabel }} + jobLabel: {{ .Values.metrics.serviceMonitor.jobLabel }} + {{- end }} + endpoints: + - port: {{ .Values.metrics.prometheusPortName }} + {{- if .Values.metrics.serviceMonitor.interval }} + interval: {{ .Values.metrics.serviceMonitor.interval }} + {{- end }} + {{- if .Values.metrics.serviceMonitor.scrapeTimeout }} + scrapeTimeout: {{ .Values.metrics.serviceMonitor.scrapeTimeout }} + {{- end }} + {{- if .Values.metrics.serviceMonitor.metricRelabelings }} + metricRelabelings: {{ toYaml .Values.metrics.serviceMonitor.metricRelabelings | nindent 8 }} + {{- end }} + {{- if .Values.metrics.serviceMonitor.relabelings }} + relabelings: {{ toYaml .Values.metrics.serviceMonitor.relabelings | nindent 8 }} + {{- end }} + namespaceSelector: + matchNames: + - {{ .Release.Namespace | quote }} + selector: + matchLabels: {{- include "common.labels.matchLabels" . 
| nindent 6 }} + app.kubernetes.io/component: {{ .Values.appKubernetesIoComponent }} + {{- if .Values.metrics.serviceMonitor.selector }} + {{- include "common.tplvalues.render" (dict "value" .Values.metrics.serviceMonitor.selector "context" $) | nindent 6 }} + {{- end }} +{{- end }} diff --git a/chart/elastalert2/templates/smtp-auth.yaml b/chart/elastalert2/templates/smtp-auth.yaml index fb7d6561..da64fdcc 100644 --- a/chart/elastalert2/templates/smtp-auth.yaml +++ b/chart/elastalert2/templates/smtp-auth.yaml @@ -3,8 +3,10 @@ apiVersion: v1 kind: Secret metadata: name: elastalert-smtp-auth - labels: - app: elastalert2 + labels: {{- include "common.labels.standard" . | nindent 4 }} + {{- if .Values.commonLabels }} + {{- include "common.tplvalues.render" ( dict "value" .Values.commonLabels "context" $ ) | nindent 4 }} + {{- end }} type: kubernetes.io/Opaque stringData: smtp_auth.yaml: |- diff --git a/chart/elastalert2/values.yaml b/chart/elastalert2/values.yaml index 75e8d3f7..2f30f0e1 100644 --- a/chart/elastalert2/values.yaml +++ b/chart/elastalert2/values.yaml @@ -1,3 +1,11 @@ +## Chart information +nameOverride: "" +fullnameOverride: "" +namespaceOverride: "" +commonLabels: {} +commonAnnotations: {} +appKubernetesIoComponent: elastalert2 + # number of replicas to run replicaCount: 1 @@ -251,7 +259,138 @@ extraVolumeMounts: [] # subPath: smtp_auth.yaml # readOnly: true -# Prometheus Exporter defined by port: -prometheusScrapeAnnotations: - prometheus.io/scrape: "true" - prometheus.io/path: "/" + +## @section Metrics parameters + +## Prometheus metrics +## +metrics: + ## @param metrics.enabled Enable the export of Prometheus metrics + ## + enabled: false + prometheusPort: 8080 + prometheusPortName: http-alt + # Prometheus Exporter defined by port: + prometheusScrapeAnnotations: + prometheus.io/scrape: "true" + prometheus.io/path: "/" + + service: + type: ClusterIP + # clusterIP: "" + # externalTrafficPolicy: Cluster + # loadBalancerIP: "" + # 
loadBalancerSourceRanges: {} + # nodePorts: "" + + ## Prometheus Operator ServiceMonitor configuration + ## + serviceMonitor: + ## @param metrics.serviceMonitor.enabled Specify if a ServiceMonitor will be deployed for Prometheus Operator + ## + enabled: false + + ## @param metrics.serviceMonitor.namespace Namespace in which Prometheus is running + ## + namespace: "" + + ## @param metrics.serviceMonitor.labels Extra labels for the ServiceMonitor + ## Normally used for prometheus operator to detect the servicemonitor if deployed to different namespace + ## labels: + ## release: prometheus-operator + labels: {} + + ## @param metrics.serviceMonitor.jobLabel The name of the label on the target service to use as the job name in Prometheus + ## + jobLabel: "" + + ## @param metrics.serviceMonitor.interval How frequently to scrape metrics + ## e.g: + ## interval: 10s + ## + interval: "" + ## @param metrics.serviceMonitor.scrapeTimeout Timeout after which the scrape is ended + ## e.g: + ## scrapeTimeout: 10s + ## + scrapeTimeout: "" + ## @param metrics.serviceMonitor.metricRelabelings [array] Specify additional relabeling of metrics + ## metricRelabelings: + ## # Drop GO metrics + ## - sourceLabels: [__name__] + ## regex: go_.* + ## action: drop + ## # Drop python_gc metrics + ## - sourceLabels: [__name__] + ## regex: python_gc.* + ## action: drop + ## # Normalise POD names + ## - sourceLabels: [pod] + ## regex: (.+elastalert2)\-([\w\d]+)\-([\w\d]+) + ## replacement: $1 + ## targetLabel: pod + metricRelabelings: [] + + ## @param metrics.serviceMonitor.relabelings [array] Specify general relabeling + ## + relabelings: [] + ## @param metrics.serviceMonitor.selector Prometheus instance selector labels + ## ref: https://github.com/bitnami/charts/tree/master/bitnami/prometheus-operator#prometheus-configuration + ## + selector: {} + + ## PrometheusRule CRD configuration + ## + prometheusRule: + ## @param metrics.prometheusRule.enabled If `true`, creates a Prometheus Operator 
PrometheusRule (also requires `metrics.enabled` to be `true`) + ## + enabled: false + ## @param metrics.prometheusRule.namespace Namespace in which the PrometheusRule CRD is created + ## + namespace: "" + + ## @param metrics.prometheusRule.additionalLabels Additional labels for the prometheusRule + ## to be detected by prometheus-operator + ## additionalLabels: + ## release: prometheus-operator + additionalLabels: {} + + ## @param metrics.prometheusRule.rules Prometheus Rules for Thanos components + ## These are just examples rules, please adapt them to your needs. + ## rules: |- + ## groups: + ## - name: elastalert + ## rules: + ## - alert: elastalert Pod down + ## annotations: + ## description: Prometheus is unable to scrape metrics service. Check pod logs for details + ## summary: elastalert POD is down + ## expr: up{service="{{ template "common.names.servicename" . }}",container="elastalert"} == 0 + ## for: 5m + ## labels: + ## severity: critical + ## production: 'True' + ## - alert: elastalert file descriptors use + ## annotations: + ## description: Elastalert pod nearly exhausting file descriptors + ## summary: too many file descriptors used + ## expr: |- + ## process_open_fds{service="{{ template "common.names.servicename" . }}",container="elastalert"} + ## / + ## process_max_fds{service="{{ template "common.names.servicename" . }}",container="elastalert"} + ## > 0.9 + ## for: 3m + ## labels: + ## severity: critical + ## production: 'True' + ## - alert: elastalert scrapes failing + ## annotations: + ## description: Elastalert is not scraping for a rule {{ "{{" }} $labels.rule_name {{ "}}" }} + ## summary: scrapes for rule stalled {{ "{{" }} $labels.rule_name {{ "}}" }} + ## expr: |- + ## rate(elastalert_scrapes_total{service="{{ template "common.names.servicename" . 
}}",container="elastalert"}[1m]) == 0
+ ## for: 5m
+ ## labels:
+ ## severity: critical
+ ## production: 'True'
+ rules: []

From dcbdea293fcca128e78a5aaec696d5883728b2c4 Mon Sep 17 00:00:00 2001
From: Pedro Santos
Date: Mon, 11 Jul 2022 09:10:35 +0000
Subject: [PATCH 2/6] add PR information to changelog

---
 CHANGELOG.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 782ba690..1a8786d9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,10 +3,10 @@
 ## Breaking changes
 - When using HTTP POST 2, it is no longer necessary to pre-escape strings (should they contain control chars) from events in elastic search which are replaced by the jinja2 template.
-- [Kubernetes] [Breaking] reconfigure metrics to follow prometheus operator nomenclature. `metrics` value, now control the addition of metrics endpoint (command argument), the creation of a service to expose the metrics endpoint and the (optional) creation of prometheus-operator objects: serviceMonitor and prometheurRules to match implementations of other charts. The labels of the chart have been modified, so you'll need to uninstall and reinstall the chart for the upgrade to work. - @PedroMSantosD
+- [Kubernetes] [Breaking] reconfigure metrics to follow prometheus operator nomenclature. `metrics` value, now control the addition of metrics endpoint (command argument), the creation of a service to expose the metrics endpoint and the (optional) creation of prometheus-operator objects: serviceMonitor and prometheurRules to match implementations of other charts. The labels of the chart have been modified, so you'll need to uninstall and reinstall the chart for the upgrade to work. - [#902](https://github.com/jertel/elastalert2/pull/902) - @PedroMSantosD
 ## New features
-- [Kubernetes] Chart is now able to create a service for the metrics, and optional prometheus-operator custom resources serviceMonitor and prometheusRule.
+- [Kubernetes] Chart is now able to create a service for the metrics, and optional prometheus-operator custom resources serviceMonitor and prometheusRule. - [#902](https://github.com/jertel/elastalert2/pull/902) - @PedroMSantosD
 ## Other changes
 - Upgrade pylint 2.13.8 to 2.14.3, Upgrade sphinx 4.5.0 to 5.0.2 - [#891](https://github.com/jertel/elastalert2/pull/891) - @nsano-rururu

From 5f9e97e515c4d843edc1766617b6d47b3e864513 Mon Sep 17 00:00:00 2001
From: Pedro Santos
Date: Tue, 12 Jul 2022 08:08:05 +0000
Subject: [PATCH 3/6] update readme

---
 chart/elastalert2/README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/chart/elastalert2/README.md b/chart/elastalert2/README.md
index 8fc3ee5d..35ebf64b 100644
--- a/chart/elastalert2/README.md
+++ b/chart/elastalert2/README.md
@@ -97,12 +97,12 @@ The command removes all the Kubernetes components associated with the chart and
 | `smtp_auth.username` | Optional SMTP mail server username. If the value is not empty, the smtp_auth secret will be created automatically. | `NULL` |
 | `smtp_auth.password` | Optional SMTP mail server passwpord. This must be specified if the above field, `smtp_auth.username` is also specified. | `NULL` |
 | `metrics` | enable elastalert prometheus endpoint, add prometheus.io annotations to pod and create a service pointing to the port for prometheus to scrape the metrics | `false` |
-| `metrics.prometheusPort` | if "metrics" is set to true, CP port pod will expose prometheus metrics on. | `8080` |
+| `metrics.prometheusPort` | if "metrics" is set to true, prometheus metrics will be exposed by the pod on this port. | `8080` |
 | `metrics.prometheusPortName` | name of the port where metrics are exposed | `http-alt` |
-| `metrics.prometheusScrapeAnnotations` | if metrics are enabled, annotations to add to the pod for prometheus configuration. prometheus.io/port is also added uring the prometheusPort and prometheusPortName values | {prometheus.io/scrape: "true" prometheus.io/path: "/"} |
-| `metrics.serviceMonitor.enabled` | If metrics are enabled, create a servicemonitor custom resource for prometheus-operator to detect and monitor the service with the merics endpoint | `false` |
-| `metrics.serviceMonitor.labels` | labels to add to the serviceMonitor object for prometheus-operator to detect and append it to your prometheus configuration, when deployed on a different namespas as the prometheus operator | `{}` |
-| `metrics.serviceMonitor.metricRelabelings` | list of prometheus metric relabeling configs to aply to scrape. Example@ drop python_gc metrics or alter pod name | `[]` |
-| `metrics.prometheusRule.enabled` | If metrics are enabled, create a prometheusRule custom resource for prometheus-operator to customise scrape configuration | `false` |
-| `metrics.prometheusRule.additionalLabels` | labels to add to the prometheusRule object for prometheus-operator to detect and append it to your prometheus configuration, when deployed on a different namespas as the prometheus operator | `{}` |
-| `metrics.prometheusRule.rules` | group of rules to add to the prometheus configuration, example Alerting rules for pod down, or for file descriptors. Define as multiline Yaml string | `[]` |
\ No newline at end of file
+| `metrics.prometheusScrapeAnnotations` | if metrics are enabled, annotations to add to the pod for prometheus configuration. prometheus.io/port is also added uring the prometheusPort and prometheusPortName values | `{prometheus.io/scrape: "true" prometheus.io/path: "/"}` |
+| `metrics.serviceMonitor.enabled` | If metrics are enabled, create a servicemonitor custom resource for prometheus-operator to detect and configure the merics endpoint on prometheus. | `false` |
+| `metrics.serviceMonitor.labels` | labels to add to the serviceMonitor object for prometheus-operator to detect, when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
+| `metrics.serviceMonitor.metricRelabelings` | list of prometheus metric relabeling configs to aply to scrape. Example: drop python_gc metrics or alter pod name | `[]` |
+| `metrics.prometheusRule.enabled` | If metrics are enabled, create a prometheusRule custom resource for prometheus-operator | `false` |
+| `metrics.prometheusRule.additionalLabels` | labels to add to the prometheusRule object for prometheus-operator to detect it, when deployed when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
+| `metrics.prometheusRule.rules` | group of alerting and/or recording rules to add to the prometheus configuration, example Alerting rules for pod down, or for file descriptors. Should be added as multiline Yaml string | `` |

From e4ce60ee876ffeefc1d1d04bf2d09bbf8eaaf8d8 Mon Sep 17 00:00:00 2001
From: Pedro Santos
Date: Wed, 13 Jul 2022 07:43:44 +0000
Subject: [PATCH 4/6] Fix README typos.

---
 chart/elastalert2/README.md | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/chart/elastalert2/README.md b/chart/elastalert2/README.md
index 35ebf64b..6f452c7e 100644
--- a/chart/elastalert2/README.md
+++ b/chart/elastalert2/README.md
@@ -3,6 +3,8 @@
 An ElastAlert 2 helm chart is available, and can be installed into an existing Kubernetes cluster by following the instructions below.
+Inspiration for optional serviceMonitor and prometheusRules objects, along with source code for calculating and implementing labels on the chart, ported from https://github.com/bitnami/charts/tree/master/bitnami/thanos/templates
+
 ## Installing the Chart
 Add the elastalert2 repository to your Helm configuration:
@@ -83,7 +85,7 @@ The command removes all the Kubernetes components associated with the chart and
 | `serviceAccount.annotations` | ServiceAccount annotations | |
 | `podSecurityPolicy.create` | [DEPRECATED] Create pod security policy resources | `false` |
 | `resources` | Container resource requests and limits | {} |
-| `rulesVolumeName` | Specifies the rules volume to be mounted. Can be changed for mounting a custom rules folder via the extraVolumes parameter, instead of using the default rules configMap or secret rule mounting method. | "rules" |
+| `rulesVolumeName` | Specifies the rules volume to be mounted. Can be changed for mounting a custom rules folder via the extraVolumes parameter, instead of using the default rules configMap or secret rule mounting metrics. | "rules" |
 | `rules` | Rule and alert configuration for ElastAlert 2 | {} example shown in values.yaml |
 | `runIntervalMins` | Default interval between alert checks, in minutes | 1 |
 | `realertIntervalMins` | Time between alarms for same rule, in minutes | `NULL` |
@@ -96,13 +98,13 @@ The command removes all the Kubernetes components associated with the chart and
 | `tolerations` | Tolerations for deployment | [] |
 | `smtp_auth.username` | Optional SMTP mail server username. If the value is not empty, the smtp_auth secret will be created automatically. | `NULL` |
 | `smtp_auth.password` | Optional SMTP mail server passwpord. This must be specified if the above field, `smtp_auth.username` is also specified. | `NULL` |
-| `metrics` | enable elastalert prometheus endpoint, add prometheus.io annotations to pod and create a service pointing to the port for prometheus to scrape the metrics | `false` |
-| `metrics.prometheusPort` | if "metrics" is set to true, prometheus metrics will be exposed by the pod on this port. | `8080` |
-| `metrics.prometheusPortName` | name of the port where metrics are exposed | `http-alt` |
-| `metrics.prometheusScrapeAnnotations` | if metrics are enabled, annotations to add to the pod for prometheus configuration. prometheus.io/port is also added uring the prometheusPort and prometheusPortName values | `{prometheus.io/scrape: "true" prometheus.io/path: "/"}` |
-| `metrics.serviceMonitor.enabled` | If metrics are enabled, create a servicemonitor custom resource for prometheus-operator to detect and configure the merics endpoint on prometheus. | `false` |
-| `metrics.serviceMonitor.labels` | labels to add to the serviceMonitor object for prometheus-operator to detect, when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
-| `metrics.serviceMonitor.metricRelabelings` | list of prometheus metric relabeling configs to aply to scrape. Example: drop python_gc metrics or alter pod name | `[]` |
+| `metrics` | Enable elastalert prometheus endpoint, add prometheus.io annotations to pod and create a service pointing to the port for prometheus to scrape the metrics | `false` |
+| `metrics.prometheusPort` | If "metrics" is set to true, prometheus metrics will be exposed by the pod on this port. | `8080` |
+| `metrics.prometheusPortName` | Name of the port where metrics are exposed | `http-alt` |
+| `metrics.prometheusScrapeAnnotations` | If metrics are enabled, annotations to add to the pod for prometheus configuration. prometheus.io/port is also added during the prometheusPort and prometheusPortName values | `{prometheus.io/scrape: "true" prometheus.io/path: "/"}` |
+| `metrics.serviceMonitor.enabled` | If metrics are enabled, create a serviceMonitor custom resource for prometheus-operator to detect and configure the metrics endpoint on prometheus. | `false` |
+| `metrics.serviceMonitor.labels` | Labels to add to the prometheusRule object for prometheus-operator to detect it, when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
+| `metrics.serviceMonitor.metricRelabelings` | List of prometheus metric relabeling configs to apply to scrape. Example: drop python_gc metrics or alter pod name | `[]` |
 | `metrics.prometheusRule.enabled` | If metrics are enabled, create a prometheusRule custom resource for prometheus-operator | `false` |
-| `metrics.prometheusRule.additionalLabels` | labels to add to the prometheusRule object for prometheus-operator to detect it, when deployed when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
-| `metrics.prometheusRule.rules` | group of alerting and/or recording rules to add to the prometheus configuration, example Alerting rules for pod down, or for file descriptors. Should be added as multiline Yaml string | `` |
+| `metrics.prometheusRule.additionalLabels` | Labels to add to the prometheusRule object for prometheus-operator to detect it, when deployed when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
+| `metrics.prometheusRule.rules` | Group of alerting and/or recording rules to add to the prometheus configuration, example Alerting rules for pod down, or for file descriptors. Should be added as multiline Yaml string | `` |

From 71a41f14183c2517e699f72209a335fe8afb5767 Mon Sep 17 00:00:00 2001
From: Pedro Santos
Date: Wed, 13 Jul 2022 07:47:51 +0000
Subject: [PATCH 5/6] Fix CHANGELOG typos.

---
 CHANGELOG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1a8786d9..a390c072 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,7 +3,7 @@
 ## Breaking changes
 - When using HTTP POST 2, it is no longer necessary to pre-escape strings (should they contain control chars) from events in elastic search which are replaced by the jinja2 template.
-- [Kubernetes] [Breaking] reconfigure metrics to follow prometheus operator nomenclature. `metrics` value, now control the addition of metrics endpoint (command argument), the creation of a service to expose the metrics endpoint and the (optional) creation of prometheus-operator objects: serviceMonitor and prometheurRules to match implementations of other charts. The labels of the chart have been modified, so you'll need to uninstall and reinstall the chart for the upgrade to work. - [#902](https://github.com/jertel/elastalert2/pull/902) - @PedroMSantosD
+- [Kubernetes] [Breaking] Reconfigure metrics to follow prometheus operator nomenclature. `metrics` value, now control the addition of metrics endpoint (command argument), the creation of a service to expose the metrics endpoint and the (optional) creation of prometheus-operator objects: serviceMonitor and prometheurRules to match implementations of other charts. The labels of the chart have been modified, so you'll need to uninstall and reinstall the chart for the upgrade to work. - [#902](https://github.com/jertel/elastalert2/pull/902) - @PedroMSantosD
 ## New features
 - [Kubernetes] Chart is now able to create a service for the metrics, and optional prometheus-operator custom resources serviceMonitor and prometheusRule. - [#902](https://github.com/jertel/elastalert2/pull/902) - @PedroMSantosD

From a5807f68b015fbf89f3a080041f1dd6bf5a43594 Mon Sep 17 00:00:00 2001
From: Jason Ertel
Date: Wed, 13 Jul 2022 07:28:29 -0400
Subject: [PATCH 6/6] Update README.md

---
 chart/elastalert2/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chart/elastalert2/README.md b/chart/elastalert2/README.md
index 6f452c7e..1982b64f 100644
--- a/chart/elastalert2/README.md
+++ b/chart/elastalert2/README.md
@@ -85,7 +85,7 @@ The command removes all the Kubernetes components associated with the chart and
 | `serviceAccount.annotations` | ServiceAccount annotations | |
 | `podSecurityPolicy.create` | [DEPRECATED] Create pod security policy resources | `false` |
 | `resources` | Container resource requests and limits | {} |
-| `rulesVolumeName` | Specifies the rules volume to be mounted. Can be changed for mounting a custom rules folder via the extraVolumes parameter, instead of using the default rules configMap or secret rule mounting metrics. | "rules" |
+| `rulesVolumeName` | Specifies the rules volume to be mounted. Can be changed for mounting a custom rules folder via the extraVolumes parameter, instead of using the default rules configMap or secret rule mounting method. | "rules" |
 | `rules` | Rule and alert configuration for ElastAlert 2 | {} example shown in values.yaml |
 | `runIntervalMins` | Default interval between alert checks, in minutes | 1 |
 | `realertIntervalMins` | Time between alarms for same rule, in minutes | `NULL` |
@@ -106,5 +106,5 @@ The command removes all the Kubernetes components associated with the chart and
 | `metrics.serviceMonitor.labels` | Labels to add to the prometheusRule object for prometheus-operator to detect it, when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
 | `metrics.serviceMonitor.metricRelabelings` | List of prometheus metric relabeling configs to apply to scrape. Example: drop python_gc metrics or alter pod name | `[]` |
 | `metrics.prometheusRule.enabled` | If metrics are enabled, create a prometheusRule custom resource for prometheus-operator | `false` |
-| `metrics.prometheusRule.additionalLabels` | Labels to add to the prometheusRule object for prometheus-operator to detect it, when deployed when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
+| `metrics.prometheusRule.additionalLabels` | Labels to add to the prometheusRule object for prometheus-operator to detect it, when deployed on a namespace different from the one where prometheus-operator is running. | `{}` |
 | `metrics.prometheusRule.rules` | Group of alerting and/or recording rules to add to the prometheus configuration, example Alerting rules for pod down, or for file descriptors. Should be added as multiline Yaml string | `` |