Datadog Integration (#3407)
* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation, deployment override failsafes

* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation | final initial-push

* changelog entry update

* datadog-integration: updated consul-server agent server.config (enable_debug) and telemetry.config update | enable_debug to server.config

* curt pr review changes (minus extraConfig templating verification changes)

* global.metrics.AgentMetrics -> global.metrics.enableAgentMetrics

* dogstatsd and otlp mutually exclusive verification checks

* breaking changes now incorporated into consul.validateExtraConfig helper template function as precheck

* extraConfig hash updates post merge conflict update

* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets

* update changelog .txt to match new PR number

* updated server-statefulset.yaml to correct ad.datadoghq.com/consul.logs annotation to valid single quote string

* update UDP dogstatsdPort behavior to exclude including a port value if using a kube service address (as determined by user overrides)

* update _helpers.tpl consul.ValidateDatadogConfiguration func to account for using 'https' as protocol => should fail

* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openmetrics scrape on consul

* correct otlp protocol helpers.tpl check to lower-case the protocol to match the open-telemetry-deployment.yaml behavior

* fix server-acl-init command_test.go for datadog token policy - datacenter should have been dc1

* add in server-statefulset bats test for extraConfig validation testing
natemollica-nm committed Feb 12, 2024
1 parent 499f4bd commit b8315a8
Showing 17 changed files with 1,384 additions and 32 deletions.
13 changes: 13 additions & 0 deletions .changelog/3407.txt
@@ -0,0 +1,13 @@
```release-note:feature
helm: introduces `global.metrics.datadog` overrides to streamline consul-k8s datadog integration.
helm: introduces `server.enableAgentDebug` to expose agent [`enable_debug`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#enable_debug) configuration.
helm: introduces `global.metrics.disableAgentHostName` to expose agent [`telemetry.disable_hostname`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-disable_hostname) configuration.
helm: introduces `global.metrics.enableHostMetrics` to expose agent [`telemetry.enable_host_metrics`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-enable_host_metrics) configuration.
helm: introduces `global.metrics.prefixFilter` to expose agent [`telemetry.prefix_filter`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-prefix_filter) configuration.
helm: introduces `global.metrics.datadog.dogstatsd.dogstatsdAddr` to expose agent [`telemetry.dogstatsd_addr`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-dogstatsd_addr) configuration.
helm: introduces `global.metrics.datadog.dogstatsd.dogstatsdTags` to expose agent [`telemetry.dogstatsd_tags`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-dogstatsd_tags) configuration.
helm: introduces required `ad.datadoghq.com/` annotations and `tags.datadoghq.com/` labels for integration with [Datadog Autodiscovery](https://docs.datadoghq.com/integrations/consul/?tab=containerized) and [Datadog Unified Service Tagging](https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging/?tab=kubernetes#serverless-environment) for Consul.
helm: introduces automated unix domain socket hostPath mounting for containerized integration with datadog within consul-server statefulset.
helm: introduces `global.metrics.datadog.otlp` override options to allow OTLP metrics forwarding to Datadog Agent.
control-plane: adds `server-acl-init` datadog agent token creation for datadog integration.
```
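
As a rough illustration of how the new overrides fit together, a values file enabling DogStatsD over a Unix domain socket might look like the sketch below. The key names are taken from the changelog entries and templates in this commit; the file name, the tag values, and the choice of booleans are assumptions made for the example, and `dogstatsdAddr` is left unset because the expected socket address is not shown in this diff. Per the commit message, the chart validates that DogStatsD and OTLP are not enabled together.

```yaml
# values-datadog.yaml (hypothetical file name)
global:
  metrics:
    enabled: true
    enableAgentMetrics: true
    disableAgentHostName: true      # rendered as telemetry.disable_hostname
    enableHostMetrics: true         # rendered as telemetry.enable_host_metrics
    datadog:
      enabled: true
      dogstatsd:
        enabled: true
        socketTransportType: "UDS"  # triggers the /var/run/datadog hostPath mount
        # Illustrative tags only; use whatever your Datadog setup expects.
        dogstatsdTags: ["source:consul", "consul_service:consul-server"]
server:
  enableAgentDebug: false           # rendered as enable_debug in the server config
```

With `dogstatsd.enabled` set, the `server-acl-init` job in this commit does not pass `-create-dd-agent-token`, since that token is only requested for the check-based (non-DogStatsD) integration paths.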
203 changes: 185 additions & 18 deletions charts/consul/templates/_helpers.tpl

Large diffs are not rendered by default.

38 changes: 38 additions & 0 deletions charts/consul/templates/datadog-agent-role.yaml
@@ -0,0 +1,38 @@
{{- if .Values.global.metrics.datadog.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ template "consul.fullname" . }}-datadog-metrics
  namespace: {{ .Release.Namespace }}
  labels:
    app: datadog
    heritage: {{ .Release.Service }}
    release: {{ .Release.Name }}
    component: agent
{{- if (or (and .Values.global.openshift.enabled .Values.server.exposeGossipAndRPCPorts) .Values.global.enablePodSecurityPolicies) }}
{{- if .Values.global.enablePodSecurityPolicies }}
rules:
  - apiGroups: ["policy"]
    resources: ["podsecuritypolicies"]
    resourceNames:
      - {{ template "consul.fullname" . }}-datadog-metrics
    verbs:
      - use
{{- end }}
{{- if (and .Values.global.openshift.enabled .Values.server.exposeGossipAndRPCPorts ) }}
  - apiGroups: ["security.openshift.io"]
    resources: ["securitycontextconstraints"]
    resourceNames:
      - {{ template "consul.fullname" . }}-datadog-metrics
    verbs:
      - use
{{- end }}
{{- else}}
rules:
  - apiGroups: [ "" ]
    resources: [ "secrets" ]
    resourceNames:
      - {{ .Release.Namespace }}-datadog-agent-metrics-acl-token
    verbs: [ "get", "watch", "list" ]
{{- end }}
{{- end }}
26 changes: 26 additions & 0 deletions charts/consul/templates/datadog-agent-rolebinding.yaml
@@ -0,0 +1,26 @@
{{- if .Values.global.metrics.datadog.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ template "consul.fullname" . }}-datadog-metrics
  namespace: {{ .Release.Namespace }}
  labels:
    app: {{ template "consul.name" . }}
    chart: {{ template "consul.chart" . }}
    heritage: {{ .Release.Service }}
    release: {{ .Release.Name }}
    component: agent
subjects:
  - kind: ServiceAccount
    apiGroup: ""
    name: datadog-agent
    namespace: datadog
  - kind: ServiceAccount
    apiGroup: ""
    name: datadog-cluster-agent
    namespace: datadog
roleRef:
  kind: Role
  name: {{ template "consul.fullname" . }}-datadog-metrics
  apiGroup: ""
{{- end }}
4 changes: 4 additions & 0 deletions charts/consul/templates/server-acl-init-job.yaml
@@ -273,6 +273,10 @@ spec:
-create-enterprise-license-token=true \
{{- end }}
{{- if (and (not .Values.global.metrics.datadog.dogstatsd.enabled) .Values.global.metrics.datadog.enabled .Values.global.acls.manageSystemACLs) }}
-create-dd-agent-token=true \
{{- end }}
{{- if .Values.server.snapshotAgent.enabled }}
-snapshot-agent=true \
{{- end }}
9 changes: 8 additions & 1 deletion charts/consul/templates/server-config-configmap.yaml
@@ -30,6 +30,7 @@ data:
{{- if .Values.server.logLevel }}
"log_level": "{{ .Values.server.logLevel | upper }}",
{{- end }}
"enable_debug": {{ .Values.server.enableAgentDebug }},
"domain": "{{ .Values.global.domain }}",
"limits": {
"request_limits": {
@@ -187,7 +188,13 @@ data:
telemetry-config.json: |-
{
"telemetry": {
"prometheus_retention_time": "{{ .Values.global.metrics.agentMetricsRetentionTime }}"
"prometheus_retention_time": "{{ .Values.global.metrics.agentMetricsRetentionTime }}",
"disable_hostname": {{ .Values.global.metrics.disableAgentHostName }},{{ template "consul.prefixFilter" . }}
"enable_host_metrics": {{ .Values.global.metrics.enableHostMetrics }}{{- if .Values.global.metrics.datadog.dogstatsd.enabled }},{{ template "consul.dogstatsdAaddressInfo" . }}
{{- if .Values.global.metrics.datadog.dogstatsd.enabled }}
"dogstatsd_tags": {{ .Values.global.metrics.datadog.dogstatsd.dogstatsdTags | toJson }}
{{- end }}
{{- end }}
}
}
{{- end }}
83 changes: 82 additions & 1 deletion charts/consul/templates/server-statefulset.yaml
@@ -19,6 +19,9 @@
{{- end -}}
{{ template "consul.validateRequiredCloudSecretsExist" . }}
{{ template "consul.validateCloudSecretKeys" . }}
{{ template "consul.validateMetricsConfig" . }}
{{ template "consul.validateDatadogConfiguration" . }}
{{ template "consul.validateExtraConfig" . }}
# StatefulSet to run the actual Consul server cluster.
apiVersion: apps/v1
kind: StatefulSet
@@ -62,6 +65,11 @@ spec:
release: {{ .Release.Name }}
component: server
hasDNS: "true"
{{- if .Values.global.metrics.datadog.enabled }}
"tags.datadoghq.com/version": {{ template "consul.versionInfo" . }}
"tags.datadoghq.com/env": {{ template "consul.name" . }}
"tags.datadoghq.com/service": "consul-server"
{{- end }}
{{- if .Values.server.extraLabels }}
{{- toYaml .Values.server.extraLabels | nindent 8 }}
{{- end }}
@@ -124,6 +132,7 @@ spec:
{{- tpl .Values.server.annotations . | nindent 8 }}
{{- end }}
{{- if (and .Values.global.metrics.enabled .Values.global.metrics.enableAgentMetrics) }}
{{- if not .Values.global.metrics.datadog.openMetricsPrometheus.enabled }}
"prometheus.io/scrape": "true"
"prometheus.io/path": "/v1/agent/metrics"
{{- if .Values.global.tls.enabled }}
@@ -134,6 +143,67 @@ spec:
"prometheus.io/scheme": "http"
{{- end }}
{{- end }}
{{- if .Values.global.metrics.datadog.enabled }}
"ad.datadoghq.com/tolerate-unready": "true"
"ad.datadoghq.com/consul.logs": {{ .Values.global.metrics.datadog.dogstatsd.dogstatsdTags | toJson | replace "[" "[{" | replace "]" "}]" | replace ":" "\": \"" | join "\",\"" | squote }}
{{- if .Values.global.metrics.datadog.openMetricsPrometheus.enabled }}
"ad.datadoghq.com/consul.checks": |
{
"openmetrics": {
"init_config": {},
"instances": [
{
{{- if .Values.global.tls.enabled }}
"openmetrics_endpoint": "https://consul-server.{{ .Release.Namespace }}.svc:8501/v1/agent/metrics?format=prometheus",
"tls_cert": "/etc/datadog-agent/conf.d/consul.d/certs/tls.crt",
"tls_private_key": "/etc/datadog-agent/conf.d/consul.d/certs/tls.key",
"tls_ca_cert": "/etc/datadog-agent/conf.d/consul.d/ca/tls.crt",
{{- else }}
"openmetrics_endpoint": "http://consul-server.{{ .Release.Namespace }}.svc:8500/v1/agent/metrics?format=prometheus",
{{- end }}
{{- if ( .Values.global.acls.manageSystemACLs) }}
"headers": {
"X-Consul-Token": "ENC[k8s_secret@{{ .Release.Namespace }}/{{ .Release.Namespace }}-datadog-agent-metrics-acl-token/token]"
},
{{- end }}
"namespace": "{{ .Release.Namespace }}",
"metrics": [ ".*" ]
}
]
}
}
{{- else if (not .Values.global.metrics.datadog.dogstatsd.enabled) }}
"ad.datadoghq.com/consul.checks": |
{
"consul": {
"init_config": {},
"instances": [
{
{{- if .Values.global.tls.enabled }}
"url": "https://consul-server.{{ .Release.Namespace }}.svc:8501",
"tls_cert": "/etc/datadog-agent/conf.d/consul.d/certs/tls.crt",
"tls_private_key": "/etc/datadog-agent/conf.d/consul.d/certs/tls.key",
"tls_ca_cert": "/etc/datadog-agent/conf.d/consul.d/ca/tls.crt",
{{- else }}
"url": "http://consul-server.consul.svc:8500",
{{- end }}
"use_prometheus_endpoint": true,
{{- if ( .Values.global.acls.manageSystemACLs) }}
"acl_token": "ENC[k8s_secret@{{ .Release.Namespace }}/{{ .Release.Namespace }}-datadog-agent-metrics-acl-token/token]",
{{- end }}
"new_leader_checks": true,
"network_latency_checks": true,
"catalog_checks": true,
"auth_type": "basic"
}
]
}
}
{{- else }}
"ad.datadoghq.com/consul.metrics_exclude": "true"
{{- end }}
{{- end }}
{{- end }}
spec:
{{- if .Values.server.affinity }}
affinity:
@@ -219,6 +289,12 @@ spec:
emptyDir:
medium: "Memory"
{{- end }}
{{- if and .Values.global.metrics.datadog.enabled .Values.global.metrics.datadog.dogstatsd.enabled (eq .Values.global.metrics.datadog.dogstatsd.socketTransportType "UDS" ) }}
- name: dsdsocket
hostPath:
path: /var/run/datadog
type: DirectoryOrCreate
{{- end }}
{{- range .Values.server.extraVolumes }}
- name: userconfig-{{ .name }}
{{ .type }}:
@@ -257,7 +333,7 @@ spec:
{{- include "consul.restrictedSecurityContext" . | nindent 8 }}
containers:
- name: consul
image: "{{ default .Values.global.image .Values.server.image }}"
image: "{{ default .Values.global.image .Values.server.image | trimPrefix "\"" | trimSuffix "\"" }}"
imagePullPolicy: {{ .Values.global.imagePullPolicy }}
env:
- name: ADVERTISE_IP
@@ -455,6 +531,11 @@ spec:
mountPath: /consul/license
readOnly: true
{{- end }}
{{- if and .Values.global.metrics.datadog.enabled .Values.global.metrics.datadog.dogstatsd.enabled (eq .Values.global.metrics.datadog.dogstatsd.socketTransportType "UDS" ) }}
- name: dsdsocket
mountPath: /var/run/datadog
readOnly: true
{{- end }}
{{- range .Values.server.extraVolumes }}
- name: userconfig-{{ .name }}
readOnly: true
13 changes: 13 additions & 0 deletions charts/consul/templates/telemetry-collector-deployment.yaml
@@ -248,6 +248,19 @@ spec:
            - name: SSL_CERT_DIR
              value: "/etc/ssl/certs:/trusted-cas"
            {{- end }}
            {{- if .Values.global.metrics.datadog.otlp.enabled }}
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            {{- if eq (.Values.global.metrics.datadog.otlp.protocol | lower ) "http" }}
            - name: CO_OTEL_HTTP_ENDPOINT
              value: "http://$(HOST_IP):4318"
            {{- else if eq (.Values.global.metrics.datadog.otlp.protocol | lower) "grpc" }}
            - name: CO_OTEL_HTTP_ENDPOINT
              value: "grpc://$(HOST_IP):4317"
            {{- end }}
            {{- end }}
            {{- include "consul.extraEnvironmentVars" .Values.telemetryCollector | nindent 12 }}
          command:
            - "/bin/sh"
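
For illustration, assuming `global.metrics.datadog.otlp.enabled=true` and `global.metrics.datadog.otlp.protocol=http`, the hunk above would be expected to add roughly the following entries to the telemetry collector container. This is a sketch derived from reading the template, not captured `helm template` output, and it omits the surrounding env entries.

```yaml
# Expected shape of the rendered env entries (sketch, not actual chart output)
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP        # IP of the node running the pod
  - name: CO_OTEL_HTTP_ENDPOINT
    value: "http://$(HOST_IP):4318"     # Kubernetes expands $(HOST_IP) from the entry above
```

With `protocol=grpc` the same variable name is used, with the value `grpc://$(HOST_IP):4317`. The protocol value is lower-cased before comparison, so `HTTP` and `http` select the same branch.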
49 changes: 49 additions & 0 deletions charts/consul/test/unit/server-acl-init-job.bats
@@ -2444,3 +2444,52 @@ load _helpers
    yq 'any(contains("-enable-resource-apis=true"))' | tee /dev/stderr)
  [ "${actual}" = "true" ]
}

#--------------------------------------------------------------------
# global.metrics.datadog

@test "serverACLInit/Job: -create-dd-agent-token not set when datadog=false and manageSystemACLs=true" {
  cd `chart_dir`
  local command=$(helm template \
      -s templates/server-acl-init-job.yaml \
      --set 'global.acls.manageSystemACLs=true' \
      . | tee /dev/stderr |
      yq '.spec.template.spec.containers[0].command' | tee /dev/stderr)

  local actual=$( echo "$command" |
    yq 'any(contains("-create-dd-agent-token"))' | tee /dev/stderr)
  [ "${actual}" = "false" ]
}

@test "serverACLInit/Job: -create-dd-agent-token set when global.metrics.datadog=true and global.acls.manageSystemACLs=true" {
  cd `chart_dir`
  local command=$(helm template \
      -s templates/server-acl-init-job.yaml \
      --set 'global.metrics.enabled=true' \
      --set 'global.metrics.enableAgentMetrics=true' \
      --set 'global.metrics.datadog.enabled=true' \
      --set 'global.acls.manageSystemACLs=true' \
      . | tee /dev/stderr |
      yq '.spec.template.spec.containers[0].command' | tee /dev/stderr)

  local actual=$( echo "$command" |
    yq 'any(contains("-create-dd-agent-token"))' | tee /dev/stderr)
  [ "${actual}" = "true" ]
}

@test "serverACLInit/Job: -create-dd-agent-token NOT set when global.metrics.datadog=true, global.metrics.datadog.dogstatsd.enabled=true, and global.acls.manageSystemACLs=true" {
  cd `chart_dir`
  local command=$(helm template \
      -s templates/server-acl-init-job.yaml \
      --set 'global.metrics.enabled=true' \
      --set 'global.metrics.enableAgentMetrics=true' \
      --set 'global.metrics.datadog.enabled=true' \
      --set 'global.metrics.datadog.dogstatsd.enabled=true' \
      --set 'global.acls.manageSystemACLs=true' \
      . | tee /dev/stderr |
      yq '.spec.template.spec.containers[0].command' | tee /dev/stderr)

  local actual=$( echo "$command" |
    yq 'any(contains("-create-dd-agent-token"))' | tee /dev/stderr)
  [ "${actual}" = "false" ]
}