Strimzi Drain Cleaner is a utility which helps with moving the Apache Kafka® pods deployed by Strimzi from Kubernetes nodes which are being drained. It is useful if you want the Strimzi operator to move the pods instead of Kubernetes itself. The advantage of this approach is that the Strimzi operator makes sure that no partition replicas become under-replicated during the node draining.

To use it:
- Configure your Kafka topics to have a replication factor higher than 1 and make sure that `min.insync.replicas` is always set to a number lower than the replication factor. The availability of topics with a replication factor of `1`, or with `min.insync.replicas` set to the same value as the replication factor, will always be affected when the brokers are restarted.
- Deploy Kafka using Strimzi and configure the `PodDisruptionBudgets` for Kafka and ZooKeeper to have `maxUnavailable` set to `0`. This will block Kubernetes from moving the pods on its own.
    apiVersion: kafka.strimzi.io/v1beta2
    kind: Kafka
    metadata:
      name: my-cluster
    spec:
      kafka:
        replicas: 3
        listeners:
          - name: plain
            port: 9092
            type: internal
            tls: false
          - name: tls
            port: 9093
            type: internal
            tls: true
        config:
          offsets.topic.replication.factor: 3
          transaction.state.log.replication.factor: 3
          transaction.state.log.min.isr: 2
        storage:
          type: jbod
          volumes:
            - id: 0
              type: persistent-claim
              size: 100Gi
              deleteClaim: false
        template:
          podDisruptionBudget:
            maxUnavailable: 0
      zookeeper:
        replicas: 3
        storage:
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
        template:
          podDisruptionBudget:
            maxUnavailable: 0
      entityOperator:
        topicOperator: {}
        userOperator: {}
- Deploy the Strimzi Drain Cleaner.
- Drain the node with some Kafka or ZooKeeper pods using the `kubectl drain` command.
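The topic requirement from the first step can be sketched as a `KafkaTopic` resource. This is a minimal illustration only: the topic name and partition count are assumptions, and it assumes a Strimzi version that serves `KafkaTopic` at the same `v1beta2` API version as the `Kafka` resource above.

```yaml
# Hypothetical topic for illustration; the name and partition count are assumptions.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster   # the Kafka cluster from the example above
spec:
  partitions: 3
  replicas: 3                        # replication factor higher than 1
  config:
    min.insync.replicas: 2           # lower than the replication factor
```

With these values, one broker can be restarted at a time while producers using `acks=all` can still write to the topic.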
Strimzi Drain Cleaner uses Kubernetes Admission Control features and Validating Web-hooks to find out when something tries to evict the Kafka or ZooKeeper pods.
It annotates them with the `strimzi.io/manual-rolling-update` annotation, which tells the Strimzi Cluster Operator that the pod needs to be restarted.
Strimzi will roll it in the next reconciliation using its algorithms which make sure the cluster is available.
This is supported from Strimzi 0.21.0.
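For illustration, the metadata of a pod marked for rolling might look roughly like the fragment below. The pod name is hypothetical, and the value `"true"` is assumed to match what Strimzi expects for a manually triggered rolling update; check the annotation on your own pods to confirm.

```yaml
# Fragment of a Pod's metadata after an eviction request was intercepted.
# The pod name is a hypothetical example.
metadata:
  name: my-cluster-kafka-0
  annotations:
    strimzi.io/manual-rolling-update: "true"   # assumed value; picked up in the next reconciliation
```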
By default, the Drain Cleaner drains Kafka and ZooKeeper pods.
If you want to use the Drain Cleaner with only one of them, you can edit the Deployment and set the `STRIMZI_DRAIN_KAFKA` or `STRIMZI_DRAIN_ZOOKEEPER` environment variable to `false`.
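As a sketch, the relevant part of the Deployment could look like the fragment below, here leaving Kafka draining on and turning ZooKeeper draining off. The container name is an assumption; match it to the one in your installation files.

```yaml
# Fragment of the Drain Cleaner Deployment (spec.template.spec).
# The container name is an assumption, not taken from the installation files.
containers:
  - name: strimzi-drain-cleaner
    env:
      - name: STRIMZI_DRAIN_KAFKA
        value: "true"     # keep handling Kafka pods
      - name: STRIMZI_DRAIN_ZOOKEEPER
        value: "false"    # ignore ZooKeeper pods
```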
On OpenShift, you can have the certificates needed for the web-hook generated automatically and injected into the pod / web-hook configuration.
To install the Drain Cleaner on OpenShift, use the `./install/openshift` directory:
kubectl apply -f ./install/openshift
On Kubernetes, when you use Cert Manager, you can have the certificates needed for the web-hook generated automatically and injected into the pod / web-hook configuration.
To install the Drain Cleaner on Kubernetes with Cert Manager installed, use the `./install/certmanager` directory:
kubectl apply -f ./install/certmanager
On Kubernetes, when you do not use Cert Manager, the certificates needed for the web-hook need to be generated manually.
Follow the instructions in the `./install/kubernetes` directory to generate and install the certificates.
On Kubernetes, you can also use Helm to install Strimzi Drain Cleaner using our Helm Chart. The Helm Chart can be used to install it both with Cert Manager support as well as with your own certificates.
By default, the Drain Cleaner deployment watches the Kubernetes secret with the TLS certificates for changes such as certificate renewals.
If it detects such a change, it restarts itself to reload the TLS certificate.
The Drain Cleaner installation files enable this by default.
You can disable it by setting the `STRIMZI_CERTIFICATE_WATCH_ENABLED` environment variable to `false`.
When it is enabled, you can also use the following environment variables to configure the detailed behavior:
| Environment Variable | Description | Default |
|---|---|---|
| `STRIMZI_CERTIFICATE_WATCH_ENABLED` | Enables or disables the certificate watch | `false` |
| `STRIMZI_CERTIFICATE_WATCH_NAMESPACE` | The namespace where the Drain Cleaner is deployed and where the certificate secret exists | `strimzi-drain-cleaner` |
| `STRIMZI_CERTIFICATE_WATCH_POD_NAME` | The Drain Cleaner Pod name | |
| `STRIMZI_CERTIFICATE_WATCH_SECRET_NAME` | The name of the secret with TLS certificates | `strimzi-drain-cleaner` |
| `STRIMZI_CERTIFICATE_WATCH_SECRET_KEYS` | The list of fields inside the secret which contain the TLS certificates | `tls.crt,tls.key` |
The best way to configure `STRIMZI_CERTIFICATE_WATCH_NAMESPACE` and `STRIMZI_CERTIFICATE_WATCH_POD_NAME` is using the Kubernetes Downward API.
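A minimal sketch of such a configuration is shown below. The `fieldRef` entries use the standard Downward API field paths for the pod's namespace and name; the secret name and keys simply repeat the defaults from the table and may differ in your installation.

```yaml
# Fragment of the Drain Cleaner container definition (env section only).
env:
  - name: STRIMZI_CERTIFICATE_WATCH_ENABLED
    value: "true"
  - name: STRIMZI_CERTIFICATE_WATCH_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace   # namespace the pod runs in
  - name: STRIMZI_CERTIFICATE_WATCH_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name        # name of this pod
  - name: STRIMZI_CERTIFICATE_WATCH_SECRET_NAME
    value: strimzi-drain-cleaner        # default from the table; adjust if needed
  - name: STRIMZI_CERTIFICATE_WATCH_SECRET_KEYS
    value: tls.crt,tls.key
```

Using the Downward API here avoids hard-coding the namespace and keeps the pod name correct even though it changes on every restart.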
You can easily test how it works:
- Install Strimzi on your cluster
- Deploy a Kafka cluster with the Pod Disruption Budget configuration having `maxUnavailable` set to `0`, as shown in the example above
- Install the Drain Cleaner
- Drain one of the Kubernetes nodes running one of the Kafka or ZooKeeper pods:

      kubectl drain <worker-node> --delete-emptydir-data --ignore-daemonsets --timeout=6000s --force

- Watch how it works:
  - The `kubectl drain` command will wait for the Kafka / ZooKeeper pods to be drained
  - The Drain Cleaner log should show how it gets the eviction events
  - The Strimzi Cluster Operator log should show how it rolls the pods which are being evicted
If you encounter any issues while using Strimzi, you can get help using:
You can contribute by raising any issues you find and/or fixing issues by opening Pull Requests. All bugs, tasks or enhancements are tracked as GitHub issues.
The development documentation describes how to build, test and release Strimzi Drain Cleaner.
Strimzi is licensed under the Apache License, Version 2.0.