This repository has been archived by the owner on May 16, 2023. It is now read-only.

[elasticsearch] New readiness probes causing full cluster-restart #631

Closed
ckotzbauer opened this issue May 18, 2020 · 10 comments · Fixed by #638
Labels
bug Something isn't working elasticsearch

Comments

@ckotzbauer
Contributor

Chart version: 7.7.0

Kubernetes version: 1.17.1

Kubernetes provider: On-prem

Helm Version: 3.2.0

Output of helm get release
USER VALUES

esConfig:
  elasticsearch.yml: |
    action.auto_create_index: "-hd-*"
esJavaOpts: -Xmx5g -Xms5g
esMajorVersion: 6
image: "private-image-based-on-official-oss:6.4.2"
imagePullPolicy: IfNotPresent
imagePullSecrets:
- name: some-credentials
imageTag: 6.4.2
ingress:
  annotations:
    ingress.kubernetes.io/ssl-redirect: "true"
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/auth-realm: Authentication Required
    nginx.ingress.kubernetes.io/auth-secret: dev-elasticsearch-ingress-auth
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/proxy-body-size: 60m
  enabled: true
  hosts:
  - ""
  path: /
  tls:
  - hosts:
    - ""
lifecycle:
  postStart:
    exec:
      command:
      - bash
      - -c
      - |
        #!/bin/bash
        cd /usr/share/elasticsearch/plugins/xxx
        /opt/jdk-10.0.2/bin/jar -cf config.jar config.cfg
        chmod 777 config.jar
persistence:
  enabled: true
podSecurityPolicy:
  create: false
rbac:
  create: true
resources:
  limits:
    cpu: 2000m
    memory: 8Gi
  requests:
    cpu: 200m
    memory: 8Gi
sysctlInitContainer:
  enabled: false
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: rook-ceph-cephfs

Describe the bug:
I updated the chart to the newest version 7.7.0 and expected the three Elasticsearch nodes to be updated one after another, waiting until the cluster is green again (in the past, the most recently restarted pod was not ready until the cluster was green again). Now the pod became ready after a few minutes and Kubernetes moved on too quickly, so the cluster went red.

Steps to reproduce:

  1. Update the chart-installation to 7.7.0

Expected behavior:
The readiness probe works as expected and marks the pod as not ready until the cluster is green again.

Any additional context:
I have done this kind of update to my release often in the past without such problems; it only happened today with the new version.

@fatmcgav
Contributor

So I suspect this behaviour is related to the change to the readinessProbe as part of #586 .

Would you be able to provide the output of kubectl get events and kubectl describe <elasticsearch master pod>?
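
For reference, something along these lines would be enough (a sketch; substitute your own namespace and master pod name, I'm using dev-elasticsearch and elasticsearch-master-0 as examples):

kubectl get events -n dev-elasticsearch --sort-by=.lastTimestamp
kubectl describe pod elasticsearch-master-0 -n dev-elasticsearch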

@fatmcgav fatmcgav added the bug Something isn't working label May 19, 2020
@fatmcgav
Contributor

Also, as a general point of interest, if you're pinning to a custom 6.4.2 image, what do you aim to accomplish by bumping the chart version?

The only real reason to bump the chart version is to pick up changes related to new versions and bug fixes.

@ckotzbauer
Contributor Author

Yes, I also think that #586 is the reason for this.

The events are not visible anymore and I can't restart the cluster right now. If you really need them, I can do it this evening...

I did not explicitly upgrade the chart just because there was an update available. But I made changes to the chart settings yesterday and thereby implicitly upgraded the chart to the newest version, as I don't pin the chart version unless it's really needed.

Output of kubectl describe elasticsearch-master-0
Name:           elasticsearch-master-0
Namespace:      dev-elasticsearch
Priority:       0
Node:           managed-k8s-worker-5/xx.xx.xx.xx
Start Time:     Mon, 18 May 2020 15:02:22 +0200
Labels:         app=elasticsearch-master
                chart=elasticsearch
                controller-revision-hash=elasticsearch-master-648f485fbf
                heritage=Helm
                release=dev-elasticsearch
                statefulset.kubernetes.io/pod-name=elasticsearch-master-0
Annotations:    backup.velero.io/backup-volumes: dev-backup
                configchecksum: 72b33fd5cf47279b81ec67f0bc0ccadfb3cc9eef20d3a88ecbf5274cc1a58f5
                kubernetes.io/psp: sag-default-psp
Status:         Running
IP:             xx.xx.xx.xx
Controlled By:  StatefulSet/elasticsearch-master
Containers:
  elasticsearch:
    Container ID:   docker://85abf100f7b3324dcadaac0fc50e8df8bf8bdc929e0e715932db3a0510d41344
    Image:          private-image-based-on-official-oss:6.4.2
    Image ID:       docker-pullable://xxxx
    Ports:          9200/TCP, 9300/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Mon, 18 May 2020 15:03:09 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  8Gi
    Requests:
      cpu:      200m
      memory:   8Gi
    Readiness:  exec [sh -c #!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=green&timeout=1s' )
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file

if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
  BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
else
  BASIC_AUTH=''
fi

if [ -f "${START_FILE}" ]; then
  echo 'Elasticsearch is already running, lets check the node is healthy'
  HTTP_CODE=$(curl -XGET -s -k ${BASIC_AUTH} -o /dev/null -w '%{http_code}' http://127.0.0.1:9200/)
  RC=$?
  if [[ ${RC} -ne 0 ]]; then
    echo "curl -XGET -s -k \${BASIC_AUTH} -o /dev/null -w '%{http_code}' http://127.0.0.1:9200/ failed with RC ${RC}"
    exit ${RC}
  fi
  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
  if [[ ${HTTP_CODE} == "200" ]]; then
    exit 0
  elif [[ ${HTTP_CODE} == "503" && "6" == "6" ]]; then
    exit 0
  else
    echo "curl -XGET -s -k \${BASIC_AUTH} -o /dev/null -w '%{http_code}' http://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
    exit 1
  fi

else
  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
  if curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200/_cluster/health?wait_for_status=green&timeout=1s ; then
    touch ${START_FILE}
    exit 0
  else
    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
    exit 1
  fi
fi
] delay=10s timeout=5s period=10s #success=3 #failure=3
    Environment:
      node.name:                           elasticsearch-master-0 (v1:metadata.name)
      discovery.zen.minimum_master_nodes:  2
      discovery.zen.ping.unicast.hosts:    elasticsearch-master-headless
      cluster.name:                        elasticsearch
      network.host:                        0.0.0.0
      ES_JAVA_OPTS:                        -Xmx5g -Xms5g
      node.data:                           true
      node.ingest:                         true
      node.master:                         true
    Mounts:
      /usr/share/elasticsearch/backup from dev-backup (rw)
      /usr/share/elasticsearch/backup-script from dev-backup-script (ro)
      /usr/share/elasticsearch/config/elasticsearch.yml from esconfig (rw,path="elasticsearch.yml")
      /usr/share/elasticsearch/data from elasticsearch-master (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from elasticsearch-master-token-xg869 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  elasticsearch-master:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elasticsearch-master-elasticsearch-master-0
    ReadOnly:   false
  esconfig:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      elasticsearch-master-config
    Optional:  false
  dev-backup:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  dev-elasticsearch-backup
    ReadOnly:   false
  dev-backup-script:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      dev-elasticsearch-backup
    Optional:  false
  elasticsearch-master-token-xg869:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-master-token-xg869
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

@fatmcgav
Contributor

The events are not visible anymore and I can't restart the cluster right now. If you really need them, I can do it this evening...

Ok, I'll see if I can replicate locally, and I'll come back if I need the events...

But I made changes to the chart settings yesterday and thereby implicitly upgraded the chart to the newest version, as I don't pin the chart version unless it's really needed.

Ah, ok... So for now, I'd suggest pinning the chart version to a previous version so that any future config updates don't result in a full restart.

@ckotzbauer
Contributor Author

Yes, that's the plan. If you need more info, I will try to get it 😉

@fatmcgav
Contributor

@ckotzbauer OK, I've done a bit more testing locally, and I don't think the change in readinessProbe is necessarily the issue, as I was able to reproduce the loss of availability on the elasticsearch-master service by just deleting the current master pod.

That got me looking through the config and history, and I came across this issue: #63.

By setting masterTerminationFix: true, I didn't observe any service disruption when either deleting the master pod, or doing a helm upgrade to a new chart version.

Do you want to try applying that config item?
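
For reference, the change is just one extra top-level entry in the values you already posted (a minimal sketch, assuming everything else stays unchanged):

# values.yaml (excerpt)
# Workaround from #63: avoids the service disruption seen when the current
# master pod is deleted or rolled during an upgrade
masterTerminationFix: true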

@ckotzbauer
Contributor Author

Hm, okay, that's an interesting point. I know this behavior too, where the election takes too long and the cluster is down when only the master is deleted... I can try this.

But I (personally) don't think that this causes the problem of K8s doing the rollout too fast. The new pods are marked as ready, so K8s thinks it can go on. The intention of the readinessProbe implementation in the statefulset is to prevent K8s from doing this by delaying the ready state until the cluster has recovered again. And that seems to be the part that no longer works. Or did I miss something here...? 🤔

@fatmcgav
Contributor

@ckotzbauer So a quick update from me.

We ran some more tests internally, and were able to reproduce the "too fast" rollout on an internal 7.7.0 cluster which has a non-zero amount of data.

After a bit of digging, it looks like quotes were missing around the URL in the initial curl command, which contains an &, so the shell was effectively cutting the command short...
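
To illustrate the effect (a sketch of the shell behaviour, not the actual change in #638): because the URL in the probe is unquoted, the & acts as a shell control operator, curl is sent to the background with only the wait_for_status=green part, and the if condition ends up evaluating the trailing timeout=1s assignment, which always exits 0. The probe therefore reports ready regardless of cluster health.

# Unquoted, as in the probe output above: '&' backgrounds curl and the probe
# effectively tests 'timeout=1s', which always succeeds
curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200/_cluster/health?wait_for_status=green&timeout=1s

# Quoted: the full query string reaches Elasticsearch, so --fail only returns 0
# once the cluster has actually reached green within the timeout
curl -XGET -s -k --fail ${BASIC_AUTH} "http://127.0.0.1:9200/_cluster/health?wait_for_status=green&timeout=1s"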

I've opened #638 which re-works that behaviour, and I'm running some more tests internally...

We'll probably be looking to do a patch release for this pretty quickly.

However, as you're running a 6.4.2 image, my recommendation would be to pin to the 6.8.x version of the chart.
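
For reference, pinning just means passing --version explicitly on the upgrade (a sketch; the elastic repo alias, the dev-elasticsearch release name and the values.yaml file are assumptions on my side, and the patch number is whichever 6.8 release you pick):

# Add the Elastic Helm repository if it isn't configured already
helm repo add elastic https://helm.elastic.co
# Pin the chart to a 6.8 release instead of floating to the latest version
helm upgrade dev-elasticsearch elastic/elasticsearch --version 6.8.<patch> -f values.yaml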

@ckotzbauer
Contributor Author

Thank you so much for digging in, @fatmcgav. I really appreciate it!
This sounds promising. I will pin my chart to the 6.8.x version and wait for the patch release.
Thanks again for your work!

@fatmcgav
Contributor

@ckotzbauer The fix in #638 has been merged, and back-ported to the 6.8 and 7.7 branches for inclusion in the next minor release.
