[bitnami/redis-cluster] Update script uses old node IP address #4064

Closed
tpolekhin opened this issue Oct 20, 2020 · 16 comments · Fixed by #4362

@tpolekhin

Which chart:
redis-cluster-3.2.8

Describe the bug
When helm upgrade is run with a cluster scale-up, all Redis pods in the StatefulSet are rolled because the environment variable listing the nodes changes. The cluster-update Kubernetes job resolves a pod's DNS name to an IP address only once, so when that pod is rolled and its IP address changes, the update job can no longer perform any operations.

To Reproduce
Steps to reproduce the behavior:

  1. helm install redis-cluster bitnami/redis-cluster --set 'cluster.replicas=0'
  2. helm upgrade redis-cluster bitnami/redis-cluster --set 'cluster.nodes=6,cluster.replicas=0,cluster.init=false,cluster.update.addNodes=true,cluster.update.currentNumberOfNodes=3'
  3. tail the logs of the cluster-update pod
  4. See error
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out

Expected behavior
The cluster-update job should account for pod IP address changes during the rolling upgrade process.

Version of Helm and Kubernetes:

  • Output of helm version:
version.BuildInfo{Version:"v3.3.3", GitCommit:"55e3ca022e40fe200fbc855938995f40b2a68ce0", GitTreeState:"dirty", GoVersion:"go1.15.2"}
  • Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T23:30:39Z", GoVersion:"go1.14.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.12+IKS", GitCommit:"d09005b98837bb6061c0f643a27383c02b003205", GitTreeState:"clean", BuildDate:"2020-09-16T21:47:16Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Additional context
The redis-cluster-0 pod had an IP address of 172.21.190.236 before the upgrade. During the rolling upgrade of the StatefulSet its IP address changed to 172.21.190.237, and the cluster-update job never finished:

Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
Add-node 5 172.21.233.222 failed, retrying
@javsalgar
Contributor

Hi,

I was unable to reproduce the issue. However, I see that the default number of nodes is 6 (not 3), so the command above should specify a different number of nodes (like 9) in order to perform the upgrade. Could you confirm that's the case?

@tpolekhin
Author

@javsalgar I didn't want to paste the entire values override here, just to keep it readable, but I initially deployed with 3 nodes and 0 replicas, and then scaled up to 6 nodes. Here's the full overrides file I used for helm install:

cluster:
  nodes: 3
  replicas: 0
  # init: false
  # update:
  #   addNodes: true
  #   currentNumberOfNodes: 3

updateJob:
  activeDeadlineSeconds: 1800
initJob:
  activeDeadlineSeconds: 1800

usePassword: false

redis:
  useAOFPersistence: "no"
  configmap: |-
    save ""
    cluster-require-full-coverage no
    cluster-allow-reads-when-down yes

metrics:
  enabled: true
  serviceMonitor:
    enabled: true

Values I used for helm upgrade

cluster:
  nodes: 6
  replicas: 0
  init: false
  update:
    addNodes: true
    currentNumberOfNodes: 3

updateJob:
  activeDeadlineSeconds: 1800
initJob:
  activeDeadlineSeconds: 1800

usePassword: false

redis:
  useAOFPersistence: "no"
  configmap: |-
    save ""
    cluster-require-full-coverage no
    cluster-allow-reads-when-down yes

metrics:
  enabled: true
  serviceMonitor:
    enabled: true

@javsalgar
Contributor

Hi,

I believe that the minimum number of nodes for the cluster to work properly is 6. That could explain the issues when scaling from 3 to 6. However, could you try again going from 6 to 9 to see if the same issue happens?

@javsalgar
Contributor

However, I would like @miguelaeh to confirm this.

@srfaytkn

srfaytkn commented Oct 23, 2020

Same issue here.
Chart version: 3.2.8

@miguelaeh
Contributor

Hi,
I want to let you know that I am checking the issue. I have two concerns about this:

  1. What @javsalgar said about 6 nodes is true; 6 nodes are required to create the cluster properly.
  2. I am not totally sure what the implications of setting cluster.replicas=0 could be.

@tpolekhin
Author

@miguelaeh Hello,
Thank you for looking into this. Let me try to explain our use case a little bit.

We're using Redis as a cache layer for an application, and we use a cluster of only master nodes with sharding (slots) to distribute the data across the different masters.
We're not interested in data redundancy, so we don't use followers (replicas), and we don't care if we lose part of the cache when a master goes down, as long as it comes back in a reasonable amount of time. That's why we also disabled any type of persistence to the PVC volume, except for the config.

I would imagine the 6-node rule applies if you expect to get 3 master nodes and 3 followers (1 for each master), but in our case a 3-node cluster was totally fine, because we only use master nodes.
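For context, this kind of master-only cluster is roughly what you get when creating it by hand with redis-cli; the snippet below is just an illustrative sketch with placeholder IPs, not the command the chart's init job actually runs:

# Illustrative only: create a 3-master Redis Cluster with no replicas.
# The IP addresses are placeholders, not values from this issue.
redis-cli --cluster create \
  10.0.0.1:6379 10.0.0.2:6379 10.0.0.3:6379 \
  --cluster-replicas 0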

@miguelaeh
Contributor

Hi @tpolekhin ,
Thank you for the explanation.
There is one more thing I am worried about with that approach: if you disable persistence, I would say the nodes will not be able to reconnect after a pod dies, since they rely on a file called nodes.sh that holds an associative array mapping node IDs to pod IPs so they can reconnect after a pod failure. This is logic we added to the image because there is no other way to do it. (At least we didn't find another one.)
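Conceptually, it keeps a mapping along these lines (just an illustrative sketch, not the actual contents of the file shipped in the image; the node IDs here are made up and the IPs are taken from the logs above):

# Illustrative sketch of the nodes.sh mapping, not the file shipped in the image.
# Keys are Redis Cluster node IDs (made up here), values are the last known pod IPs.
declare -A node_ips=(
  ["07c37dfeb235213a872192d90877d0cd55635b91"]="172.21.190.236"
  ["67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1"]="172.21.233.222"
)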

Regarding using only 3 masters, I am not totally sure, but it could work. From the official docs:

Note that the minimal cluster that works as expected requires to contain at least three master nodes. For your first tests it is strongly suggested to start a six nodes cluster with three masters and three slaves.

So it seems having 3 replicas as well is a recommendation, not mandatory.

@tpolekhin
Author

@miguelaeh we still use PVCs to keep the config, but we have disabled any type of Redis data persistence via the config:

redis:
  useAOFPersistence: "no"
  configmap: |-
    save ""
    cluster-require-full-coverage no
    cluster-allow-reads-when-down yes

Regarding running only 3 masters:

Note that the minimal cluster that works as expected requires to contain at least three master nodes.

I was able to successfully create and test this configuration with our load-testing tool, so it works as expected; even when I delete one of the master pods, it comes back and rejoins the cluster fine.
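For example, a check along these lines confirms the behavior (illustrative commands, not taken from this thread; the pod names assume the default release name):

# Delete one master and confirm the cluster recovers once the pod rejoins.
kubectl delete pod redis-cluster-1
kubectl exec redis-cluster-0 -- redis-cli cluster info | grep cluster_state   # expect cluster_state:ok again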

@miguelaeh
Contributor

Ok, I understand the issue.
In the container logic we maintain an associative array between node IDs and IPs, which should be updated with the new IPs every time a pod is restarted.
What do you mean exactly by:

The cluster-update Kubernetes job resolves a pod's DNS name to an IP address only once

?
Maybe we are missing a scenario in that logic.

@tpolekhin
Author

This issue is about the cluster-update Kubernetes job pod, which is used to add new Redis nodes to the existing cluster when you run a scale-up upgrade.
The issue I saw is that adding a node is not possible until all pods in the StatefulSet are running and ready, but because the upgrade operation changes an environment variable in the StatefulSet with the list of all nodes, all of the pods in the StatefulSet are re-created via the rolling upgrade strategy.
The upgrade job resolves the DNS name of the redis-cluster-0 pod into an IP address and tries to use that to run redis-cli commands against the cluster.
The problem happens when the redis-cluster-0 pod is rolled by the StatefulSet and gets a new IP address, but the upgrade job is still using the old redis-cluster-0 pod IP address and is basically stuck in a retry loop.
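In rough pseudo-shell terms, the failing pattern looks like this (an illustrative sketch, not the chart's actual update script; the headless-service hostname and the new_node_ip variable are assumptions):

# Illustrative sketch of the failing pattern, not the chart's actual update script.
# The hostname is resolved once, before the StatefulSet has finished rolling...
first_node_ip="$(getent hosts redis-cluster-0.redis-cluster-headless | awk '{print $1}')"

# ...so every retry keeps talking to the old, now-unreachable IP.
while ! redis-cli --cluster add-node "${new_node_ip}:6379" "${first_node_ip}:6379"; do
  echo "Add-node failed, retrying"
  sleep 5
done

Re-resolving the hostname inside the loop, or waiting for the rollout to finish before resolving at all, would avoid the stale IP.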

@miguelaeh
Contributor

miguelaeh commented Oct 29, 2020

Hi @tpolekhin ,
I understand your point. Nevertheless, we are running the job using a post-upgrade hook, which means it runs after the upgrade. The job waits until pod -0 has an IP before executing commands, so, since it runs after the upgrade, it should not be picking up the previous IP.
Am I missing something here?

@tpolekhin
Author

@miguelaeh not exactly. I'm seeing that the Helm post-upgrade hook starts as soon as the StatefulSet has been modified, not after all pods in the StatefulSet have been rolled. Since the StatefulSet rolls pods in reverse order, pod -0 is rolled last, and by that time the upgrade job has already resolved its IP address and keeps using it to perform operations.
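You can see this ordering by watching the pods during the upgrade (illustrative command; the label selector is an assumption and depends on the release name):

kubectl get pods -l app.kubernetes.io/name=redis-cluster -w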

@miguelaeh
Contributor

miguelaeh commented Oct 30, 2020

Hi @tpolekhin ,
I see. Then we need to figure out how to wait until pod -0 is upgraded, instead of waiting for the IP, to avoid this issue.
Do you have any suggestions?

@miguelaeh miguelaeh added the on-hold Issues or Pull Requests with this label will never be considered stale label Oct 30, 2020
@tpolekhin
Author

tpolekhin commented Nov 2, 2020

@miguelaeh I honestly do not understand why you need to do this DNS resolution in the first place; why not just use redis-cluster-0:6379?
If you want to wait until all pod operations are completed, you could use something like

kubectl rollout status sts/redis-cluster -w

This would watch the rolling upgrade of the StatefulSet and exit when it's done.
After that you can start resolving DNS names for the pods and execute operations.

@miguelaeh
Contributor

miguelaeh commented Nov 3, 2020

Hi @tpolekhin ,

why not just use redis-cluster-0:6379?

That is just the pod name, so it will not be resolved to an IP. In any case, we could use something like {{ template "common.names.fullname" . }}-"$new_node_index".{{ template "common.names.fullname" . }}-headless instead.
The problem with that approach is that Redis does not support the use of hostnames to join nodes: redis/redis#2186

So I guess we can add the command you shared before the DNS resolution, so the job waits until the StatefulSet is upgraded.
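Something along these lines, assuming kubectl is available in the job image and that the resource names below match the release (this is just a sketch of the idea, not necessarily how the final fix will be implemented):

# Sketch: wait for the rollout before resolving pod IPs in the update job.
# 1. Block until the StatefulSet rollout has finished, so pod -0 has its final IP.
kubectl rollout status sts/redis-cluster -w

# 2. Only then resolve the pod hostname and run cluster commands against that IP.
#    new_node_ip is a hypothetical variable standing in for the node being added.
first_node_ip="$(getent hosts redis-cluster-0.redis-cluster-headless | awk '{print $1}')"
redis-cli --cluster add-node "${new_node_ip}:6379" "${first_node_ip}:6379"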

tohutohu added a commit to tohutohu/charts that referenced this issue Nov 13, 2020
javsalgar added a commit that referenced this issue Dec 28, 2020
…eIp IP address again (#4362)

* fix #4064

* bump chart version

Co-authored-by: Javier J. Salmerón-García <[email protected]>
@carrodher carrodher removed the on-hold Issues or Pull Requests with this label will never be considered stale label Dec 14, 2021