[bitnami/redis-cluster] Update script uses old node IP address #4064

Closed
tpolekhin opened this issue Oct 20, 2020 · 16 comments · Fixed by #4362

@tpolekhin

Which chart:
redis-cluster-3.2.8

Describe the bug
When helm upgrade is run with a cluster scale-up, all Redis pods in the StatefulSet are rolled because the environment variable listing the nodes changes. The cluster-update Kubernetes job resolves a pod's DNS name to an IP address only once, so when that pod is rolled and its IP address changes, the update job can no longer perform any operations.

To Reproduce
Steps to reproduce the behavior:

  1. helm install redis-cluster bitnami/redis-cluster --set 'cluster.replicas=0'
  2. helm upgrade redis-cluster bitnami/redis-cluster --set 'cluster.nodes=6,cluster.replicas=0,cluster.init=false,cluster.update.addNodes=true,cluster.update.currentNumberOfNodes=3'
  3. tail the logs of the cluster-update pod
  4. See error
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out

Expected behavior
The cluster-update job should account for pod IP address changes during the rolling upgrade process.

Version of Helm and Kubernetes:

  • Output of helm version:
version.BuildInfo{Version:"v3.3.3", GitCommit:"55e3ca022e40fe200fbc855938995f40b2a68ce0", GitTreeState:"dirty", GoVersion:"go1.15.2"}
  • Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T23:30:39Z", GoVersion:"go1.14.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.12+IKS", GitCommit:"d09005b98837bb6061c0f643a27383c02b003205", GitTreeState:"clean", BuildDate:"2020-09-16T21:47:16Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Additional context
The redis-cluster-0 pod had an IP address of 172.21.190.236 before the upgrade. During the rolling upgrade of the StatefulSet its IP address changed to 172.21.190.237, and the cluster-update job never finished:

Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Add-node 5 172.21.233.222 failed, retrying
>>> Adding node 172.21.233.222:6379 to cluster 172.21.190.236:6379
Could not connect to Redis at 172.21.190.236:6379: Connection timed out
Add-node 5 172.21.233.222 failed, retrying
@javsalgar
Contributor

Hi,

I was unable to reproduce the issue. However, I see that the default number of nodes is 6 (not 3), so the command above should specify a different number of nodes (like 9) in order to perform the upgrade. Could you confirm that's the case?

@tpolekhin
Author

@javsalgar I didn't want to paste the entire values override here, just to keep it readable, but I initially deployed with 3 nodes and 0 replicas, and then scaled up to 6 nodes. Here's the full overrides file I used for helm install:

cluster:
  nodes: 3
  replicas: 0
  # init: false
  # update:
  #   addNodes: true
  #   currentNumberOfNodes: 3

updateJob:
  activeDeadlineSeconds: 1800
initJob:
  activeDeadlineSeconds: 1800

usePassword: false

redis:
  useAOFPersistence: "no"
  configmap: |-
    save ""
    cluster-require-full-coverage no
    cluster-allow-reads-when-down yes

metrics:
  enabled: true
  serviceMonitor:
    enabled: true

Values I used for helm upgrade

cluster:
  nodes: 6
  replicas: 0
  init: false
  update:
    addNodes: true
    currentNumberOfNodes: 3

updateJob:
  activeDeadlineSeconds: 1800
initJob:
  activeDeadlineSeconds: 1800

usePassword: false

redis:
  useAOFPersistence: "no"
  configmap: |-
    save ""
    cluster-require-full-coverage no
    cluster-allow-reads-when-down yes

metrics:
  enabled: true
  serviceMonitor:
    enabled: true

@javsalgar
Contributor

Hi,

I believe that the minimum number of nodes for the cluster to work properly is 6. That could explain the issues when scaling from 3 to 6. However, could you try again going from 6 to 9 to see if the same issue happens?

@javsalgar
Contributor

However, I would like @miguelaeh to confirm this.

@srfaytkn

srfaytkn commented Oct 23, 2020

Same issue here.
Chart version: 3.2.8

@miguelaeh
Contributor

Hi,
I want to let you know that I am checking the issue. I have two concerns about this:

  1. What @javsalgar said about 6 nodes is true; 6 nodes are required to create the cluster properly.
  2. I am not totally sure what the implications of setting cluster.replicas=0 could be.

@tpolekhin
Author

@miguelaeh Hello,
Thank you for looking into this. Let me try to explain our use case a little bit.

We're using Redis as a cache layer for an application, and we use a cluster of only master nodes with sharding (slots) to distribute the data across the different masters.
We're not interested in data redundancy, so we don't use followers (replicas), and we don't care if we lose part of the cache when a master goes down, as long as it comes back in a reasonable amount of time. That's why we also disabled any type of persistence to the PVC volume, except for the config.

I would imagine the 6-node rule applies if you expect to get 3 master nodes and 3 followers (1 for each master), but in our case a 3-node cluster was totally fine, because we only use master nodes.
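For context, this kind of master-only cluster is roughly what you get when creating it by hand with redis-cli; the snippet below is just an illustrative sketch with placeholder IPs, not the command the chart's init job actually runs:

# Illustrative only: create a 3-master Redis Cluster with no replicas.
# The IP addresses are placeholders, not values from this issue.
redis-cli --cluster create \
  10.0.0.1:6379 10.0.0.2:6379 10.0.0.3:6379 \
  --cluster-replicas 0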

@miguelaeh
Contributor

Hi @tpolekhin ,
Thank you for the explanation.
There is one more thing I am worried about with that approach: if you disable persistence, I would say the nodes will not be able to reconnect after a pod dies, since they rely on a file called nodes.sh that holds an associative array mapping node IDs to pod IPs so they can reconnect after a pod failure. This is logic we added to the image because there is no other way to do it. (At least we didn't find another one.)
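Conceptually, it keeps a mapping along these lines (just an illustrative sketch, not the actual contents of the file shipped in the image; the node IDs here are made up and the IPs are taken from the logs above):

# Illustrative sketch of the nodes.sh mapping, not the file shipped in the image.
# Keys are Redis Cluster node IDs (made up here), values are the last known pod IPs.
declare -A node_ips=(
  ["07c37dfeb235213a872192d90877d0cd55635b91"]="172.21.190.236"
  ["67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1"]="172.21.233.222"
)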

Regarding using only 3 masters, I am not totally sure, but it could work. From the official docs:

Note that the minimal cluster that works as expected requires to contain at least three master nodes. For your first tests it is strongly suggested to start a six nodes cluster with three masters and three slaves.

So it seems having 3 replicas as well is a recommendation, not mandatory.

@tpolekhin
Author

@miguelaeh we still use PVCs to keep the config, but we have disabled any type of Redis data persistence via the config:

redis:
  useAOFPersistence: "no"
  configmap: |-
    save ""
    cluster-require-full-coverage no
    cluster-allow-reads-when-down yes

Regarding running only 3 masters:

Note that the minimal cluster that works as expected requires to contain at least three master nodes.

I was able to successfully create and test this configuration with our load-testing tool, so it works as expected; even when I delete one of the master pods, it comes back and rejoins the cluster fine.
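For example, a check along these lines confirms the behavior (illustrative commands, not taken from this thread; the pod names assume the default release name):

# Delete one master and confirm the cluster recovers once the pod rejoins.
kubectl delete pod redis-cluster-1
kubectl exec redis-cluster-0 -- redis-cli cluster info | grep cluster_state   # expect cluster_state:ok again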

@miguelaeh
Contributor

Ok, I understand the issue.
In the container logic we maintain an associative array between node IDs and IPs, which should be updated with the new IPs every time a pod is restarted.
What do you mean exactly by:

The cluster-update Kubernetes job resolves a pod's DNS name to an IP address only once

?
Maybe we are missing a scenario in that logic.

@tpolekhin
Author

This issue is about the cluster-update Kubernetes job pod, which is used to add new Redis nodes to the existing cluster when you run a scale-up upgrade.
The issue I saw is that adding a node is not possible until all pods in the StatefulSet are running and ready, but because the upgrade operation changes an environment variable in the StatefulSet with the list of all nodes, all of the pods in the StatefulSet are re-created via the rolling upgrade strategy.
The upgrade job resolves the DNS name of the redis-cluster-0 pod into an IP address and tries to use that to run redis-cli commands against the cluster.
The problem happens when the redis-cluster-0 pod is rolled by the StatefulSet and gets a new IP address, but the upgrade job is still using the old redis-cluster-0 pod IP address and is basically stuck in a retry loop.
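In rough pseudo-shell terms, the failing pattern looks like this (an illustrative sketch, not the chart's actual update script; the headless-service hostname and the new_node_ip variable are assumptions):

# Illustrative sketch of the failing pattern, not the chart's actual update script.
# The hostname is resolved once, before the StatefulSet has finished rolling...
first_node_ip="$(getent hosts redis-cluster-0.redis-cluster-headless | awk '{print $1}')"

# ...so every retry keeps talking to the old, now-unreachable IP.
while ! redis-cli --cluster add-node "${new_node_ip}:6379" "${first_node_ip}:6379"; do
  echo "Add-node failed, retrying"
  sleep 5
done

Re-resolving the hostname inside the loop, or waiting for the rollout to finish before resolving at all, would avoid the stale IP.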

@miguelaeh
Contributor

miguelaeh commented Oct 29, 2020

Hi @tpolekhin ,
I understand your point. Nevertheless, we are running the job using a post-upgrade hook, which means it runs after the upgrade. The job waits until pod -0 has an IP before executing commands, so, since it runs after the upgrade, it should not be picking up the previous IP.
Am I missing something here?

@tpolekhin
Author

@miguelaeh not exactly. I'm seeing that the Helm post-upgrade hook starts as soon as the StatefulSet has been modified, not after all pods in the StatefulSet have been rolled. Since the StatefulSet rolls pods in reverse order, pod -0 is rolled last, and by that time the upgrade job has already resolved its IP address and keeps using it to perform operations.
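You can see this ordering by watching the pods during the upgrade (illustrative command; the label selector is an assumption and depends on the release name):

kubectl get pods -l app.kubernetes.io/name=redis-cluster -w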

@miguelaeh
Contributor

miguelaeh commented Oct 30, 2020

Hi @tpolekhin ,
I see. Then we need to figure out how to wait until pod -0 is upgraded, instead of waiting for the IP, to avoid this issue.
Do you have any suggestions?

@miguelaeh miguelaeh added the on-hold Issues or Pull Requests with this label will never be considered stale label Oct 30, 2020
@tpolekhin
Author

tpolekhin commented Nov 2, 2020

@miguelaeh I honestly do not understand why you need to do this DNS resolution in the first place; why not just use redis-cluster-0:6379?
If you want to wait until all pod operations are completed, you could use something like

kubectl rollout status sts/redis-cluster -w

This would watch the rolling upgrade of the StatefulSet and exit when it's done.
After that you can start resolving DNS names for the pods and execute operations.

@miguelaeh
Contributor

miguelaeh commented Nov 3, 2020

Hi @tpolekhin ,

why not just use redis-cluster-0:6379?

That is just the pod name, so it will not be resolved to an IP. In any case, we could use something like {{ template "common.names.fullname" . }}-"$new_node_index".{{ template "common.names.fullname" . }}-headless instead.
The problem with that approach is that Redis does not support the use of hostnames to join nodes: redis/redis#2186

So I guess we can add the command you shared before the DNS resolution, so the job waits until the StatefulSet is upgraded.
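Something along these lines, assuming kubectl is available in the job image and that the resource names below match the release (this is just a sketch of the idea, not necessarily how the final fix will be implemented):

# Sketch: wait for the rollout before resolving pod IPs in the update job.
# 1. Block until the StatefulSet rollout has finished, so pod -0 has its final IP.
kubectl rollout status sts/redis-cluster -w

# 2. Only then resolve the pod hostname and run cluster commands against that IP.
#    new_node_ip is a hypothetical variable standing in for the node being added.
first_node_ip="$(getent hosts redis-cluster-0.redis-cluster-headless | awk '{print $1}')"
redis-cli --cluster add-node "${new_node_ip}:6379" "${first_node_ip}:6379"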

tohutohu added a commit to tohutohu/charts that referenced this issue Nov 13, 2020
javsalgar added a commit that referenced this issue Dec 28, 2020
…eIp IP address again (#4362)

* fix #4064

* bump chart version

Co-authored-by: Javier J. Salmerón-García <[email protected]>
@carrodher carrodher removed the on-hold Issues or Pull Requests with this label will never be considered stale label Dec 14, 2021