etcd snapshot cleanup fails if node name changes #3714
Comments
I'll talk this over with the team. On the S3 side, the correct behavior is probably to retain
Still needs to be worked on.
It not only fails to upload new backups, but also fills up the masters' disk space with local snapshots that are not cleaned up once the ConfigMap grows too large and fails to apply. This leads to an incident, as it puts the master nodes under disk pressure.
@riuvshyn we are working on this separately from the snapshot list configmap issue. This issue will serve only to track the problem of snapshot cleanup handling only snapshots whose name contains the current node's hostname.
/backport v1.26.8+rke2r1
/backport v1.25.13+rke2r1
/backport v1.24.17+rke2r1
Validated on master branch with commit c3ec545
Environment Details
Infrastructure
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Config.yaml: Main ETCD SERVER (+CONTROL PLANE) CONFIG:
Sample Secondary Etcd, control plane config.yaml:
AGENT CONFIG:
Additional files
Testing Steps
Note: First round node-names:
Using Version:
4a. Also check the s3 bucket/folder in aws to see the snapshots listed.
7a. Also check the s3 bucket/folder in aws to see the snapshots listed.
Replication Results:
SETUP:
Node-names in order of update for the main etcd server:
Final output of snapshot list - after multiple node name changes:
As we can see above, previous snapshots with different node-names are still listed and not cleaned up.
Validation Results:
Node names in order of update for the main etcd server:
After updating node-names 2 times, the snapshots listed are:
As we can see, the previous snapshots with old node-names are no longer retained and get cleaned up.
Environmental Info:
RKE2 Version:
rke2 version v1.21.14+rke2r1 (514ae51)
go version go1.16.14b7
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
We have multiple rke2 clusters, but all of them have at least 3 control plane nodes and multiple workers
Describe the bug:
We have multiple rke2 clusters and all of them have automatic etcd snapshots enabled (taken every 5 hours). We also configured s3 uploading of those snapshots. Recently, we found that no s3 snapshots were being uploaded anymore. We investigated the issue and found the following rke2-server output:
I checked the code and found that rke2 is leveraging the etcd snapshot capabilities from k3s for this. A function is executed periodically on all control plane nodes. The function takes local snapshots, uploads them to s3 (if configured) and also reconciles a configmap which contains all snapshots and metadata about them. Looking at the code, it seems that the reconciliation of that "sync" configmap is based on the name of the node which executes the etcd snapshot. The same goes for the s3 retention functions (only old objects which contain the node name will be cleaned up). As we are replacing all our nodes in the clusters whenever there is a new flatcar version, the node names change quite often. This leads to orphaned entries in the config map and also orphaned objects in the s3 buckets (although the latter could be worked around with a lifecycle policy).
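To illustrate the behavior, here is a minimal Go sketch (not the actual k3s/rke2 retention code; the naming scheme `etcd-snapshot-<node>-<timestamp>` and the `pruneForNode` helper are assumptions made for the example) showing how filtering retention candidates by the current hostname means snapshots taken under an old node name are never selected for deletion:

```go
// Hypothetical illustration of node-name based retention; not the actual
// k3s/rke2 code. Snapshot names are assumed to follow the pattern
// "etcd-snapshot-<node>-<timestamp>".
package main

import (
	"fmt"
	"sort"
	"strings"
)

// pruneForNode keeps at most `retention` snapshots whose name embeds the
// current nodeName and returns the ones that would be deleted. Snapshots
// taken under an old node name never match the prefix, so they are never
// selected for deletion and accumulate indefinitely.
func pruneForNode(snapshots []string, nodeName string, retention int) []string {
	prefix := fmt.Sprintf("etcd-snapshot-%s-", nodeName)
	var mine []string
	for _, s := range snapshots {
		if strings.HasPrefix(s, prefix) {
			mine = append(mine, s)
		}
	}
	sort.Strings(mine) // unix timestamps sort lexically within one node name
	if len(mine) <= retention {
		return nil
	}
	return mine[:len(mine)-retention]
}

func main() {
	snapshots := []string{
		"etcd-snapshot-old-master-1-1650000000",
		"etcd-snapshot-old-master-1-1650018000",
		"etcd-snapshot-new-master-1-1650036000",
		"etcd-snapshot-new-master-1-1650054000",
		"etcd-snapshot-new-master-1-1650072000",
	}
	// Only snapshots matching the current node name are pruned; the two
	// "old-master-1" snapshots are skipped and stay behind forever.
	fmt.Println(pruneForNode(snapshots, "new-master-1", 2))
}
```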
Are there any ideas what could be done to fix this?
I found this bug report in the rancher repo which describes the configmap growing too large.
Steps To Reproduce:
Enable etcd snapshots and s3 uploading. After replacing the control plane nodes with new machines (new names), there will be orphaned entries in the 'rke2-etcd-snapshots' configmap. Once the configmap grows too large, no new snapshots will be uploaded to s3 anymore. A sample configuration is sketched below.
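For reference, scheduled snapshots and S3 uploads can be enabled with something like the following in `/etc/rancher/rke2/config.yaml` (bucket, region, folder, credentials, and schedule are placeholder values, not the configuration from the affected clusters):

```yaml
# Example values only -- adjust bucket, region, credentials and schedule.
etcd-snapshot-schedule-cron: "0 */5 * * *"   # snapshot every 5 hours
etcd-snapshot-retention: 5
etcd-s3: true
etcd-s3-bucket: my-etcd-snapshots
etcd-s3-region: eu-central-1
etcd-s3-folder: cluster-a
etcd-s3-access-key: <access-key>
etcd-s3-secret-key: <secret-key>
```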
Expected behavior:
The sync configmap should only contain the snapshots of the cluster's current nodes; entries for all other nodes should be removed.