
Scheduled config from helm chart config fails backup with ValidationError #8262

Open
darnone opened this issue Oct 3, 2024 · 9 comments


darnone commented Oct 3, 2024

What steps did you take and what happened:

Configuring schedules from the velero helm chart:

schedules:
  Football:
    disabled: false
    schedule: " 15 2 * * *"
    template:
      ttl: "240h"
      storageLocation: default
      volumeSnapshotLocations:
      - default
      includedNamespaces:
      - "*"
      includedResources:
      - "*"
      #excludedResources: {}
      # - events
      # - events.events.k8s.io
      # - pods
      # - replicasets.apps
      # - jobs
      labelSelector: {}
      snapshotVolumes: true
      hooks: {}

Creating a backup from this schedule gives an info message and the backup fails:

v create backup --from-schedule velero-foobar
INFO[0002] No Schedule.template.metadata.labels set - using Schedule.labels for backup object  backup=velero/velero-foobar-20241003215259 labels="map[app.kubernetes.io/instance:velero app.kubernetes.io/managed-by:Helm app.kubernetes.io/name:velero helm.sh/chart:velero-7.2.1]"
Creating backup from schedule, all other filters are ignored.
Backup request "velero-foobar-20241003215259" submitted successfully.
Run `velero backup describe velero-foobar-20241003215259` or `velero backup logs velero-foobar-20241003215259` for more details.

v get backup
NAME                                               STATUS             ERRORS   WARNINGS   CREATED   EXPIRES   STORAGE LOCATION   SELECTOR
velero-foobar-20241003215259   FailedValidation   0        0          <nil>     9d        default            <none>

But if I create the schedule manually and create a backup from that, I still get the message but the backup succeeds.

What did you expect to happen:
I would expect backups created from a helm chart schedule to run successfully, with no info message.

The following information will help us better understand what's going on:
We are using AWS EKS, Kubernetes v1.28, Velero v1.14.1, Velero chart version 7.2.1.

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle, and attach to this issue, more options please refer to velero debug --help

The backup name is different but it is the same result regardless
bundle-2024-10-03-18-08-09.tar.gz

Anything else you would like to add:

Environment:

  • Velero version (use velero version): 1.14.1
  • Velero features (use velero client config get features): Not set, but I am using features: EnableCSI
  • Kubernetes version (use kubectl version): 1.28
  • Kubernetes installer & version: AWS EKS
  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Amazon Linux

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"

darnone commented Oct 4, 2024

I could really use some help with this. I finally got describe on the failed backup.

Validation errors: an existing backup storage location wasn't specified at backup creation time and the default 'default' wasn't found. Please address this issue (see velero backup-location -h for options) and create a new backup. Error: a VolumeSnapshotLocation CRD for the location default with the name specified in the backup spec needs to be created before this snapshot can be executed. Error: VolumeSnapshotLocation.velero.io "default" not found.

I am creating a BackupStorageLocation in my helm chart:

configuration:
  features: EnableCSI
  uploaderType: kopia
  backupStorageLocation:
  - name: velero-backup-storage-location
    bucket: {{ .Values.velero_backups_bucket }}
    default: true
    provider: aws
    config:
      region: us-east-1
  volumeSnapshotLocation:
  - name: velero-volume-storage-location
    provider: aws
    config:
      region: us-east-1
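For reference, entries like these render into Velero CRs along the following lines (a sketch, not the chart's exact output, assuming the velero namespace and an illustrative bucket name):

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: velero-backup-storage-location
  namespace: velero
spec:
  provider: aws
  default: true
  objectStorage:
    bucket: my-velero-backups-bucket  # placeholder for .Values.velero_backups_bucket
  config:
    region: us-east-1
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: velero-volume-storage-location
  namespace: velero
spec:
  provider: aws
  config:
    region: us-east-1
```

Note that neither CR is named default.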

This does not explain why manual schedules work and helm chart schedules don't. Manual schedules don't specify storageLocation or volumeSnapshotLocations. I have tried everything, so I guess the question is: what should the schedule chart config look like?


kaovilai commented Oct 4, 2024

storageLocation: default in your helm should probably be velero-backup-storage-location, seen here.
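Concretely, assuming the location names from the helm values earlier in this thread, the schedule template would reference both locations by name — a sketch, with an illustrative schedule key:

```yaml
schedules:
  foobar:
    disabled: false
    schedule: "15 2 * * *"
    template:
      ttl: "240h"
      # must match metadata.name of an existing BackupStorageLocation
      storageLocation: velero-backup-storage-location
      # must match metadata.name of an existing VolumeSnapshotLocation
      volumeSnapshotLocations:
      - velero-volume-storage-location
      includedNamespaces:
      - "*"
      snapshotVolumes: true
```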

@darnone darnone changed the title Scheduled config from helm chart config fails v create backup --from shedule Scheduled config from helm chart config fails backup with ValidationError Oct 4, 2024

darnone commented Oct 4, 2024

I made the change you suggested, with these results:

v get backup
NAME                                                 STATUS             ERRORS   WARNINGS   CREATED   EXPIRES   STORAGE LOCATION            SELECTOR
velero-gts-tools-ci-dev-schedule-vs-20241004191322   FailedValidation   0        0          <nil>     9d        ebs_volume_snapshot_class   <none>
v describe backup:

Validation errors:  an existing backup storage location wasn't specified at backup creation time and the default 'ebs_volume_snapshot_class' wasn't found. Please address this issue (see `velero backup-location -h` for options) and create a new backup. Error: BackupStorageLocation.velero.io "ebs_volume_snapshot_class" not found
                    a VolumeSnapshotLocation CRD for the location default with the name specified in the backup spec needs to be created before this snapshot can be executed. Error: VolumeSnapshotLocation.velero.io "default" not found

but my volume snapshot class is not ebs_volume_snapshot_class, it is ebs-volume-snapshot-class so where is that being set?


kaovilai commented Oct 4, 2024

I don't think I know how to help without cluster access atm.


darnone commented Oct 4, 2024

I really do appreciate your response. I found a typo in the schedule definition.

- name: velero-backup-storage-location

That part of the FailedValidation is resolved. I removed volumeSnapshotLocations and it appears it does not need to be present, but it is still failing, now with some kind of timeout. I don't have these issues with manually created schedules. Here is the describe. Your reply is appreciated.

describe-backup.txt


darnone commented Oct 4, 2024

I removed the empty labelSelector: {} because I found this: #2083. It is now failing with failed to get VolumeSnapshotClass for StorageClasses:

ensure that the desired VolumeSnapshot class has the velero.io/csi-volumesnapshot-class label

but they look configured correctly. The volume snapshot class has the label and the storage class has the annotation.

apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Retain
driver: ebs.csi.aws.co
kind: VolumeSnapshotClass
metadata:
  annotations:
    meta.helm.sh/release-name: ebs-volume-snapshot-class
    meta.helm.sh/release-namespace: default
  labels:
    app.kubernetes.io/instance: ebs-volume-snapshot-class
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: velero
    helm.sh/chart: velero-0.1.0
    velero.io/csi-volumesnapshot-class: "true"
  name: ebs-volume-snapshot-class

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus-stack-sc
    meta.helm.sh/release-namespace: default
    storageclass.kubernetes.io/is-default-class: "true"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: kube-prometheus-stack-sc
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-prometheus-stack
    app.kubernetes.io/version: 1.16.0
    helm.sh/chart: kube-prometheus-stack-0.1.0
  name: kube-grafana-sc
provisioner: ebs.csi.aws.com
reclaimPolicy: Retain
volumeBindingMode: Immediate
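For comparison, a minimal pair that the CSI plugin can match up, assuming the standard ebs.csi.aws.com driver, would look like this (sketch, metadata trimmed):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-volume-snapshot-class
  labels:
    velero.io/csi-volumesnapshot-class: "true"
# the driver must exactly match the StorageClass provisioner below
driver: ebs.csi.aws.com
deletionPolicy: Retain
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kube-grafana-sc
provisioner: ebs.csi.aws.com
reclaimPolicy: Retain
volumeBindingMode: Immediate
```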

What am I doing wrong?
describe-backup.txt


darnone commented Oct 4, 2024

After working on this for over a week, I am beginning to think this implementation is broken, but I am waiting for someone to prove me wrong. I don't know what else to try, so I am grateful for any reply, or I am going to have to abandon this approach.


kaovilai commented Oct 4, 2024

Can you get backup/schedule yaml?


kaovilai commented Oct 4, 2024

Probably one of these errors:

logger.WithError(err).Error("Error listing items")
