
Scheduled config from helm chart config fails backup with ValidationError #8262

Open
darnone opened this issue Oct 3, 2024 · 9 comments


darnone commented Oct 3, 2024

What steps did you take and what happened:

Configuring schedules from the velero helm chart:

schedules:
  Football:
    disabled: false
    schedule: " 15 2 * * *"
    template:
      ttl: "240h"
      storageLocation: default
      volumeSnapshotLocations:
      - default
      includedNamespaces:
      - "*"
      includedResources:
      - "*"
      #excludedResources: {}
      # - events
      # - events.events.k8s.io
      # - pods
      # - replicasets.apps
      # - jobs
      labelSelector: {}
      snapshotVolumes: true
      hooks: {}

Creating a backup from this schedule gives an info message and the backup fails:

v create backup --from-schedule velero-foobar
INFO[0002] No Schedule.template.metadata.labels set - using Schedule.labels for backup object  backup=velero/velero-foobar-20241003215259 labels="map[app.kubernetes.io/instance:velero app.kubernetes.io/managed-by:Helm app.kubernetes.io/name:velero helm.sh/chart:velero-7.2.1]"
Creating backup from schedule, all other filters are ignored.
Backup request "velero-foobar-20241003215259" submitted successfully.
Run `velero backup describe velero-foobar-20241003215259` or `velero backup logs velero-foobar-20241003215259` for more details.

v get backup
NAME                                               STATUS             ERRORS   WARNINGS   CREATED   EXPIRES   STORAGE LOCATION   SELECTOR
velero-foobar-20241003215259   FailedValidation   0        0          <nil>     9d        default            <none>

But if I create the schedule manually and create a backup from that, I still get the message but the backup succeeds.

What did you expect to happen:
I would expect backups created from a helm chart schedule to run successfully, with no info message.

The following information will help us better understand what's going on:
We are using AWS EKS, Kubernetes v1.28, Velero v1.14.1, Velero chart version 7.2.1.

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle, and attach to this issue, more options please refer to velero debug --help

The backup name is different but it is the same result regardless
bundle-2024-10-03-18-08-09.tar.gz

Anything else you would like to add:

Environment:

  • Velero version (use velero version): 1.14.1
  • Velero features (use velero client config get features): Not set, but I am using features: EnableCSI
  • Kubernetes version (use kubectl version): 1.28
  • Kubernetes installer & version: AWS EKS
  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Amazon Linux

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"

darnone commented Oct 4, 2024

I could really use some help with this. I finally got describe on the failed backup.

Validation errors: an existing backup storage location wasn't specified at backup creation time and the default 'default' wasn't found. Please address this issue (see velero backup-location -h for options) and create a new backup. Error: a VolumeSnapshotLocation CRD for the location default with the name specified in the backup spec needs to be created before this snapshot can be executed. Error: VolumeSnapshotLocation.velero.io "default" not found.

I am creating a BackupStorageLocation in my helm chart:

configuration:
  features: EnableCSI
  uploaderType: kopia
  backupStorageLocation:
  - name: velero-backup-storage-location
    bucket: {{ .Values.velero_backups_bucket }}
    default: true
    provider: aws
    config:
      region: us-east-1
  volumeSnapshotLocation:
  - name: velero-volume-storage-location
    provider: aws
    config:
      region: us-east-1
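For reference, entries like these render into Velero CRs along the following lines (a sketch, not the chart's exact output, assuming the velero namespace and an illustrative bucket name):

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: velero-backup-storage-location
  namespace: velero
spec:
  provider: aws
  default: true
  objectStorage:
    bucket: my-velero-backups-bucket  # placeholder for .Values.velero_backups_bucket
  config:
    region: us-east-1
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: velero-volume-storage-location
  namespace: velero
spec:
  provider: aws
  config:
    region: us-east-1
```

Note that neither CR is named default.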

This does not explain why manual schedules work and helm chart schedules don't. Manual schedules don't specify storageLocation or volumeSnapshotLocations. I have tried everything, so I guess the question is: what should the schedule chart config look like?


kaovilai commented Oct 4, 2024

storageLocation: default in your helm should probably be velero-backup-storage-location, seen here.
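Concretely, assuming the location names from the helm values earlier in this thread, the schedule template would reference both locations by name — a sketch, with an illustrative schedule key:

```yaml
schedules:
  foobar:
    disabled: false
    schedule: "15 2 * * *"
    template:
      ttl: "240h"
      # must match metadata.name of an existing BackupStorageLocation
      storageLocation: velero-backup-storage-location
      # must match metadata.name of an existing VolumeSnapshotLocation
      volumeSnapshotLocations:
      - velero-volume-storage-location
      includedNamespaces:
      - "*"
      snapshotVolumes: true
```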

@darnone darnone changed the title Scheduled config from helm chart config fails v create backup --from shedule Scheduled config from helm chart config fails backup with ValidationError Oct 4, 2024

darnone commented Oct 4, 2024

I made the change you suggested, with these results:

v get backup
NAME                                                 STATUS             ERRORS   WARNINGS   CREATED   EXPIRES   STORAGE LOCATION            SELECTOR
velero-gts-tools-ci-dev-schedule-vs-20241004191322   FailedValidation   0        0          <nil>     9d        ebs_volume_snapshot_class   <none>
v describe backup:

Validation errors:  an existing backup storage location wasn't specified at backup creation time and the default 'ebs_volume_snapshot_class' wasn't found. Please address this issue (see `velero backup-location -h` for options) and create a new backup. Error: BackupStorageLocation.velero.io "ebs_volume_snapshot_class" not found
                    a VolumeSnapshotLocation CRD for the location default with the name specified in the backup spec needs to be created before this snapshot can be executed. Error: VolumeSnapshotLocation.velero.io "default" not found

but my volume snapshot class is not ebs_volume_snapshot_class, it is ebs-volume-snapshot-class so where is that being set?


kaovilai commented Oct 4, 2024

I don't think I know how to help without cluster access atm.


darnone commented Oct 4, 2024

I really do appreciate your response. I found a typo in the schedule definition.

- name: velero-backup-storage-location

That part of the FailedValidation is resolved. I removed volumeSnapshotLocations and it appears it does not need to be present, but it is still failing, now with some kind of timeout. I don't have these issues with manually created schedules. Here is the describe. Your reply is appreciated.

describe-backup.txt


darnone commented Oct 4, 2024

I removed the empty labelSelector: {} because I found this: #2083. It is now failing with failed to get VolumeSnapshotClass for StorageClasses:

ensure that the desired VolumeSnapshot class has the velero.io/csi-volumesnapshot-class label

but they look configured correctly. The volume snapshot class has the label and the storage class has the annotation.

apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Retain
driver: ebs.csi.aws.co
kind: VolumeSnapshotClass
metadata:
  annotations:
    meta.helm.sh/release-name: ebs-volume-snapshot-class
    meta.helm.sh/release-namespace: default
  labels:
    app.kubernetes.io/instance: ebs-volume-snapshot-class
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: velero
    helm.sh/chart: velero-0.1.0
    velero.io/csi-volumesnapshot-class: "true"
  name: ebs-volume-snapshot-class

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus-stack-sc
    meta.helm.sh/release-namespace: default
    storageclass.kubernetes.io/is-default-class: "true"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: kube-prometheus-stack-sc
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-prometheus-stack
    app.kubernetes.io/version: 1.16.0
    helm.sh/chart: kube-prometheus-stack-0.1.0
  name: kube-grafana-sc
provisioner: ebs.csi.aws.com
reclaimPolicy: Retain
volumeBindingMode: Immediate
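For comparison, a minimal pair that the CSI plugin can match up, assuming the standard ebs.csi.aws.com driver, would look like this (sketch, metadata trimmed):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-volume-snapshot-class
  labels:
    velero.io/csi-volumesnapshot-class: "true"
# the driver must exactly match the StorageClass provisioner below
driver: ebs.csi.aws.com
deletionPolicy: Retain
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kube-grafana-sc
provisioner: ebs.csi.aws.com
reclaimPolicy: Retain
volumeBindingMode: Immediate
```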

What am I doing wrong?
describe-backup.txt


darnone commented Oct 4, 2024

After working on this for over a week, I am beginning to think this implementation is broken, but I am waiting for someone to prove me wrong. I don't know what else to try, so I am grateful for any reply, or I am going to have to abandon this approach.


kaovilai commented Oct 4, 2024

Can you get backup/schedule yaml?


kaovilai commented Oct 4, 2024

Probably one of these errors:

logger.WithError(err).Error("Error listing items")
