Interval-based Volume Snapshots and Expiry on Kubernetes

What you do: Create a custom SnapshotRule resource which defines your desired snapshot intervals. What I do: Create snapshots of your volumes, and expire old ones using a Grandfather-father-son backup scheme.

Supported Environments:

  • Google Compute Engine disks.
  • AWS EBS disks.
  • Digital Ocean.

Want to help add support for other backends? It's pretty straightforward. Have a look at the API that backends need to implement.


A persistent volume claim:

cat <<EOF | kubectl apply -f -
apiVersion: ""
kind: SnapshotRule
metadata:
  name: postgres
spec:
  deltas: P1D P30D
  persistentVolumeClaim: postgres-data
EOF

A specific AWS EC2 volume:

cat <<EOF | kubectl apply -f -
apiVersion: ""
kind: SnapshotRule
metadata:
  name: mysql
spec:
  deltas: P1D P30D
  backend: aws
  disk:
     region: eu-west-1
     volumeId: vol-0aa6f44aad0daf9f2
EOF

You can also use an annotation instead of the CRDs:

kubectl patch pv pvc-01f74065-8fe9-11e6-abdd-42010af00148 -p \
  '{"metadata": {"annotations": {"": "P1D P30D P360D"}}}'


How to enable backups

To back up a volume, you can create a SnapshotRule custom resource. See the SnapshotRule resources section further down for details.

Alternatively, you can add an annotation with the name to either your PersistentVolume or PersistentVolumeClaim resources.

Since PersistentVolumes are often created automatically for you by Kubernetes, you may want to annotate the volume claim in your resource definition file. Alternatively, you can run kubectl edit pv on a PersistentVolume created by Kubernetes and add the annotation there.

The value of the annotation is a set of deltas that defines how often a snapshot is created and how many snapshots are kept. See the section below for more information on how deltas work.

In the end, your annotation may look like this: PT1H P2D P30D P180D

There is also the option of manually specifying the volume names to be backed up as options to the k8s-snapshots daemon. See below for more information.

How the deltas work

The expiry logic of tarsnapper is used.

The generations are defined by a list of deltas formatted as ISO 8601 durations (this differs from tarsnapper). PT60S or PT1M means a minute, PT12H or P0.5D is half a day, P1W or P7D is a week. The number of backups in each generation is implied by its delta and the parent generation's delta.

For example, given the deltas PT1H P1D P7D, the first generation will consist of 24 backups each one hour older than the previous (or the closest approximation possible given the available backups), the second generation of 7 backups each one day older than the previous, and backups older than 7 days will be discarded for good.
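The arithmetic in this example can be checked with a short Python sketch. It is illustrative only and is not the code that tarsnapper or k8s-snapshots uses: `parse_duration` and `generation_sizes` are hypothetical helpers, and the parser only handles weeks, days, hours, minutes and seconds.

```python
import re
from datetime import timedelta

# Minimal ISO 8601 duration pattern: weeks, days, hours, minutes, seconds.
# Years and months are left out on purpose because their length varies.
_NUM = r"\d+(?:\.\d+)?"
_DURATION = re.compile(
    rf"^P(?:(?P<weeks>{_NUM})W)?(?:(?P<days>{_NUM})D)?"
    rf"(?:T(?:(?P<hours>{_NUM})H)?(?:(?P<minutes>{_NUM})M)?"
    rf"(?:(?P<seconds>{_NUM})S)?)?$"
)


def parse_duration(text):
    """Convert a duration such as 'PT1H' or 'P0.5D' into a timedelta."""
    match = _DURATION.match(text)
    parts = {}
    if match:
        parts = {k: float(v) for k, v in match.groupdict().items() if v}
    if not parts:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**parts)


def generation_sizes(deltas_text):
    """Return how many backups each generation holds.

    A generation's size is its parent's delta divided by its own delta.
    The last delta has no parent; it only marks the cut-off age.
    """
    deltas = [parse_duration(d) for d in deltas_text.split()]
    return [parent // delta for delta, parent in zip(deltas, deltas[1:])]


if __name__ == "__main__":
    print(generation_sizes("PT1H P1D P7D"))
```

For the deltas PT1H P1D P7D this yields 24 and 7, matching the hourly and daily generations described above.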

If the daemon is not running for a while, it will still try to approximate your desired snapshot scheme as closely as possible.

The most recent backup is always kept.

The first delta is the backup interval.
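As a rough illustration of what "the first delta is the backup interval" means for scheduling, here is a minimal Python sketch. `next_snapshot_due` is a hypothetical helper, not a function of k8s-snapshots, and the rule that an overdue snapshot becomes due immediately is an assumption about how a daemon would catch up after downtime.

```python
from datetime import datetime, timedelta, timezone


def next_snapshot_due(last_snapshot, interval, now):
    """Return the time at which the next snapshot should be taken.

    last_snapshot: datetime of the newest existing snapshot, or None.
    interval: the first delta as a timedelta (PT1H is timedelta(hours=1)).
    now: the current time.
    """
    if last_snapshot is None:
        # Nothing exists yet, so a snapshot is due immediately.
        return now
    # Never schedule in the past: an overdue snapshot is due right away.
    return max(last_snapshot + interval, now)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    last = now - timedelta(minutes=20)
    print(next_snapshot_due(last, timedelta(hours=1), now))
```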


k8s-snapshots needs access to your Kubernetes cluster resources (to read the desired snapshot configuration) and access to your cloud infrastructure (to make snapshots).

Depending on your environment, it may be able to configure itself, or you might need to provide some configuration options.

Use the example deployment file given below to start off.

cat <<EOF | kubectl create -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-snapshots
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: k8s-snapshots
  template:
    metadata:
      labels:
        app: k8s-snapshots
    spec:
      containers:
      - name: k8s-snapshots
        image: elsdoerfer/k8s-snapshots:latest
EOF

1. Based on your cluster.

See the docs/ folder for platform-specific instructions.

2. For Role-based Access Control (RBAC) enabled clusters

In Kubernetes clusters with RBAC enabled, the k8s-snapshots pods need permission to watch and list persistentvolumes and persistentvolumeclaims. We provide a manifest, rbac.yaml, that sets up a ServiceAccount with a minimal set of permissions.

kubectl apply -f manifests/rbac.yaml

Furthermore, under GKE, "Because of the way Container Engine checks permissions when you create a Role or ClusterRole, you must first create a RoleBinding that grants you all of the permissions included in the role you want to create."

If the above kubectl apply command produces an error about "attempt to grant extra privileges", the following will grant your user the necessary privileges first, so that you can then bind them to the service account:

  kubectl create clusterrolebinding your-user-cluster-admin-binding --clusterrole=cluster-admin --user=<your-account-email>

Finally, adjust the deployment by adding serviceAccountName: k8s-snapshots to the spec (else you'll end up using the "default" service account), as follows:

    spec:
      serviceAccountName: k8s-snapshots
      containers:
      - name: k8s-snapshots
        image: elsdoerfer/k8s-snapshots:v2.0

Further Configuration Options

Pinging a third party service

PING_URL: We'll send a GET request to this URL whenever a backup completes. This is useful for integrating with monitoring services like Cronitor or Dead Man's Snitch.
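To make the behaviour concrete, here is a minimal Python sketch of such a ping using only the standard library. It is not the k8s-snapshots implementation: the `ping` function, its `opener` parameter (there to allow testing without network access) and the 10 second timeout are all assumptions made for illustration.

```python
import os
import urllib.request


def ping(url, opener=urllib.request.urlopen, timeout=10):
    """Send a GET request to url and report whether it succeeded.

    Returns False when no URL is configured or the request fails, so a
    broken monitoring endpoint never interrupts the backup itself.
    """
    if not url:
        return False
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # urllib.error.URLError and HTTPError are subclasses of OSError.
        return False


if __name__ == "__main__":
    # Call this after a backup completes; without PING_URL it is a no-op.
    print(ping(os.environ.get("PING_URL")))
```

Swallowing the error is a design choice in this sketch: the ping is a side channel for monitoring, so its failure is reported through the return value rather than an exception.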

Make snapshot names more readable

If your persistent volumes are auto-provisioned by Kubernetes, then you'll end up with snapshot names such as pv-pvc-01f74065-8fe9-11e6-abdd-42010af00148. If you want something more readable, set the environment variable USE_CLAIM_NAME=true. Instead of the auto-generated name of the persistent volume, k8s-snapshots will then use the name that you gave to your PersistentVolumeClaim.

SnapshotRule resources

It's possible to ask k8s-snapshots to create snapshots of volumes for which no PersistentVolume object exists within the Kubernetes cluster. For example, you might have a volume at your Cloud provider that you use within Kubernetes by referencing it directly.

To do this, we use a custom Kubernetes resource, SnapshotRule.

First, you need to create this custom resource.

On Kubernetes 1.7 and higher:

cat <<EOF | kubectl create -f -
kind: CustomResourceDefinition
spec:
  version: v1
  scope: Namespaced
  names:
    plural: snapshotrules
    singular: snapshotrule
    kind: SnapshotRule
    shortNames:
    - sr
EOF

Or on Kubernetes 1.6 and lower:

cat <<EOF | kubectl create -f -
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
description: "Defines snapshot management rules for a disk."
versions:
- name: v1
EOF

You can then create SnapshotRule resources:

cat <<EOF | kubectl apply -f -
apiVersion: ""
kind: SnapshotRule
metadata:
  name: mysql
spec:
  deltas: P1D P30D
  backend: aws
  disk:
     region: eu-west-1
     volumeId: vol-0aa6f44aad0daf9f2
EOF

This is an example for backing up an EBS disk on the Amazon cloud. The disk option requires different keys, depending on the backend. See the examples folder.

You may also point SnapshotRule resources to PersistentVolumes (or PersistentVolumeClaims). This is intended as an alternative to adding an annotation; it may be desirable for some to separate the snapshot functionality from the resource.

cat <<EOF | kubectl apply -f -
apiVersion: ""
kind: SnapshotRule
metadata:
  name: mysql
spec:
  deltas: P1D P30D
  persistentVolumeClaim: datadir-mysql
EOF

Backing up the etcd volumes of a kops cluster

After setting up the custom resource definitions (see previous section), use snapshot rules as defined in the examples/backup-kops-etcd.yml file. Reference the volume ids of your etcd volumes.

Other environment variables

LOG_LEVEL: Default: INFO. Possible values: DEBUG, INFO, WARNING, ERROR.
JSON_LOG: Default: False. Output the log messages as JSON objects for easier processing.
TZ: Default: UTC. Used to change the timezone, e.g. TZ=America/Montreal.
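The following Python sketch shows one way a process could read these three variables with the defaults listed above. It is not taken from k8s-snapshots: `load_settings` is a hypothetical helper, and the set of spellings accepted as true for JSON_LOG is an assumption.

```python
import os

VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}
# Assumption: which spellings count as "true" is not documented here.
TRUTHY = {"1", "true", "yes"}


def load_settings(env=os.environ):
    """Read LOG_LEVEL, JSON_LOG and TZ from env, applying the defaults."""
    level = env.get("LOG_LEVEL", "INFO").upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {sorted(VALID_LEVELS)}")
    json_log = env.get("JSON_LOG", "False").strip().lower() in TRUTHY
    timezone_name = env.get("TZ", "UTC")
    return {"log_level": level, "json_log": json_log, "tz": timezone_name}


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.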


What if I manually create snapshots for the same volumes that k8s-snapshots manages?

Starting with v0.3, k8s-snapshots ignores snapshots that it did not label itself, both when it decides when to create the next snapshot and when it decides which snapshots to delete.