Note about pip

If you have issues with pip not using SSL, you can edit /usr/lib/python2.7/dist-packages/pip/commands/install.py and update the following line to use https rather than plain http: default='http://pypi.python.org/simple/'

Or in one command:

sed -i 's|http://pypi.python.org|https://pypi.python.org|' /usr/lib/python2.7/dist-packages/pip/commands/install.py
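
You can then verify that the index URL was updated, for example:

grep 'pypi.python.org' /usr/lib/python2.7/dist-packages/pip/commands/install.py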

Creating backups

Backups are created mostly automatically through Puppet and the duplicity module.
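
For reference, the backup side is essentially a duplicity run from each node into the same bucket layout used by the restores below. A minimal sketch, assuming the same bucket, prefix and encryption key as the restore commands (the exact schedule and options live in the Puppet duplicity module):

duplicity --s3-use-new-style --encrypt-key=${encryptkey} /data/cassandra/data s3+http://oae-cassandra-backup/$(hostname)/cassandra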

Restoring Cassandra backups

The following outlines how production database backups can be restored on the staging nodes.

What you will need:

  • The encryption key that was used to encrypt the backups. All key IDs in this doc are examples.
  • An AWS access/secret key that has READ/LIST privilege on the Amazon S3 bucket where the backups are stored.

On each Cassandra node:

  1. Install required restore tools, which are not included by default in staging.
# apt-get install python-pip duplicity
# pip install boto
  2. Import the private key.
# gpg --import key-for-backups.pvt
gpg: keyring `/root/.gnupg/secring.gpg' created
gpg: key 281CF39B: public key "Backup Key <[email protected]>" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
  3. Verify that you have all the correct credentials.
# gpg --list-keys
/root/.gnupg/pubring.gpg
------------------------
pub   2048R/281CF39B 2013-06-22
uid                  Backup Key <[email protected]>
sub   2048R/D2E4C1E2 2013-06-22

# export AWS_ACCESS_KEY_ID=<aws key id>
# export AWS_SECRET_ACCESS_KEY=<aws secret access key>
# export encryptkey=<backup private key>
# duplicity --s3-use-new-style --encrypt-key=${encryptkey} list-current-files s3+http://oae-cassandra-backup/db0/cassandra
Import of duplicity.backends.sshbackend Failed: No module named paramiko
Import of duplicity.backends.giobackend Failed: No module named gio
Synchronizing remote metadata to local cache...
GnuPG passphrase: 
Copying duplicity-full-signatures.20140328T040001Z.sigtar.gpg to local cache.
Copying duplicity-full-signatures.20140428T040002Z.sigtar.gpg to local cache.
Copying duplicity-full.20140125T040002Z.manifest.gpg to local cache.
Copying duplicity-full.20140225T040002Z.manifest.gpg to local cache.
...
Last full backup date: Mon Apr 28 04:00:02 2014
Sat Nov 23 12:09:25 2013 .
Thu Jul 11 16:52:31 2013 OpsCenter
Thu May  8 19:46:04 2014 OpsCenter/events
Sat Mar 15 05:58:00 2014 OpsCenter/events/OpsCenter-events-ic-4654-CompressionInfo.db
Sat Mar 15 05:58:00 2014 OpsCenter/events/OpsCenter-events-ic-4654-Data.db
Sat Mar 15 05:58:00 2014 OpsCenter/events/OpsCenter-events-ic-4654-Filter.db
Sat Mar 15 05:58:00 2014 OpsCenter/events/OpsCenter-events-ic-4654-Index.db
...

If you get to this point, it means we're fully set up to pull down the data from S3.

  4. Stop Cassandra and the Puppet agent on all the db nodes; otherwise the Puppet agent will restart Cassandra.
# service puppet stop
# service dse stop
  5. Blow away all the data, commitlogs and saved caches:
# rm -rf /data/cassandra/data/* /var/lib/cassandra/*
  6. Restore the files, e.g. on db0:
# export AWS_ACCESS_KEY_ID=<aws key id>
# export AWS_SECRET_ACCESS_KEY=<aws secret access key>
# export encryptkey=<backup private key>
# duplicity --s3-use-new-style --encrypt-key=${encryptkey} restore s3+http://oae-cassandra-backup/db0/cassandra /data/cassandra/data

Note: To interact with the restored staging environment from a browser, you'll likely need to adjust nginx's config.
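
A minimal sketch of that adjustment, assuming a Debian-style nginx layout (the actual files and server_name values depend on your setup): find the relevant server_name entries, edit them to match the restored data, then test and reload nginx.

# grep -r server_name /etc/nginx/
# nginx -t && service nginx reload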

Quick list of commands to restore Cassandra:

apt-get install python-pip duplicity
pip install boto
gpg --import backup_secret_key
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export AWS_DEFAULT_REGION=
export encryptkey=
export dbhn=$(hostname)
duplicity --s3-use-new-style --encrypt-key=${encryptkey} list-current-files s3+http://oae-cassandra-backup/${dbhn}/cassandra
service puppet stop
service dse stop
rm -rf /data/cassandra/data/* /var/lib/cassandra/*
duplicity --s3-use-new-style --encrypt-key=${encryptkey} restore s3+http://oae-cassandra-backup/${dbhn}/cassandra /data/cassandra/data
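
Once the restore has finished you will want to start the services that were stopped above; a sketch, assuming the same service names as used earlier:

service dse start
service puppet start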

Restoring user files backups

The following outlines how production user files backups can be restored on the staging nodes. This is a time-consuming process, so I suggest running it in a screen session with a large scrollback buffer.
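
For example, to start a named screen session with a larger scrollback buffer (the session name is just an illustration):

screen -h 10000 -S userfiles-restore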

What you will need:

  • The encryption key that was used to encrypt the backups. All key IDs in this doc are examples.
  • An AWS access/secret key that has READ/LIST privilege on the Amazon S3 bucket where the backups are stored.
  1. Pick a server that has access to /shared and install the required restore tools, which are not included by default in staging. Any app server will do.
# apt-get install python-pip duplicity
# pip install boto
  2. Import the private key.
# gpg --import key-for-backups.pvt
gpg: keyring `/root/.gnupg/secring.gpg' created
gpg: key 281CF39B: public key "Backup Key <[email protected]>" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
  3. Verify that you have all the correct credentials.
# gpg --list-keys
/root/.gnupg/pubring.gpg
------------------------
pub   2048R/281CF39B 2013-06-22
uid                  Backup Key <[email protected]>
sub   2048R/D2E4C1E2 2013-06-22

# export AWS_ACCESS_KEY_ID=<aws key id>
# export AWS_SECRET_ACCESS_KEY=<aws secret access key>
# export encryptkey=<backup private key>
# duplicity --s3-use-new-style --encrypt-key=${encryptkey} list-current-files s3+http://userfiles-backup/userfiles

At this point it should list the files in the backup; if so, you're good to restore.

  4. You may want or need to stop access to the environment while the restore runs.

  5. Blow away any old data:

# rm -r /shared/{files,assets,restore}
  6. Restore the files. The *Unity instance has /shared as a mount point and duplicity will refuse to restore to that folder directly. These instructions mv the data to the correct location afterwards; a few symlinks would also work fine if you want the data available while the restore runs (see the sketch after this step). Note that --tempdir needs to be used unless you have a rather large /tmp.
# export AWS_ACCESS_KEY_ID=<aws key id>
# export AWS_SECRET_ACCESS_KEY=<aws secret access key>
# export encryptkey=<backup private key>
# duplicity --tempdir /data --s3-use-new-style --encrypt-key=${encryptkey} restore s3+http://userfiles-backup/userfiles /shared/restore
# mv /shared/restore/{assets,files} /shared/
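
A sketch of the symlink alternative mentioned in step 6, assuming /shared/files and /shared/assets were removed in the cleanup step: point them at the restore location while duplicity runs.

# ln -s /shared/restore/files /shared/files
# ln -s /shared/restore/assets /shared/assets

Once the restore completes, remove the symlinks before running the mv from the previous step:

# rm /shared/files /shared/assets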

Quick list of commands to restore user files:

# Note you likely should run this in a screen session with a large scrollback buffer.
apt-get install python-pip duplicity
pip install boto
gpg --import backup_secret_key
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export AWS_DEFAULT_REGION=
export encryptkey=
rm -r /shared/{files,assets,restore}
duplicity --tempdir /data --s3-use-new-style --encrypt-key=${encryptkey} restore s3+http://userfiles-backup/userfiles /shared/restore
mv /shared/restore/{files,assets} /shared/