Migration into AppScale, Backup & Recovery

Chris Donati edited this page Dec 15, 2017 · 7 revisions

This document describes how to use the bulkloader with AppScale to download data from Google App Engine and move it into AppScale, and vice versa. This feature has been supported since AppScale 1.6.5. To download and upload data in GAE see here.

This page also explains how to use AppScale's native Backup & Recovery feature, supported since AppScale 2.2.0.

Using the bulkloader

Application requirements

Make sure that you have the remote_api builtins enabled in your app.yaml. Here is an example for the guestbook sample application:

application: guestbook
version: 1
runtime: python
api_version: 1
 
builtins:
- remote_api: on
 
handlers:
- url: /.*
  script: guestbook.py

If you are running a Java application, update your web.xml to contain the following inside the <web-app> element:

    <servlet>
        <display-name>Remote API Servlet</display-name>
        <servlet-name>RemoteApiServlet</servlet-name>
        <servlet-class>com.google.apphosting.utils.remoteapi.RemoteApiServlet</servlet-class>
        <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet-mapping>
        <servlet-name>RemoteApiServlet</servlet-name>
        <url-pattern>/_ah/remote_api</url-pattern>
    </servlet-mapping>

Download Data

To download entities of a specific kind from AppScale, run the following command (either on a VM that has AppScale installed, or from a checkout of the code on your local machine):

$> $APPSCALE_HOME/AppServer/appcfg.py download_data \
                  --filename <output_filename> \
                  --url=http://<ip>:<port>/_ah/remote_api \
                  --application <app_id> \
                  --kind <kind> \
                  --auth_domain appscale \
                  --batch_size <max_batch_size>

Flags explained

filename: The name of the file to which the downloaded data will be written. If the file already exists, the download will fail.

url: The remote_api path of your application. Go to your application first to find out what IP/port you're using.

application: Your application identifier.

kind: The datastore model kind you want to download.

auth_domain: Tells the downloader you're using AppScale authentication.

batch_size: The number of entities downloaded per request. The smaller your entities are, the more you can download per batch. If your entities are very large (~1MB), download only 1 per batch.
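
The batch_size guidance above can be turned into a rough rule of thumb. The following is only a sketch: the 1 MB per-request budget is an assumption inferred from the advice that ~1MB entities should be fetched one at a time, and suggest_batch_size and its max_batch cap are hypothetical helpers, not part of appcfg.py.

```python
# Assumption: about 1 MB of entity data per request, inferred from the
# advice that ~1 MB entities should be downloaded one per batch.
REQUEST_BUDGET_BYTES = 1024 * 1024


def suggest_batch_size(avg_entity_bytes, max_batch=100):
    """Suggest a --batch_size value from the average entity size in bytes.

    max_batch is an arbitrary illustrative cap, not an AppScale limit.
    """
    if avg_entity_bytes <= 0:
        raise ValueError("avg_entity_bytes must be positive")
    # Never go below 1 entity per batch, even for entities over the budget.
    return max(1, min(max_batch, REQUEST_BUDGET_BYTES // avg_entity_bytes))
```

For example, entities averaging about 100 KB would get a suggested batch size of 10. Treat the result as a starting point and lower it if requests fail.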

Example

For the guestbook sample application:

$> $APPSCALE_HOME/AppServer/appcfg.py download_data \
                  --filename guestbook_Greeting.dat \
                  --url=http://192.168.100.1:8080/_ah/remote_api \
                  --application guestbook \
                  --kind Greeting \
                  --auth_domain appscale \
                  --batch_size 10

Here we are downloading the Greeting kind from our guestbook application. The url points to the remote_api path. In this example guestbook was the first application that was uploaded to AppScale so it's on port 8080 (subsequent apps increment this port by one). We are downloading in batches of 10 entities.
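
If you script downloads, the port rule and the "file must not already exist" rule can be checked before appcfg.py is invoked. This is a minimal sketch: remote_api_url and download_command are hypothetical helper names, and it assumes ports are assigned strictly in upload order starting at 8080, as described above. It only builds the argument list and does not run anything.

```python
import os


def remote_api_url(ip, app_index=0, base_port=8080):
    """Return the remote_api URL for the Nth uploaded app (0-based index).

    Assumes the first app is on port 8080 and each later app is one higher.
    """
    if app_index < 0:
        raise ValueError("app_index must be non-negative")
    return "http://{}:{}/_ah/remote_api".format(ip, base_port + app_index)


def download_command(appscale_home, filename, url, app_id, kind, batch_size):
    """Build the appcfg.py download_data argument list for an AppScale app."""
    # The download fails if the output file already exists, so fail early.
    if os.path.exists(filename):
        raise FileExistsError("output file already exists: " + filename)
    return [
        os.path.join(appscale_home, "AppServer", "appcfg.py"),
        "download_data",
        "--filename", filename,
        "--url=" + url,
        "--application", app_id,
        "--kind", kind,
        "--auth_domain", "appscale",
        "--batch_size", str(batch_size),
    ]
```

The returned list can be passed to subprocess.call, using whichever Python interpreter your AppScale checkout expects for appcfg.py.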

Download from GAE

To download entities from Google App Engine just omit the --auth_domain parameter like so:

$> $APPSCALE_HOME/AppServer/appcfg.py download_data \
                  --filename <output_filename> \
                  --url=<app_URL>/_ah/remote_api \
                  --application <app_id> \
                  --kind <kind> \
                  --batch_size <max_batch_size>

If your app uses the High Replication Datastore, prefix your app ID with 's~' (for example, s~guestbook).
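
In a script, the prefix rule can be applied as follows. This is a small sketch; gae_app_id is a hypothetical helper name, and whether an app uses High Replication is something you must know and pass in yourself.

```python
def gae_app_id(app_id, high_replication=False):
    """Return the value to pass to --application when downloading from GAE."""
    prefix = "s~"
    # Add the prefix only for High Replication apps, and only once.
    if high_replication and not app_id.startswith(prefix):
        return prefix + app_id
    return app_id
```

With high_replication=True, "guestbook" becomes "s~guestbook"; an ID that already carries the prefix is returned unchanged.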

Generating Kind Statistics

As of AppScale 1.7.0 you no longer have to specify the kind. The caveat is that statistics must be generated first. AppScale generates them every 24 hours, or you can force a tabulation by SSHing into your head node and running the following:

cd /root/appscale/AppDB
python groomer.py

Upload Data

To upload data to your AppScale application:

$> $APPSCALE_HOME/AppServer/appcfg.py upload_data \
                  --filename <upload_file> \
                  --url=http://<ip>:<port>/_ah/remote_api \
                  --application <app_id> \
                  --auth_domain appscale

Flags explained

filename: The name of the file that contains the data you want to upload.

url: The remote_api path of your application. Go to your application first to find out what IP/port you're using.

application: Your application identifier.

auth_domain: Tells the uploader you're using AppScale authentication.

Example

For the sample guestbook application, to upload the file downloaded in the previous example to another AppScale deployment:

$> $APPSCALE_HOME/AppServer/appcfg.py upload_data \
                  --filename guestbook_Greeting.dat \
                  --url=http://192.168.55.200:8080/_ah/remote_api \
                  --application guestbook \
                  --kind Greeting \
                  --auth_domain appscale
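
When the upload is scripted, it helps to confirm that the data file is actually present before calling appcfg.py. This is a sketch under stated assumptions: upload_command is a hypothetical helper, and it only assembles the flags shown above without running anything.

```python
import os


def upload_command(appscale_home, filename, url, app_id, kind=None):
    """Build the appcfg.py upload_data argument list for an AppScale app."""
    # Unlike a download, an upload needs the data file to exist already.
    if not os.path.isfile(filename):
        raise FileNotFoundError("data file not found: " + filename)
    cmd = [
        os.path.join(appscale_home, "AppServer", "appcfg.py"),
        "upload_data",
        "--filename", filename,
        "--url=" + url,
        "--application", app_id,
    ]
    if kind is not None:
        # --kind is included only when given, as in the guestbook example.
        cmd += ["--kind", kind]
    cmd += ["--auth_domain", "appscale"]
    return cmd
```

The resulting list mirrors the guestbook example when called with kind="Greeting" and the URL of the target deployment.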

Using Backup & Recovery tools

Doing a backup on AppScale

You can take a backup of the source code and data of an application deployed on AppScale by running the following command:

$> appscale-backup-data -a APP_ID --source-code

You can see other options supported by running:

$> appscale-backup-data --help

We strongly advise you to put your application into read-only mode before taking a backup to prevent inconsistencies.

Doing a restore on AppScale

You can restore data under the same or a different application already deployed on AppScale by running the following command:

$> appscale-restore-data -a APP_ID --backup-dir BACKUP_DIR

You can see other options supported by running:

$> appscale-restore-data --help

For a clean restore use the -c flag to delete existing entities for that app ID.
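
For scheduled backups or scripted restores, the two commands can be assembled programmatically. This is a minimal sketch that uses only the flags shown on this page; backup_command and restore_command are hypothetical helper names, and other options listed by --help are not covered.

```python
def backup_command(app_id, include_source=True):
    """Build the appscale-backup-data argument list for one application."""
    cmd = ["appscale-backup-data", "-a", app_id]
    if include_source:
        # Also back up the application's source code, not just its data.
        cmd.append("--source-code")
    return cmd


def restore_command(app_id, backup_dir, clean=False):
    """Build the appscale-restore-data argument list for one application."""
    cmd = ["appscale-restore-data", "-a", app_id, "--backup-dir", backup_dir]
    if clean:
        # -c deletes existing entities for the app ID before restoring.
        cmd.append("-c")
    return cmd
```

Either list can be passed to subprocess.call on a machine where the AppScale tools are installed. Because -c deletes existing entities, the sketch leaves clean off by default.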

Please report any issues you have to our issues page or tell someone on our IRC channel at #appscale on freenode.net.
