Local mode with Docker

Installation

To unleash the full power of Data Accelerator, deploy to Azure. We have also enabled running Data Accelerator locally, without any cloud dependencies, however, the features are very limited (no Live Query, Auto Schema inference, etc.). To run Data Accelerator locally, follow Local deployment steps below.

Local Deployment

Run Data Accelerator locally by downloading and running docker container. Even though the features are very limited compared to cloud mode, it gives you a cursory feel of the overall experience quickly.

Prerequisites:

docker (To get more info on this, see the FAQ).
Once docker is installed and running, update the docker Settings (Note if you run the docker with less resources, your experience may be degraded or processing may lag particularly around the sample Flow):
Right click on docker in the System Tray-->Settings-->Advanced-->CPU: 6 cores; Memory: at least 5 GB (5120 MB).
PowerShell (Windows has this by default, Linux users will have to install from this location). Mac users can use Terminal which is available by default.

Deployment

To start, run the below commands in Powershell on Windows (and approve subsequent elevation request) or in Terminal on Mac. (This will get the latest Data Accelerator image)
```
docker run --rm --name dataxlocal -d -p 127.0.0.1:49080:2020 -p 127.0.0.1:4040:4040 mcr.microsoft.com/datax/dataxlocal:v1	
```
- If you want to get the latest docker image, delete the one you have downloaded previously and then run the above command. To delete already downloaded image, follow these steps:
  - Run these commands (in case you haven't already done so):
```
docker stop dataxlocal            	
docker images -a	
```
  - This will list all the images on your box. Note the ImageId for all images listed where the repository equals mcr.microsoft.com/datax/dataxlocal and then run the following command for each of the ImageId to remove them from the machine:
```
docker image rm <ImageId>  	
```

Open the portal at: http://localhost:49080/home to start Data Accelerator and create your first Flow and / or checkout the samples
Check out step by step tutorials for local mode

Running a job

To try out the sample: Go to http://localhost:49080/config, select "BasicLocal" flow.
Make an edit (for example, go to Query tab and enter a space in the editor), then Click ‘Deploy’
Open the Metric tab and click on your Flow name. You should see your 2 default metrics which exist for all flows by default. Note: Currently running only 1 job at a time is supported for the local scenario. You can control which job to run, by clicking on “Jobs” tab and starting/stopping jobs to run.

Logs

To view Spark job logs for checking job execution or for diagnosing issues, run the following command
```
docker logs --tail 1000 dataxlocal	
```
To learn more, see the tutorial on logs

SSH into the docker container

Run the following command to view files in the docker container

```
docker exec -it dataxlocal /bin/bash	
```

Stopping the docker container and cleaning images

When finished with the container, run the following stop the container to free up used resources.
```
docker stop dataxlocal		
```
See the FAQ to learn more how to remove all the dangling images.

FAQ and troubleshooting:

Please refer to the FAQ.

Data Accelerator

Home

Install

Docs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly