Local Tutorial Outputs to disk

Vijay Upadya edited this page Oct 3, 2019 · 12 revisions

The Output tab lets you configure the outputs that data is routed to. Because we are running in local mode, this tutorial shows how to write data to the local file system. In cloud mode (that is, once Data Accelerator is deployed to Azure), you can also send data to other sinks such as Azure blobs and CosmosDB.

In this tutorial, you'll learn to:

  • Add an output location
  • Write data to this output location

Setting up an output

  • Open your Flow
  • Open the Output tab and select Add 'Local' to configure a new output
  • For Alias, input "myOutput"; this is how the output will be referred to throughout the Flow
  • For Folder, you can input '/myOutput'; this is the folder within the docker container that the data will be written to
  • Format will be JSON
  • For compression, you can choose either GZIP or none
  • Go back to the Query tab and add a new OUTPUT statement at the end to write to the local file system:
--DataXQuery--
events = SELECT MAX(temperature) as maxTemp
         FROM DataXProcessedInput;

maxTemperature = CreateMetric(events, maxTemp);

OUTPUT maxTemperature TO Metrics;
OUTPUT events TO myOutput;

The Query window should now contain the query shown above.

  • Click Deploy.

You have connected the Flow to a new output.

View output within a docker container

You can view files within a container by starting a bash session inside it. This is useful for inspecting the output your Flow writes. From that session, cd into the folder you specified as the Local Folder URI when adding the local output location (e.g. /app/aspnetcore/output).

  • To view the output data, start a bash session inside the container
    docker exec -it dataxlocal /bin/bash
    
  • View the contents of a folder
    ls
    
  • Navigate to the output folder you configured. Inside it, you will find subfolders of the form YYYY/MM/dd/hh/mm/batch-interval (UTC time); the data is stored inside the time subfolder, with a single file per output. Example: cd /myOutput/2019/04/23/45/234800
    cd <folder name>
    
  • View the contents of a file. Example: cat part-0.json
    cat <filename>
    

Other Links

Data Accelerator

Install

Docs
