If you want to play with Airflow and the Kubernetes (K8S) executor, setting up your local system for even a simple example takes a lot of time. This repo aims to solve that: it installs Airflow with the K8S executor and provides a base template DAG that you can edit and adapt to your needs.
- During installation, this setup will run a local registry for Docker images so that minikube can pull images from your local machine.
- Once the setup is up and running, all you have to do is build the image locally and tag it as described below:
docker build -t myapp .
# Syntax => docker tag ${APP_NAME} localhost:5000/${APP_NAME}:${LABEL}
# example
docker tag myapp localhost:5000/myapp:1
- Download Docker Desktop from [here](https://www.docker.com/products/docker-desktop) and set it up on your local machine
- Download VirtualBox from here and set it up on your local machine
- Install Minikube, Helm & add package repo
brew update
# install minikube
brew install minikube
# install helm
brew install kubernetes-helm
# add package repository to helm
helm repo add stable https://kubernetes-charts.storage.googleapis.com
# All .py files in the dags/ folder will be loaded as DAGs
$ make run
# yup! that's it
# the command takes a few minutes to start, so trust the process & wait.
# Once the make command is done, execute the following commands in different terminals
$ kubectl get pods --watch # to monitor the pod health
$ minikube dashboard # to get the K8S dashboard
# If this URL doesn't load, wait a few minutes until the K8S dashboard becomes healthy (usually takes 6-10 minutes)
$ minikube service airflow-web -n airflow # to load the Airflow UI page
# Once you are done with the services, you can stop them all with the following command
$ make cleanup
# Commands that might be of interest during development
$ make run # -> starts everything from ground up
$ make restart # -> delete the charts & cluster and starts everything from ground up again!
$ make redeploy # -> delete the charts & re-install helm (airflow) chart again
$ make dags # -> copy the latest DAGs from your local machine to Airflow
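Since every .py file in the dags/ folder is loaded as a DAG and `make dags` syncs them into the cluster, a minimal sketch of a DAG file you could drop in there is shown below. This is illustrative only: the `dag_id`, schedule, and operator are assumptions, not the repo's actual template, and it uses Airflow 1.10-style imports.

```python
# Illustrative DAG file for dags/ -- every .py file in that folder is picked
# up by the scheduler. The dag_id, schedule and operator are assumptions,
# not the repo's actual template.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.10-style import

with DAG(
    dag_id="hello_k8s_executor",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,  # trigger manually from the Airflow UI
    catchup=False,
) as dag:
    # With the KubernetesExecutor, each task instance runs in its own pod.
    BashOperator(task_id="say_hello", bash_command="echo hello from a pod")
```

After editing a DAG, run `make dags` again to copy the latest version into Airflow.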
- I have included a sample `countries.csv` file.
- The idea is to create an example DAG with the following functionality (a rough sketch of the task logic follows this list):
  - Task 1: Take the input file & execute the `apply_lower` step
    - Converts all country names to lower case
    - The output of this step is stored to S3 in `run_id/apply_lower/output.csv`
  - Task 2: Take the output file from step 1 & execute the `apply_code` step
    - For each country name, we calculate the country code using the pycountry library
    - The output of this step is stored to S3 in `run_id/apply_code/output.csv`
  - Task 3: Take the output file from step 2 & execute the `apply_currency` step
    - For each country name, we make a REST call and parse the currency code
    - The output of this step is stored to S3 in `run_id/apply_currency/output.csv`
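As a rough, hedged sketch of what those three steps do (this is not the repo's actual `src/` code; the single `Country` column and the function names as written here are assumptions), the core logic looks roughly like this:

```python
# Rough sketch of the three task steps -- not the repo's actual src/ code.
# Assumes the input CSV has been read into a list of dicts with a "Country" column.
import pycountry  # Task 2 uses the pycountry library, per the description above


def apply_lower(rows):
    # Task 1: convert every country name to lower case
    return [{"Country": r["Country"].lower()} for r in rows]


def apply_code(rows):
    # Task 2: look up the ISO country code with pycountry
    out = []
    for r in rows:
        try:
            code = pycountry.countries.lookup(r["Country"]).alpha_3
        except LookupError:
            code = ""  # leave blank if the name can't be resolved
        out.append({**r, "Code": code})
    return out


def apply_currency(rows):
    # Task 3: the real code makes a REST call per country and parses the
    # currency code; the exact endpoint isn't documented here, so this is
    # left as a placeholder.
    raise NotImplementedError
```

In the actual container, the step to run and the S3 location of the input file are selected via the `TASK_NAME` and `S3_FILE_NAME` environment variables shown in the `docker run` commands below.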
This is not provided as a DAG; the code and Dockerfile are given so that you can focus on building the Docker image and creating the KubernetesPodOperator tasks for this application on your own (a hedged DAG sketch is included at the end of this section).
# To Build
cd src
docker build -t country .
# To run the application locally, execute the following commands
$ docker run -v ~/.aws:/app/credentials -e S3_FILE_NAME="{BUCKET_NAME}/airflow-poc/run_001/input/countries.csv" -e TASK_NAME="apply_lower" country
$ docker run -v ~/.aws:/app/credentials -e S3_FILE_NAME="{BUCKET_NAME}/airflow-poc/run_001/input/countries.csv" -e TASK_NAME="apply_code" country
$ docker run -v ~/.aws:/app/credentials -e S3_FILE_NAME="{BUCKET_NAME}/airflow-poc/run_001/input/countries.csv" -e TASK_NAME="apply_currency" country
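Once the image is built and pushed to the local registry (see the tagging example at the top of this README), you can wire the three steps into a DAG yourself. The sketch below uses KubernetesPodOperator with the Airflow 1.10-style contrib import (newer versions import it from the `cncf.kubernetes` provider); the image tag, namespace, `dag_id`, and bucket placeholder are assumptions based on the commands above, not the repo's own code.

```python
# Sketch of a DAG that runs the three steps as pods via KubernetesPodOperator.
# Assumptions: image tagged/pushed as localhost:5000/country:1, pods run in the
# "airflow" namespace, and {BUCKET_NAME} is your own bucket placeholder.
# Note: inside the cluster the pods will also need AWS credentials (e.g. via a
# mounted secret); the local docker run commands above mount ~/.aws instead.
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

S3_FILE = "{BUCKET_NAME}/airflow-poc/run_001/input/countries.csv"


def country_task(task_name):
    # One pod per step; the container picks the step from TASK_NAME,
    # exactly like the local docker run commands above.
    return KubernetesPodOperator(
        task_id=task_name,
        name=task_name,
        namespace="airflow",
        image="localhost:5000/country:1",
        env_vars={"S3_FILE_NAME": S3_FILE, "TASK_NAME": task_name},
        get_logs=True,
        is_delete_operator_pod=True,
    )


with DAG(
    dag_id="country_pipeline",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    country_task("apply_lower") >> country_task("apply_code") >> country_task("apply_currency")
```

Drop a file like this into dags/ and run `make dags` to have Airflow pick it up.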