
Commit

[AIRFLOW-5704] Improve Kind Kubernetes scripts for local testing (apache#6516)

* Fixed a problem where Kubernetes tests were testing the latest master
  rather than what came from the local sources.
* Kind (Kubernetes in Docker) is run in the same Docker engine as the Breeze env
* Moved Kubernetes scripts to the 'in_container' dir where they belong now
* Kubernetes cluster is reused until it is stopped
* Kubernetes image is built from the image already in Docker plus the mounted sources
* Kubectl version name is corrected in the Dockerfile
* KUBERNETES_VERSION can now be used to select the Kubernetes version
* Running Kubernetes scripts is now easy in Breeze
* We can start/recreate/stop the cluster using --<ACTION>-kind-cluster
* Instructions on how to run Kubernetes tests are updated
* The old "bare" environment is replaced by the --no-deps switch
potiuk authored and galuszkak committed Mar 5, 2020
1 parent 33155e2 commit 8eb54a0
Showing 39 changed files with 955 additions and 634 deletions.
9 changes: 3 additions & 6 deletions .dockerignore
@@ -39,12 +39,14 @@
!.coveragerc
!.rat-excludes
!.flake8
!.dockerignore
!pylintrc
!pytest.ini
!CHANGELOG.txt
!Dockerfile
!LICENSE
!MANIFEST.in
!NOTICE
!CHANGELOG.txt
!.github

# Avoid triggering context change on README change (new companies using Airflow)
@@ -54,16 +56,11 @@
# Run tests command with bash completion
!.bash_completion
!.bash_completion.d
!run-tests
!run-tests-complete

# Setup/version configuration
!setup.cfg
!setup.py

# Entrypoint script
!scripts/docker/entrypoint.sh

# Now - ignore unnecessary files inside allowed directories
# This goes after the allowed directories

5 changes: 3 additions & 2 deletions .gitignore
@@ -143,8 +143,8 @@ rat-results.txt
# Kubernetes generated templated files
*.generated
*.tar.gz
scripts/ci/kubernetes/kube/.generated/airflow.yaml
scripts/ci/kubernetes/docker/requirements.txt
scripts/ci/in_container/kubernetes/kube/.generated/airflow.yaml
scripts/ci/in_container/kubernetes/docker/requirements.txt

# Node & Webpack Stuff
*.entry.js
@@ -173,5 +173,6 @@ dmypy.json
/hive_scratch_dir/
/.bash_aliases
/.bash_history
/.kube
/.inputrc
log.txt*
1 change: 1 addition & 0 deletions .rat-excludes
@@ -31,6 +31,7 @@ CHANGELOG.txt
.*lock
unittests.cfg
logs
.bash_aliases

# Generated doc files
.*html
43 changes: 20 additions & 23 deletions .travis.yml
@@ -37,50 +37,47 @@ jobs:
script: ./scripts/ci/ci_run_all_static_tests.sh
env: >-
PYTHON_VERSION=3.6
AIRFLOW_MOUNT_SOURCE_DIR_FOR_STATIC_CHECKS="true"
- name: "Build documentation"
env: >-
PYTHON_VERSION=3.6
stage: pre-test
script: ./scripts/ci/ci_docs.sh
- name: "Tests postgres kubernetes python 3.6 (persistent)"
env: >-
BACKEND=postgres
PYTHON_VERSION=3.6
ENABLE_KIND_CLUSTER=true
KUBERNETES_MODE=persistent_mode
KUBERNETES_VERSION=v1.15.3
python: "3.6"
stage: test
- name: "Tests postgres kubernetes python 3.6 (git)"
env: >-
BACKEND=postgres
PYTHON_VERSION=3.6
ENABLE_KIND_CLUSTER=true
KUBERNETES_MODE=git_mode
KUBERNETES_VERSION=v1.15.3
python: "3.6"
stage: test
- name: "Tests postgres python 3.6"
env: >-
BACKEND=postgres
ENV=docker
PYTHON_VERSION=3.6
stage: test
- name: "Tests sqlite python 3.6"
env:
BACKEND=sqlite
ENV=docker
PYTHON_VERSION=3.6
stage: test
- name: "Tests mysql python 3.7"
env:
BACKEND=mysql
ENV=docker
PYTHON_VERSION=3.7
stage: test
- name: "Tests postgres kubernetes python 3.6 (persistent)"
env: >-
BACKEND=postgres
ENV=kubernetes
START_KUBERNETES_CLUSTER=true
KUBERNETES_VERSION=v1.15.0
KUBERNETES_MODE=persistent_mode
PYTHON_VERSION=3.6
stage: test
script: travis_wait 30 "./scripts/ci/ci_run_airflow_testing.sh"
- name: "Tests postgres kubernetes python 3.6 (git)"
env: >-
BACKEND=postgres
ENV=kubernetes
KUBERNETES_VERSION=v1.15.0
KUBERNETES_MODE=git_mode
PYTHON_VERSION=3.6
stage: test
script: travis_wait 30 "./scripts/ci/ci_run_airflow_testing.sh"
services:
- docker
before_install:
- ./scripts/ci/ci_before_install.sh
script: "./scripts/ci/ci_run_airflow_testing.sh"
script: ./scripts/ci/ci_run_airflow_testing.sh
122 changes: 89 additions & 33 deletions BREEZE.rst
@@ -220,7 +220,7 @@ follows:

.. code-block:: bash
./breeze --python 3.6 --backend mysql --env docker
./breeze --python 3.6 --backend mysql
The choices you make are persisted in the ``./.build/`` cache directory so that next time when you use the
``breeze`` script, it could use the values that were used previously. This way you do not have to specify
@@ -229,25 +229,6 @@ default settings.

The defaults when you run the Breeze environment are Python 3.6, Sqlite, and Docker.

Available Docker Environments
..............................

You can choose a container environment when you run Breeze with ``--env`` flag.
Running the default ``docker`` environment takes a considerable amount of resources. You can run a
slimmed-down version of the environment - just the Apache Airflow container - by choosing ``bare``
environment instead.

The following environments are available:

* The ``docker`` environment (default): starts all dependencies required by a full integration test suite
(Postgres, Mysql, Celery, etc). This option is resource intensive so do not forget to
[stop environment](#stopping-the-environment) when you are finished. This option is also RAM intensive
and can slow down your machine.
* The ``kubernetes`` environment: Runs Airflow tests within a Kubernetes cluster.
* The ``bare`` environment: runs Airflow in the Docker without any external dependencies.
It only works for independent tests. You can only run it with the sqlite backend.


Cleaning Up the Environment
---------------------------

@@ -528,6 +509,65 @@ As soon as you enter the Breeze environment, you can run Airflow unit tests via

For supported CI test suites, types of unit tests, and other tests, see `TESTING.rst <TESTING.rst>`_.

Running Tests with Kubernetes in Breeze
=======================================

In order to run Kubernetes tests in Breeze, you can start Breeze with the ``--start-kind-cluster`` switch.
This automatically creates a Kind Kubernetes cluster in the same ``docker`` engine that is used to run
Breeze. Setting up the Kubernetes cluster takes some time, so the cluster keeps running
until it is stopped with the ``--stop-kind-cluster`` switch or until the ``--recreate-kind-cluster``
switch is used instead of ``--start-kind-cluster``.
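
For example, a minimal invocation (using the mode and version defaults documented in the CLI
reference below) could look like this:

.. code-block:: bash

    ./breeze --start-kind-cluster --kubernetes-mode git_mode --kubernetes-version v1.15.3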

The cluster name follows the pattern ``airflow-python-X.Y.Z-vA.B.C``, where X.Y.Z is the Python version
and A.B.C is the Kubernetes version. This way you can have multiple clusters set up and running at the
same time for different Python and Kubernetes versions.

The control plane is available from inside the Docker image at the ``<CLUSTER_NAME>-control-plane:6443``
host:port, the worker of the kind cluster is available at ``<CLUSTER_NAME>-worker``,
and the webserver port for the worker is 30809.
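
As a quick sanity check (a sketch, assuming ``kubectl`` inside the Breeze container already points
at the kind cluster), you can list the nodes:

.. code-block:: bash

    kubectl get nodes
    # should show <CLUSTER_NAME>-control-plane and <CLUSTER_NAME>-worker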

The Kubernetes cluster is started, but in order to deploy Airflow to the Kubernetes cluster you need to:

1. Build the image.
2. Load it into the Kubernetes cluster.
3. Deploy the Airflow application.

This can be done with a single script: ``./scripts/ci/in_container/kubernetes/deploy_airflow_to_kubernetes.sh``
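
For example, assuming you are inside the Breeze shell and the kind cluster has been started:

.. code-block:: bash

    ./scripts/ci/in_container/kubernetes/deploy_airflow_to_kubernetes.sh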

You can, however, work separately on the image in Kubernetes and on deploying the Airflow app in the cluster.

Building Airflow Images and Loading them to Kubernetes cluster
--------------------------------------------------------------

This is done using the ``./scripts/ci/in_container/kubernetes/docker/rebuild_airflow_image.sh`` script:

1. The latest ``apache/airflow:master-pythonX.Y-ci`` images are rebuilt using the latest sources.
2. A new Kubernetes image based on ``apache/airflow:master-pythonX.Y-ci`` is built with the
   scripts necessary to run in Kubernetes added to it. The image is tagged with the
   ``apache/airflow:master-pythonX.Y-ci-kubernetes`` tag.
3. The image is loaded into the kind cluster using the ``kind load`` command (an illustrative
   invocation is shown below).
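
The snippet below only illustrates what step 3 amounts to; the script fills in the actual image tag
and cluster name following the patterns described above:

.. code-block:: bash

    # illustrative only - the rebuild script performs this step itself
    kind load docker-image apache/airflow:master-python3.6-ci-kubernetes --name <CLUSTER_NAME>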

Deploying Airflow Application in the Kubernetes cluster
-------------------------------------------------------

This is done using the ``./scripts/ci/in_container/kubernetes/app/deploy_app.sh`` script:

1. Kubernetes resources are prepared by processing templates from the ``template`` directory,
   replacing variables with the right images and locations:
   - configmaps.yaml
   - airflow.yaml
2. The existing resources are used without replacing any variables inside:
   - secrets.yaml
   - postgres.yaml
   - volumes.yaml
3. All the resources are applied in the kind cluster.
4. The script waits until all the applications are ready and reachable (see the check below).
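
If you want to check the state of the deployment yourself, a simple sketch (it lists all namespaces
rather than assuming which one the resources are deployed to) is:

.. code-block:: bash

    kubectl get pods --all-namespaces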

After the deployment is finished, you can run the Kubernetes tests immediately in the same way as other tests.
The Kubernetes tests are in the ``tests/integration/kubernetes`` folder.

You can run all the integration tests for Kubernetes with ``pytest tests/integration/kubernetes``.
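
For example, from inside the Breeze shell after the deployment has finished:

.. code-block:: bash

    pytest tests/integration/kubernetes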

Breeze Command-Line Interface Reference
=======================================

@@ -646,22 +686,35 @@ This is the current syntax for `./breeze <./breeze>`_:
Python version used for the image. This is always major/minor version.
One of [ 3.6 3.7 ]. Default is the python3 or python on the path.
-E, --env <ENVIRONMENT>
Environment to use for tests. It determines which types of tests can be run.
One of [ docker kubernetes ]. Default: docker
-B, --backend <BACKEND>
Backend to use for tests - it determines which database is used.
One of [ sqlite mysql postgres ]. Default: sqlite
-K, --kubernetes-version <KUBERNETES_VERSION>
Kubernetes version - only used in case of 'kubernetes' environment.
One of [ v1.13.0 ]. Default: v1.13.0
-K, --start-kind-cluster
Starts kind Kubernetes cluster after entering the environment. The cluster is started using
Kubernetes Mode selected and Kubernetes version specified via --kubernetes-mode and
--kubernetes-version flags.
-Z, --recreate-kind-cluster
Recreates kind Kubernetes cluster if one has already been created. By default, if you do not stop
environment, the Kubernetes cluster created for testing is continuously running and when
you start Kubernetes testing again it will be reused. You can force deletion and recreation
of such cluster with this flag.
-X, --stop-kind-cluster
Stops kind Kubernetes cluster if one has already been created. By default, if you do not stop
environment, the Kubernetes cluster created for testing is continuously running and when
you start Kubernetes testing again it will be reused. You can force deletion and recreation
of such cluster with this flag.
-M, --kubernetes-mode <KUBERNETES_MODE>
Kubernetes mode - only used in case of 'kubernetes' environment.
Kubernetes mode - only used in case --start-kind-cluster flag is specified.
One of [ persistent_mode git_mode ]. Default: git_mode
-V, --kubernetes-version <KUBERNETES_VERSION>
Kubernetes version - only used in case --start-kind-cluster flag is specified.
One of [ v1.15.3 v1.16.2 ]. Default: v1.15.3
-s, --skip-mounting-source-volume
Skips mounting local volume with sources - you get exactly what is in the
docker image rather than your current local sources of airflow.
@@ -694,15 +747,19 @@ This is the current syntax for `./breeze <./breeze>`_:
automatically for the first time or when changes are detected in
package-related files, but you can force it using this flag.
-R, --force-build-images-clean
Force build images without cache. This will remove the pulled or build images
and start building images from scratch. This might take a long time.
-p, --force-pull-images
Forces pulling of images from DockerHub before building to populate cache. The
images are pulled by default only for the first time you run the
environment, later the locally build images are used as cache.
-R, --force-clean-build
Force build images without cache at all. This will remove the pulled or build images
and start building images from scratch. This might take a long time.
-L, --use-local-cache
Uses local cache to build images. No pulled images will be used, but results of local builds in
the Docker cache are used instead.
-u, --push-images
After building - uploads the images to DockerHub
It is useful in case you use your own DockerHub user to store images and you want
Expand All @@ -715,7 +772,6 @@ This is the current syntax for `./breeze <./breeze>`_:
.. END BREEZE HELP MARKER
Convenience Scripts
-------------------

16 changes: 12 additions & 4 deletions Dockerfile
@@ -125,6 +125,7 @@ RUN mkdir -pv /usr/share/man/man1 \
&& apt-get install --no-install-recommends -y \
gnupg \
apt-transport-https \
bash-completion \
ca-certificates \
software-properties-common \
krb5-user \
@@ -197,14 +198,14 @@ RUN curl -L https://download.docker.com/linux/debian/gpg | apt-key add - \
&& apt-get clean && rm -rf /var/lib/apt/lists/*

# Install kubectl
ARG KUBECTL_VERSION="v1.15.0"
ARG KUBECTL_VERSION="v1.15.3"

RUN KUBECTL_URL="https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl" \
&& curl -L "${KUBECTL_URL}" -o "/usr/local/bin/kubectl" \
&& chmod +x /usr/local/bin/kubectl

# Install Kind
ARG KIND_VERSION="v0.5.0"
ARG KIND_VERSION="v0.6.1"

RUN KIND_URL="https://github.com/kubernetes-sigs/kind/releases/download/${KIND_VERSION}/kind-linux-amd64" \
&& curl -L "${KIND_URL}" -o "/usr/local/bin/kind" \
@@ -361,7 +362,7 @@ COPY airflow/www/ ${AIRFLOW_SOURCES}/airflow/www/
# Package NPM for production
RUN yarn run prod

COPY ./scripts/docker/entrypoint.sh /entrypoint.sh
COPY scripts/docker/entrypoint.sh /entrypoint.sh

# Copy selected subdirectories only
COPY .github/ ${AIRFLOW_SOURCES}/.github/
@@ -377,9 +378,16 @@ COPY .coveragerc .rat-excludes .flake8 pylintrc LICENSE MANIFEST.in NOTICE CHANG
setup.cfg setup.py \
${AIRFLOW_SOURCES}/

# Intall autocomplete
# Needed for building images via docker-in-docker inside the docker
COPY Dockerfile ${AIRFLOW_SOURCES}/Dockerfile

# Install autocomplete for airflow
RUN register-python-argcomplete airflow >> ~/.bashrc

# Install autocomplete for kubectl
RUN echo "source /etc/bash_completion" >> ~/.bashrc \
&& kubectl completion bash >> ~/.bashrc

WORKDIR ${AIRFLOW_SOURCES}

# Additional python deps to install