Add support for ARM platform (#22127)
This support is mostly for developers; the full CI chain is not covered yet.
It has several limitations:

* no MySQL client support
* no MsSQL client support
* no CI tests yet

What is implemented:

* automated detection of ARM/AMD architecture when building and
  running breeze
* automated cache refresh on CI for ARM/AMD

Currently only development (ghcr.io) images are supported for ARM.

Fixes: #18849
Fixes: #17494
Relates to: #15635

The images published in DockerHub for now are AMD64 only. We will
run development with M1 images for some time and later we will
likely make our DockerHub images multi-platform as well.

Hadolint does not have ARM images yet, so we had to disable it;
we should re-enable it once ARM support is added.
See hadolint/hadolint#411
potiuk authored Mar 10, 2022
1 parent 161fcbf commit 828d1cb
Showing 32 changed files with 179 additions and 112 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/ci.yml
@@ -1366,7 +1366,7 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
 push-buildx-cache-to-github-registry:
 permissions:
 packages: write
-timeout-minutes: 40
+timeout-minutes: 120
 name: "Push images as cache to GitHub Registry"
 runs-on: ${{ fromJson(needs.build-info.outputs.runsOn) }}
 needs:
@@ -1383,6 +1383,9 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
 env:
 RUNS_ON: ${{ fromJson(needs.build-info.outputs.runsOn) }}
 PYTHON_MAJOR_MINOR_VERSION: ${{ matrix.python-version }}
+# Build cache for both platforms for development even if we are releasing
+# PROD images only for amd64
+PLATFORM: "linux/amd64,linux/arm64"
 # Rebuild images before push using the latest constraints (just pushed) without
 # eager upgrade. Do not wait for images, but rebuild them
 UPGRADE_TO_NEWER_DEPENDENCIES: "false"
16 changes: 8 additions & 8 deletions BREEZE.rst
@@ -1277,7 +1277,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
@@ -1492,7 +1492,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
@@ -1567,7 +1567,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
@@ -1649,7 +1649,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
@@ -1700,7 +1700,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
@@ -1910,7 +1910,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
@@ -1994,7 +1994,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
@@ -2409,7 +2409,7 @@ This is the current syntax for `./breeze <./breeze>`_:
 One of:
-linux/amd64
+linux/amd64 linux/arm64 linux/amd64,linux/arm64
 -d, --debian DEBIAN_VERSION
8 changes: 0 additions & 8 deletions Dockerfile
@@ -482,16 +482,8 @@ ARG AIRFLOW_VERSION
 # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
 # to learn more about the way how signals are handled by the image
 # Also set airflow as nice PROMPT message.
-# LD_PRELOAD is to workaround https://github.com/apache/airflow/issues/17546
-# issue with /usr/lib/x86_64-linux-gnu/libstdc++.so.6: cannot allocate memory in static TLS block
-# We do not yet a more "correct" solution to the problem but in order to avoid raising new issues
-# by users of the prod image, we implement the workaround now.
-# The side effect of this is slightly (in the range of 100s of milliseconds) slower load for any
-# binary started and a little memory used for Heap allocated by initialization of libstdc++
-# This overhead is not happening for binaries that already link dynamically libstdc++
 ENV DUMB_INIT_SETSID="1" \
 PS1="(airflow)" \
-LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libstdc++.so.6" \
 AIRFLOW_VERSION=${AIRFLOW_VERSION} \
 AIRFLOW__CORE__LOAD_EXAMPLES="false" \
 PIP_USER="true"
14 changes: 3 additions & 11 deletions Dockerfile.ci
@@ -138,8 +138,9 @@ ARG RUNTIME_APT_DEPS="\
 ARG HELM_VERSION="v3.6.3"

 RUN SYSTEM=$(uname -s | tr '[:upper:]' '[:lower:]') \
-&& HELM_URL="https://get.helm.sh/helm-${HELM_VERSION}-${SYSTEM}-amd64.tar.gz" \
-&& curl --silent --location "${HELM_URL}" | tar -xz -O "${SYSTEM}"-amd64/helm > /usr/local/bin/helm \
+&& PLATFORM=$([ "$(uname -m)" = "aarch64" ] && echo "arm64" || echo "amd64" ) \
+&& HELM_URL="https://get.helm.sh/helm-${HELM_VERSION}-${SYSTEM}-${PLATFORM}.tar.gz" \
+&& curl --silent --location "${HELM_URL}" | tar -xz -O "${SYSTEM}-${PLATFORM}/helm" > /usr/local/bin/helm \
 && chmod +x /usr/local/bin/helm

 ARG ADDITIONAL_RUNTIME_APT_DEPS=""
@@ -370,15 +371,6 @@ ENV PATH="/files/bin/:/opt/airflow/scripts/in_container/bin/:${PATH}" \
 BUILD_ID=${BUILD_ID} \
 COMMIT_SHA=${COMMIT_SHA}

-# This one is to workaround https://github.com/apache/airflow/issues/17546
-# issue with /usr/lib/x86_64-linux-gnu/libstdc++.so.6: cannot allocate memory in static TLS block
-# We do not yet a more "correct" solution to the problem but in order to avoid raising new issues
-# by users of the prod image, we implement the workaround now.
-# The side effect of this is slightly (in the range of 100s of milliseconds) slower load for any
-# binary started and a little memory used for Heap allocated by initialization of libstdc++
-# This overhead is not happening for binaries that already link dynamically libstdc++
-ENV LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libstdc++.so.6"
-
 # Link dumb-init for backwards compatibility (so that older images also work)
 RUN ln -sf /usr/bin/dumb-init /usr/local/bin/dumb-init
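
The Helm download step in the first Dockerfile.ci hunk selects the release archive by architecture. As a minimal sketch of that selection logic in Python (the function name `helm_download_info` is hypothetical and not part of the commit), assuming the same rule as the shell code, where `aarch64` maps to `arm64` and anything else falls back to `amd64`:

```python
from typing import Tuple


def helm_download_info(helm_version: str, system: str, machine: str) -> Tuple[str, str]:
    """Return (download URL, path of the helm binary inside the tarball).

    Hypothetical helper that mirrors the shell logic in the diff:
    the system name is lowercased, "aarch64" becomes "arm64",
    and every other machine name falls back to "amd64".
    """
    arch = "arm64" if machine == "aarch64" else "amd64"
    system = system.lower()
    url = f"https://get.helm.sh/helm-{helm_version}-{system}-{arch}.tar.gz"
    member = f"{system}-{arch}/helm"
    return url, member
```

For example, `helm_download_info("v3.6.3", "Linux", "aarch64")` yields the `linux-arm64` archive URL and the member path `linux-arm64/helm`. Because the fallback is `amd64`, a host that reports `arm64` rather than `aarch64` would get the `amd64` archive under this rule.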

5 changes: 4 additions & 1 deletion README.md
@@ -88,11 +88,14 @@ Apache Airflow is tested with:
 | | Main version (dev) | Stable version (2.2.4) |
 |---------------------|---------------------|--------------------------|
 | Python | 3.7, 3.8, 3.9 | 3.6, 3.7, 3.8, 3.9 |
+| Platform | AMD64/ARM64(\*) | AMD64 |
 | Kubernetes | 1.20, 1.21 | 1.18, 1.19, 1.20 |
 | PostgreSQL | 10, 11, 12, 13 | 9.6, 10, 11, 12, 13 |
 | MySQL | 5.7, 8 | 5.7, 8 |
 | SQLite | 3.15.0+ | 3.15.0+ |
-| MSSQL(Experimental) | 2017, 2019 | |
+| MSSQL | 2017(\*), 2019 (\*) | |

+\* Experimental
+
 **Note**: MySQL 5.x versions are unable to or have limitations with
 running multiple schedulers -- please see the [Scheduler docs](https://airflow.apache.org/docs/apache-airflow/stable/scheduler.html).
2 changes: 1 addition & 1 deletion airflow/providers/google/cloud/hooks/cloud_sql.py
@@ -451,7 +451,7 @@ def _download_sql_proxy_if_needed(self) -> None:
 self.log.info("cloud-sql-proxy is already present")
 return
 system = platform.system().lower()
-processor = "amd64" if CloudSqlProxyRunner._is_os_64bit() else "386"
+processor = os.uname().machine
 if not self.sql_proxy_version:
 download_url = CLOUD_SQL_PROXY_DOWNLOAD_URL.format(system, processor)
 else:
7 changes: 6 additions & 1 deletion breeze
@@ -3389,7 +3389,6 @@ function breeze::run_build_command() {
 build_images::prepare_prod_build
 build_images::build_prod_images
 else
-
 build_images::prepare_ci_build
 build_images::rebuild_ci_image_if_needed
 fi
@@ -3500,6 +3499,12 @@ function breeze::run_breeze_command() {
 case "${command_to_run}" in
 enter_breeze)
 docker_engine_resources::check_all_resources
+if [[ $(uname -m) == "arm64" || $(uname -m) == "aarch64" ]]; then
+if [[ ${BACKEND} == "mysql" || ${BACKEND} == "mssql" ]]; then
+echo "${COLOR_RED}MacOS with ARM processor is not supported for ${BACKEND} backend. Exiting.${COLOR_RESET}"
+exit 1
+fi
+fi
 if [[ ${PRODUCTION_IMAGE} == "true" ]]; then
 echo "${COLOR_RED}ERROR: Entering production image via breeze is not supported${COLOR_RESET}"
 echo
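
The guard added to `enter_breeze` can be read as a small predicate. The sketch below restates it in Python; `backend_supported` and the two constant names are hypothetical, chosen only for illustration. Note that the shell check matches any host reporting `arm64` or `aarch64`, so it also applies to ARM Linux machines even though the error message mentions MacOS.

```python
# Machine names that the shell guard treats as ARM (taken from the diff).
ARM_MACHINES = {"arm64", "aarch64"}

# Backends that the commit message lists as lacking ARM client support.
UNSUPPORTED_ARM_BACKENDS = {"mysql", "mssql"}


def backend_supported(machine: str, backend: str) -> bool:
    """Return False only for the ARM + mysql/mssql combinations rejected by the guard."""
    return not (machine in ARM_MACHINES and backend in UNSUPPORTED_ARM_BACKENDS)
```

With this predicate, `backend_supported("aarch64", "mssql")` is false, while any backend on `x86_64` and any other backend on ARM pass through.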
2 changes: 1 addition & 1 deletion breeze-complete
@@ -39,7 +39,7 @@ _breeze_allowed_executors="KubernetesExecutor CeleryExecutor LocalExecutor Celer
 _breeze_allowed_test_types="All Always Core Providers API CLI Integration Other WWW Postgres MySQL Helm Quarantined"
 _breeze_allowed_package_formats="both sdist wheel"
 _breeze_allowed_installation_methods=". apache-airflow"
-_breeze_allowed_platforms="linux/amd64"
+_breeze_allowed_platforms="linux/amd64 linux/arm64 linux/amd64,linux/arm64"

 # shellcheck disable=SC2034
 {
7 changes: 4 additions & 3 deletions chart/dockerfiles/pgbouncer-exporter/Dockerfile
@@ -23,9 +23,10 @@ WORKDIR /usr/src/myapp

 SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]

-RUN URL="https://github.com/jbub/pgbouncer_exporter/archive/v${PGBOUNCER_EXPORTER_VERSION}.tar.gz" && \
-curl -L "${URL}" | tar -zx --strip-components 1 && \
-GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -v
+RUN URL="https://github.com/jbub/pgbouncer_exporter/archive/v${PGBOUNCER_EXPORTER_VERSION}.tar.gz" \
+&& curl -L "${URL}" | tar -zx --strip-components 1 \
+&& PLATFORM=$([ "$(uname -m)" = "aarch64" ] && echo "arm64" || echo "amd64" )\
+&& GOOS=linux GOARCH="${PLATFORM}" CGO_ENABLED=0 go build -v

 FROM alpine:${ALPINE_VERSION} AS final
11 changes: 10 additions & 1 deletion dev/REFRESHING_CI_CACHE.md
@@ -78,7 +78,16 @@ git push
 # Manually refreshing the images

 Note that in order to refresh images you have to not only have `buildx` command installed for docker,
-but you should also make sure that you have the buildkit builder configured and set.
+but you should also make sure that you have the buildkit builder configured and set. Since we also build
+multi-platform images (for both AMD and ARM), you need to have support for qemu installed with appropriate
+flags.
+
+According to the [official installation instructions](https://docs.docker.com/buildx/working-with-buildx/#build-multi-platform-images)
+this can be achieved via:
+
+```shell
+docker run --privileged --rm tonistiigi/binfmt --install all
+```

 More information can be found [here](https://docs.docker.com/engine/reference/commandline/buildx_create/)
4 changes: 2 additions & 2 deletions dev/breeze/src/airflow_breeze/ci/build_params.py
@@ -14,7 +14,7 @@
 # KIND, either express or implied. See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
+import os
 from dataclasses import dataclass
 from datetime import datetime
 from typing import List, Optional
@@ -56,7 +56,7 @@ class BuildParams:
 additional_runtime_apt_command: str = ""
 additional_runtime_apt_deps: str = ""
 additional_runtime_apt_env: str = ""
-platform: str = "linux/amd64"
+platform: str = f"linux/{os.uname().machine}"
 debian_version: str = "bullseye"
 upgrade_to_newer_dependencies: str = "true"
3 changes: 2 additions & 1 deletion dev/breeze/src/airflow_breeze/global_constants.py
@@ -14,6 +14,7 @@
 # KIND, either express or implied. See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import os
 from pathlib import Path
 from typing import List

@@ -197,7 +198,7 @@ def get_available_packages() -> List[str]:


 # Initialise base variables
-DOCKER_DEFAULT_PLATFORM = "linux/amd64"
+DOCKER_DEFAULT_PLATFORM = f"linux/{os.uname().machine}"
 DOCKER_BUILDKIT = 1

 SSH_PORT = "12322"
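
One caveat with deriving the platform from `os.uname().machine`: the kernel typically reports names such as `x86_64` or `aarch64`, while the allowed values in `breeze-complete` are `linux/amd64` and `linux/arm64`. A minimal sketch of a normalization step, assuming a small lookup table is acceptable (the `docker_platform` helper and the table are hypothetical and not part of this commit):

```python
# Assumed mapping from machine names reported by the OS to the architecture
# names used in the allowed platform list (linux/amd64, linux/arm64).
_MACHINE_TO_DOCKER_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def docker_platform(machine: str) -> str:
    """Translate a machine name into a 'linux/<arch>' platform string.

    Raises ValueError for machine names outside the assumed table
    instead of silently producing an unsupported platform value.
    """
    try:
        arch = _MACHINE_TO_DOCKER_ARCH[machine.lower()]
    except KeyError:
        raise ValueError(f"Unsupported machine type: {machine}") from None
    return f"linux/{arch}"
```

A caller could pass `platform.machine()` (or `os.uname().machine` on POSIX systems) to obtain a default that matches one of the allowed platform strings.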
4 changes: 2 additions & 2 deletions dev/refresh_images.sh
@@ -31,5 +31,5 @@ fi

 python_version=$1

-./breeze prepare-build-cache --python "${python_version}" --verbose
-./breeze prepare-build-cache --python "${python_version}" --production-image --verbose
+./breeze prepare-build-cache --python "${python_version}" --platform linux/amd64,linux/arm64 --verbose
+./breeze prepare-build-cache --python "${python_version}" --platform linux/amd64,linux/arm64 --production-image --verbose
58 changes: 34 additions & 24 deletions scripts/ci/libraries/_build_images.sh
@@ -422,19 +422,19 @@ function build_images::rebuild_ci_image_if_needed_with_group() {
 }

 # Builds CI image - depending on the caching strategy (pulled, local, disabled) it
-# passes the necessary docker build flags via docker_ci_cache_directive array
+# passes the necessary docker build flags via docker_ci_directive array
 # it also passes the right Build args depending on the configuration of the build
 # selected by Breeze flags or environment variables.
 function build_images::build_ci_image() {
 build_images::check_if_buildx_plugin_available
 build_images::print_build_info
-local docker_ci_cache_directive
+local docker_ci_directive
 if [[ "${DOCKER_CACHE}" == "disabled" ]]; then
-docker_ci_cache_directive=("--no-cache")
+docker_ci_directive=("--no-cache")
 elif [[ "${DOCKER_CACHE}" == "local" ]]; then
-docker_ci_cache_directive=()
+docker_ci_directive=()
 elif [[ "${DOCKER_CACHE}" == "pulled" ]]; then
-docker_ci_cache_directive=(
+docker_ci_directive=(
 "--cache-from=${AIRFLOW_CI_IMAGE}:cache"
 )
 else
@@ -446,10 +446,19 @@ function build_images::build_ci_image() {
 if [[ ${PREPARE_BUILDX_CACHE} == "true" ]]; then
 # we need to login to docker registry so that we can push cache there
 build_images::login_to_docker_registry
-docker_ci_cache_directive+=(
+docker_ci_directive+=(
 "--cache-to=type=registry,ref=${AIRFLOW_CI_IMAGE}:cache"
-"--load"
+"--push"
 )
+if [[ ${PLATFORM} =~ .*,.* ]]; then
+echo
+echo "Skip loading docker image on multi-platform build"
+echo
+else
+docker_ci_directive+=(
+"--load"
+)
+fi
 fi
 local extra_docker_ci_flags=()
 if [[ ${CI} == "true" ]]; then
@@ -506,14 +515,10 @@ function build_images::build_ci_image() {
 --build-arg COMMIT_SHA="${COMMIT_SHA}" \
 "${additional_dev_args[@]}" \
 "${additional_runtime_args[@]}" \
-"${docker_ci_cache_directive[@]}" \
+"${docker_ci_directive[@]}" \
 -t "${AIRFLOW_CI_IMAGE}" \
 --target "main" \
 . -f Dockerfile.ci
-if [[ ${PREPARE_BUILDX_CACHE} == "true" ]]; then
-# Push the image as "latest" so that it can be used in Breeze
-docker_v push "${AIRFLOW_CI_IMAGE}"
-fi
 set -u
 if [[ -n "${IMAGE_TAG=}" ]]; then
 echo "Tagging additionally image ${AIRFLOW_CI_IMAGE} with ${IMAGE_TAG}"
@@ -573,7 +578,7 @@ function build_images::prepare_prod_build() {
 }

 # Builds PROD image - depending on the caching strategy (pulled, local, disabled) it
-# passes the necessary docker build flags via DOCKER_CACHE_PROD_DIRECTIVE and
+# passes the necessary docker build flags via docker_prod_directive and
 # docker_cache_prod_build_directive (separate caching options are needed for "build" segment of the image)
 # it also passes the right Build args depending on the configuration of the build
 # selected by Breeze flags or environment variables.
@@ -588,14 +593,15 @@ function build_images::build_prod_images() {
 echo
 return
 fi
-local docker_cache_prod_directive
+local docker_prod_directive
 if [[ "${DOCKER_CACHE}" == "disabled" ]]; then
-docker_cache_prod_directive=("--no-cache")
+docker_prod_directive=("--no-cache")
 elif [[ "${DOCKER_CACHE}" == "local" ]]; then
-docker_cache_prod_directive=()
+docker_prod_directive=()
 elif [[ "${DOCKER_CACHE}" == "pulled" ]]; then
-docker_cache_prod_directive=(
+docker_prod_directive=(
 "--cache-from=${AIRFLOW_PROD_IMAGE}:cache"
+"--push"
 )
 else
 echo
@@ -608,10 +614,18 @@ function build_images::build_prod_images() {
 # we need to login to docker registry so that we can push cache there
 build_images::login_to_docker_registry
 # Cache for prod image contains also build stage for buildx when mode=max specified!
-docker_cache_prod_directive+=(
+docker_prod_directive+=(
 "--cache-to=type=registry,ref=${AIRFLOW_PROD_IMAGE}:cache,mode=max"
-"--load"
 )
+if [[ ${PLATFORM} =~ .*,.* ]]; then
+echo
+echo "Skip loading docker image on multi-platform build"
+echo
+else
+docker_prod_directive+=(
+"--load"
+)
+fi
 fi
 set +u
 local additional_dev_args=()
@@ -661,14 +675,10 @@ function build_images::build_prod_images() {
 --build-arg AIRFLOW_IMAGE_README_URL="https://raw.githubusercontent.com/apache/airflow/${COMMIT_SHA}/docs/docker-stack/README.md" \
 "${additional_dev_args[@]}" \
 "${additional_runtime_args[@]}" \
-"${docker_cache_prod_directive[@]}" \
+"${docker_prod_directive[@]}" \
 -t "${AIRFLOW_PROD_IMAGE}" \
 --target "main" \
 . -f Dockerfile
-if [[ ${PREPARE_BUILDX_CACHE} == "true" ]]; then
-# Push the image as "latest" so that it can be used in Breeze
-docker_v push "${AIRFLOW_PROD_IMAGE}"
-fi
 set -u
 if [[ -n "${IMAGE_TAG=}" ]]; then
 echo "Tagging additionally image ${AIRFLOW_PROD_IMAGE} with ${IMAGE_TAG}"
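
The flag selection in `build_images::build_ci_image` can be sketched in Python as follows. This is an illustrative restatement of the CI image branch only, under the assumption that `DOCKER_CACHE`, `PREPARE_BUILDX_CACHE` and `PLATFORM` are passed in as plain arguments; the function name `ci_build_directives` is hypothetical. The PROD variant in the diff differs (it uses `mode=max` for `--cache-to` and places `--push` in the `pulled` branch). The sketch only mirrors which flags are chosen; it does not check whether a given builder accepts the combination.

```python
from typing import List


def ci_build_directives(
    image: str, docker_cache: str, prepare_buildx_cache: bool, platform: str
) -> List[str]:
    """Return the extra buildx flags chosen for a CI image build (hypothetical helper)."""
    if docker_cache == "disabled":
        directives = ["--no-cache"]
    elif docker_cache == "local":
        directives = []
    elif docker_cache == "pulled":
        directives = [f"--cache-from={image}:cache"]
    else:
        raise ValueError(f"Unknown DOCKER_CACHE value: {docker_cache}")

    if prepare_buildx_cache:
        # Cache is exported to the registry and the image is pushed.
        directives += [f"--cache-to=type=registry,ref={image}:cache", "--push"]
        # A comma in PLATFORM means a multi-platform build, for which
        # loading the image into the local docker engine is skipped.
        if "," not in platform:
            directives.append("--load")
    return directives
```

For a multi-platform cache refresh such as `PLATFORM="linux/amd64,linux/arm64"`, the result contains `--cache-from`, `--cache-to` and `--push` but no `--load`.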