[SPARK-37319][K8S] Support K8s image building with Java 17
### What changes were proposed in this pull request?

This PR aims to support K8s image building with Java 17.
Please note that more work is needed before all tests run successfully.

### Why are the changes needed?

The `OpenJDK` Docker Hub image switched its underlying OS from `Debian` to `OracleLinux` starting with Java 12.
As a result, `java_image_tag` no longer works for those versions, because the default Dockerfile relies on `apt`.

**BEFORE**
```
$ bin/docker-image-tool.sh -n -b java_image_tag=17 build
[+] Building 0.8s (6/17)
 => [internal] load build definition from Dockerfile                                                                                                 0.0s
 => => transferring dockerfile: 37B                                                                                                                  0.0s
 => [internal] load .dockerignore                                                                                                                    0.0s
 => => transferring context: 2B                                                                                                                      0.0s
 => [internal] load metadata for docker.io/library/openjdk:17                                                                                        0.4s
 => CACHED [ 1/13] FROM docker.io/library/openjdk:17@sha256:c7fffc2024948e6d75922025a17b7d81cb747fd0fe0167fef13c6fcfc72e4144                        0.0s
 => [internal] load build context                                                                                                                    0.1s
 => => transferring context: 69.25kB                                                                                                                 0.0s
 => ERROR [ 2/13] RUN set -ex &&     sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list &&     apt-get update &&     ln -s /li  0.2s
------
 > [ 2/13] RUN set -ex &&     sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list &&     apt-get update &&     ln -s /lib /lib64 &&     apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps &&     mkdir -p /opt/spark &&     mkdir -p /opt/spark/examples &&     mkdir -p /opt/spark/work-dir &&     touch /opt/spark/RELEASE &&     rm /bin/sh &&     ln -sv /bin/bash /bin/sh &&     echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su &&     chgrp root /etc/passwd && chmod ug+rw /etc/passwd &&     rm -rf /var/cache/apt/*:
#5 0.230 + sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list
#5 0.232 sed: can't read /etc/apt/sources.list: No such file or directory
------
executor failed running [/bin/sh -c set -ex &&     sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list &&     apt-get update &&     ln -s /lib /lib64 &&     apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps &&     mkdir -p /opt/spark &&     mkdir -p /opt/spark/examples &&     mkdir -p /opt/spark/work-dir &&     touch /opt/spark/RELEASE &&     rm /bin/sh &&     ln -sv /bin/bash /bin/sh &&     echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su &&     chgrp root /etc/passwd && chmod ug+rw /etc/passwd &&     rm -rf /var/cache/apt/*]: exit code: 2
Failed to build Spark JVM Docker image, please refer to Docker build output for details.
```
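
The failure above comes from the `sed` step, which assumes `/etc/apt/sources.list` exists in the base image. As a rough diagnostic sketch, the check below reports whether a root filesystem has that file. The function name `is_apt_based` is hypothetical and not part of Spark or `docker-image-tool.sh`; it only mirrors the file test that `sed` fails on.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): succeeds when the root filesystem
# at $1 (default: /) contains the apt sources file edited by the sed step.
is_apt_based() {
  root="${1:-/}"
  # Strip one trailing slash so "/" and "/some/root/" both resolve correctly.
  [ -f "${root%/}/etc/apt/sources.list" ]
}

# Report on the filesystem this script is running in.
if is_apt_based /; then
  echo "apt-based: /etc/apt/sources.list is present"
else
  echo "not apt-based: /etc/apt/sources.list is missing"
fi
```

Run inside the `openjdk:17` image from the log above, this would be expected to take the second branch, since that image has no `/etc/apt/sources.list`.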

**AFTER (This PR with `-f` option)**
```
$ bin/docker-image-tool.sh -n -f kubernetes/dockerfiles/spark/Dockerfile.java17 build
[+] Building 29.3s (19/19) FINISHED
 => [internal] load build definition from Dockerfile.java17                                                                                          0.0s
 => => transferring dockerfile: 2.49kB                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                    0.0s
 => => transferring context: 2B                                                                                                                      0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-slim                                                                              1.5s
 => [auth] library/debian:pull token for registry-1.docker.io                                                                                        0.0s
 => [internal] load build context                                                                                                                    0.1s
 => => transferring context: 80.54kB                                                                                                                 0.0s
 => CACHED [ 1/13] FROM docker.io/library/debian:bullseye-slim@sha256:dddc0f5f01db7ca3599fd8cf9821ffc4d09ec9d7d15e49019e73228ac1eee7f9              0.0s
 => [ 2/13] RUN set -ex &&     apt-get update &&     ln -s /lib /lib64 &&     apt install -y bash tini libc6 libpam-modules krb5-user libnss3 proc  25.5s
 => [ 3/13] COPY jars /opt/spark/jars                                                                                                                0.4s
 => [ 4/13] COPY bin /opt/spark/bin                                                                                                                  0.0s
 => [ 5/13] COPY sbin /opt/spark/sbin                                                                                                                0.0s
 => [ 6/13] COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/                                                                                    0.0s
 => [ 7/13] COPY kubernetes/dockerfiles/spark/decom.sh /opt/                                                                                         0.0s
 => [ 8/13] COPY examples /opt/spark/examples                                                                                                        0.0s
 => [ 9/13] COPY kubernetes/tests /opt/spark/tests                                                                                                   0.0s
 => [10/13] COPY data /opt/spark/data                                                                                                                0.0s
 => [11/13] WORKDIR /opt/spark/work-dir                                                                                                              0.0s
 => [12/13] RUN chmod g+w /opt/spark/work-dir                                                                                                        0.2s
 => [13/13] RUN chmod a+x /opt/decom.sh                                                                                                              0.2s
 => exporting to image                                                                                                                               1.3s
 => => exporting layers                                                                                                                              1.3s
 => => writing image sha256:ec961d957826c9b7eb4d00e900262130fc1708aef6cb51298b627d4bc91f834b                                                         0.0s
 => => naming to docker.io/library/spark                                                                                                             0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
```
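
To summarize the two invocation styles, here is a minimal sketch of a wrapper that picks the build arguments by Java version. The function `spark_image_build_args` is hypothetical and not part of `docker-image-tool.sh`; the `11-jre-slim` tag and the `Dockerfile.java17` path are taken from the tool's usage examples in this commit.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of docker-image-tool.sh): prints the extra
# arguments needed to build a Spark image for the given Java version.
spark_image_build_args() {
  java_version="$1"
  case "$java_version" in
    # Java 17 needs the dedicated Debian-based Dockerfile added by this PR.
    17) echo "-f kubernetes/dockerfiles/spark/Dockerfile.java17" ;;
    # Java 11 still works through the java_image_tag build argument.
    11) echo "-b java_image_tag=11-jre-slim" ;;
    *)  echo "unsupported Java version: $java_version" >&2
        return 1 ;;
  esac
}

# Example use from the top level of a Spark distribution (left commented out
# so the sketch has no side effects; the expansion is intentionally unquoted
# so the flag and its value become separate words):
#   bin/docker-image-tool.sh -n $(spark_image_build_args 17) build
```

Keeping the version-to-flag mapping in one place makes the distinction explicit: Java 17 switches the Dockerfile, while Java 11 only switches the base image tag.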

### Does this PR introduce _any_ user-facing change?

Yes, this adds a new Dockerfile that is exposed to users.

### How was this patch tested?

Verified that the K8s integration test (IT) build passes.

Closes #34586 from dongjoon-hyun/SPARK-37319.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Kousuke Saruta <[email protected]>
dongjoon-hyun authored and sarutak committed Nov 15, 2021
1 parent edbc7cf commit bb9e1d9
Showing 2 changed files with 70 additions and 3 deletions.
11 changes: 8 additions & 3 deletions bin/docker-image-tool.sh
@@ -232,7 +232,8 @@ Commands:
  push Push a pre-built image to a registry. Requires a repository address to be provided.
  Options:
- -f file Dockerfile to build for JVM based Jobs. By default builds the Dockerfile shipped with Spark.
+ -f file (Optional) Dockerfile to build for JVM based Jobs. By default builds the Dockerfile shipped with Spark.
+ For Java 17, use `-f kubernetes/dockerfiles/spark/Dockerfile.java17`
  -p file (Optional) Dockerfile to build for PySpark Jobs. Builds Python dependencies and ships with Spark.
  Skips building PySpark docker image if not specified.
  -R file (Optional) Dockerfile to build for SparkR Jobs. Builds R dependencies and ships with Spark.
@@ -267,15 +268,19 @@ Examples:
  $0 -r docker.io/myrepo -t v2.3.0 build
  $0 -r docker.io/myrepo -t v2.3.0 push
- - Build and push JDK11-based image with tag "v3.0.0" to docker.io/myrepo
+ - Build and push Java11-based image with tag "v3.0.0" to docker.io/myrepo
  $0 -r docker.io/myrepo -t v3.0.0 -b java_image_tag=11-jre-slim build
  $0 -r docker.io/myrepo -t v3.0.0 push
- - Build and push JDK11-based image for multiple archs to docker.io/myrepo
+ - Build and push Java11-based image for multiple archs to docker.io/myrepo
  $0 -r docker.io/myrepo -t v3.0.0 -X -b java_image_tag=11-jre-slim build
  # Note: buildx, which does cross building, needs to do the push during build
  # So there is no separate push step with -X
+ - Build and push Java17-based image with tag "v3.3.0" to docker.io/myrepo
+ $0 -r docker.io/myrepo -t v3.3.0 -f kubernetes/dockerfiles/spark/Dockerfile.java17 build
+ $0 -r docker.io/myrepo -t v3.3.0 push
  EOF
  }

@@ -0,0 +1,62 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# We need to build from debian:bullseye-slim because openjdk switches its underlying OS
# from debian to oraclelinux from openjdk:12
FROM debian:bullseye-slim

ARG spark_uid=185

# Before building the docker image, first build and make a Spark distribution following
# the instructions in http://spark.apache.org/docs/latest/building-spark.html.
# If this docker file is being used in the context of building your images from a Spark
# distribution, the docker build command should be invoked from the top level directory
# of the Spark distribution. E.g.:
# docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .

RUN set -ex && \
apt-get update && \
ln -s /lib /lib64 && \
apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps openjdk-17-jre && \
mkdir -p /opt/spark && \
mkdir -p /opt/spark/examples && \
mkdir -p /opt/spark/work-dir && \
touch /opt/spark/RELEASE && \
rm /bin/sh && \
ln -sv /bin/bash /bin/sh && \
echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
rm -rf /var/cache/apt/*

COPY jars /opt/spark/jars
COPY bin /opt/spark/bin
COPY sbin /opt/spark/sbin
COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
COPY kubernetes/dockerfiles/spark/decom.sh /opt/
COPY examples /opt/spark/examples
COPY kubernetes/tests /opt/spark/tests
COPY data /opt/spark/data

ENV SPARK_HOME /opt/spark

WORKDIR /opt/spark/work-dir
RUN chmod g+w /opt/spark/work-dir
RUN chmod a+x /opt/decom.sh

ENTRYPOINT [ "/opt/entrypoint.sh" ]

# Specify the User that the actual main process will run as
USER ${spark_uid}
