Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update wordcount example to use docker-flink image #135 #136

Merged
merged 1 commit into from
Nov 27, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 12 additions & 58 deletions examples/wordcount/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,34 +1,14 @@
FROM openjdk:8-jdk
FROM flink:1.8.2-scala_2.12

# Prepare environment
ENV FLINK_HOME=/opt/flink
ENV MAVEN_HOME=/opt/maven
ENV HADOOP_HOME=/opt/hadoop
ENV PATH=$FLINK_HOME/bin:$HADOOP_HOME/bin:$MAVEN_HOME/bin:$PATH

COPY . /code

# Configure Flink version
ENV FLINK_VERSION=1.8.2 \
HADOOP_SCALA_VARIANT=scala_2.12
ENV PATH=$MAVEN_HOME/bin:$PATH

# Install dependencies
RUN set -ex; \
apt-get update; \
apt-get -y install libsnappy1v5; \
apt-get -y install netcat net-tools; \
apt-get -y install gettext-base; \
rm -rf /var/lib/apt/lists/*

# Grab gosu for easy step-down from root
ENV GOSU_VERSION 1.11
RUN set -ex; \
wget -nv -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$(dpkg --print-architecture)"; \
wget -nv -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$(dpkg --print-architecture).asc"; \
export GNUPGHOME="$(mktemp -d)"; \
rm -rf "$GNUPGHOME" /usr/local/bin/gosu.asc; \
chmod +x /usr/local/bin/gosu; \
gosu nobody true
apt-get update \
&& apt-get -y install gettext-base openjdk-8-jdk-headless \
&& rm -rf /var/lib/apt/lists/*

# Install Maven
ENV MAVEN_VERSION 3.6.1
Expand All @@ -38,39 +18,13 @@ RUN \
mv apache-maven-$MAVEN_VERSION $MAVEN_HOME; \
rm apache-maven-$MAVEN_VERSION-bin.tar.gz

# Build application jar
COPY . /code
WORKDIR /code

RUN \
mvn package; \
mkdir -p /opt/flink/flink-web-upload; \
cp flink-conf.yaml /usr/local/; \
cp /code/target/*.jar /opt/flink/flink-web-upload/

RUN groupadd --system --gid=9999 flink && \
useradd --system --home-dir $FLINK_HOME --uid=9999 --gid=flink flink
WORKDIR $FLINK_HOME

ENV FLINK_URL_FILE_PATH=flink/flink-${FLINK_VERSION}/flink-${FLINK_VERSION}-bin-${HADOOP_SCALA_VARIANT}.tgz
ENV FLINK_TGZ_URL=https://archive.apache.org/dist/$FLINK_URL_FILE_PATH

# Install Flink
RUN set -ex; \
wget -nv -O flink.tgz "$FLINK_TGZ_URL"; \
\
tar -xf flink.tgz --strip-components=1; \
rm flink.tgz; \
\
cp ./opt/flink-s3-fs-presto-${FLINK_VERSION}.jar ./lib/;\
cp ./opt/flink-s3-fs-hadoop-${FLINK_VERSION}.jar ./lib/;\
\
chown -R flink:flink .;

# Needed on OpenShift for the entrypoint script to work
RUN chmod -R 777 /opt/flink
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn package \
&& ln -s /code/target $FLINK_HOME/flink-web-upload

# control script expects manifest.yaml at this location
RUN chown -R flink:flink /var
COPY docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]
EXPOSE 6123 8081
CMD ["local"]
COPY docker-entrypoint.sh /flinkk8soperator-entrypoint.sh
ENTRYPOINT ["/flinkk8soperator-entrypoint.sh"]
CMD ["help"]
45 changes: 5 additions & 40 deletions examples/wordcount/docker-entrypoint.sh
Original file line number Diff line number Diff line change
@@ -1,45 +1,10 @@
#!/bin/sh

drop_privs_cmd() {
if [ $(id -u) != 0 ]; then
# Don't need to drop privs if EUID != 0
return
elif [ -x /sbin/su-exec ]; then
# Alpine
echo su-exec
else
# Others
echo gosu flink
fi
}

# Add in extra configs set by the operator
# Map config from FlinkK8sOperator to base container
# https://github.com/lyft/flinkk8soperator/issues/135
# https://github.com/docker-flink/docker-flink/pull/91
if [ -n "$OPERATOR_FLINK_CONFIG" ]; then
echo "$OPERATOR_FLINK_CONFIG" >> "/usr/local/flink-conf.yaml"
fi

envsubst < /usr/local/flink-conf.yaml > $FLINK_HOME/conf/flink-conf.yaml

COMMAND=$@

if [ $# -lt 1 ]; then
COMMAND="local"
fi

if [ "$COMMAND" = "help" ]; then
echo "Usage: $(basename "$0") (jobmanager|taskmanager|local|help)"
exit 0
elif [ "$FLINK_DEPLOYMENT_TYPE" = "jobmanager" ]; then
echo "Starting Job Manager"
echo "config file: " && grep '^[^\n#]' "$FLINK_HOME/conf/flink-conf.yaml"
exec $(drop_privs_cmd) "$FLINK_HOME/bin/jobmanager.sh" start-foreground
elif [ "$FLINK_DEPLOYMENT_TYPE" = "taskmanager" ]; then
echo "Starting Task Manager"
echo "config file: " && grep '^[^\n#]' "$FLINK_HOME/conf/flink-conf.yaml"
exec $(drop_privs_cmd) "$FLINK_HOME/bin/taskmanager.sh" start-foreground
elif [ "$COMMAND" = "local" ]; then
echo "Starting local cluster"
exec $(drop_privs_cmd) "$FLINK_HOME/bin/jobmanager.sh" start-foreground local
export FLINK_PROPERTIES="`echo \"${OPERATOR_FLINK_CONFIG}\" | envsubst`"
fi

exec "$@"
exec /docker-entrypoint.sh "$@"
31 changes: 0 additions & 31 deletions examples/wordcount/flink-conf.yaml

This file was deleted.

8 changes: 6 additions & 2 deletions examples/wordcount/flink-operator-custom-resource.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -8,23 +8,27 @@ metadata:
environment: development
spec:
image: docker.io/lyft/wordcount-operator-example:{sha}
deleteMode: None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this added?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job finishes and the operator tries to take a savepoint. That's a bug that needs to be fixed separately.

For the demo, the application should just be deleted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it has a bounded input. We should probably fix that, as the operator doesn't (intentionally) support jobs that complete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't see why the operator shouldn't support such jobs. Created #138 to follow up on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I mean any support currently in the operator is accidental; we haven't tested this use case or thought deeply about how to support it. Would be good to have better support but it's not a use case internally.

flinkConfig:
taskmanager.heap.size: 200
taskmanager.network.memory.fraction: 0.1
taskmanager.network.memory.min: 10m
state.backend.fs.checkpointdir: file:///checkpoints/flink/checkpoints
state.checkpoints.dir: file:///checkpoints/flink/externalized-checkpoints
state.savepoints.dir: file:///checkpoints/flink/savepoints
web.upload.dir: /opt/flink
jobManagerConfig:
resources:
requests:
memory: "200Mi"
cpu: "0.2"
cpu: "0.1"
replicas: 1
taskManagerConfig:
taskSlots: 2
resources:
requests:
memory: "200Mi"
cpu: "0.2"
cpu: "0.1"
flinkVersion: "1.8"
jarName: "wordcount-operator-example-1.0.0-SNAPSHOT.jar"
parallelism: 3
Expand Down