Docker enhancement #1277 (#1278)
* Add multiple build platforms

* Automatically update Dockerhub description

* Launch Python instead of Bash by default

* Change `omlp` directory name to less cryptic `openml`

* Change directory to `openml` for the purpose of running scripts

For mounted scripts, instructions say to mount them to `/openml`,
so we have to `cd` before invoking `python`.

* Update readme to reflect updates (python by default, rename dirs)

* Add branch/code for doc and test examples as they are required

* Ship docker images with readme

* Only update readme on release, also try build docker on PR

* Update the toc descriptions
PGijsbers authored and eddiebergman committed Jan 18, 2024
1 parent 2801e9d commit 241608a
Showing 4 changed files with 135 additions and 68 deletions.
20 changes: 18 additions & 2 deletions .github/workflows/release_docker.yaml
@@ -3,9 +3,13 @@ name: release-docker
on:
push:
branches:
- 'main'
- 'develop'
- 'docker'
tags:
- 'v*'
pull_request:
branches:
- 'develop'

jobs:

@@ -21,6 +25,7 @@ jobs:
uses: docker/setup-buildx-action@v2

- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
@@ -40,9 +45,20 @@ jobs:
uses: docker/build-push-action@v4
with:
context: ./docker/
push: true
tags: ${{ steps.meta_dockerhub.outputs.tags }}
labels: ${{ steps.meta_dockerhub.outputs.labels }}
platforms: linux/amd64,linux/arm64
push: ${{ github.event_name == 'push' }}

- name: Update repo description
if: ${{ startsWith(github.ref, 'refs/tags/v') }}
uses: peter-evans/dockerhub-description@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
repository: openml/openml-python
short-description: "pre-installed openml-python environment"
readme-filepath: ./docker/readme.md

- name: Image digest
run: echo ${{ steps.docker_build.outputs.digest }}
6 changes: 4 additions & 2 deletions docker/Dockerfile
@@ -2,15 +2,17 @@
# Useful for building docs or running unix tests from a Windows host.
FROM python:3.10

RUN git clone https://github.com/openml/openml-python.git omlp
WORKDIR omlp
RUN git clone https://github.com/openml/openml-python.git openml
WORKDIR openml
RUN python -m venv venv
RUN venv/bin/pip install wheel setuptools
RUN venv/bin/pip install -e .[test,examples,docs,examples_unix]

WORKDIR /
RUN mkdir scripts
ADD startup.sh scripts/
ADD readme.md /

# Due to the nature of the Docker container it might often be built from Windows.
# It is typical to have the files with \r\n line-ending, we want to remove it for the unix image.
RUN sed -i 's/\r//g' scripts/startup.sh
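The effect of the `sed` expression above can be checked outside of Docker. The sketch below is illustrative only: `strip_cr` is a hypothetical helper that is not part of the image, and it assumes GNU `sed`, which understands `\r` in a pattern.

```shell
# Hypothetical helper (not part of the Dockerfile): remove carriage returns
# from stdin using the same expression as the Dockerfile. Assumes GNU sed.
strip_cr() {
  sed 's/\r//g'
}

# Feed in a two-line script with Windows (\r\n) line endings and dump the
# cleaned bytes, so any remaining \r would be visible in the od output.
printf 'echo one\r\necho two\r\n' | strip_cr | od -c
```

The Dockerfile applies the same expression in place with `sed -i`, so the image always ships a `startup.sh` with unix line endings regardless of the host it was built on.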
153 changes: 99 additions & 54 deletions docker/readme.md
@@ -1,86 +1,131 @@
# OpenML Python Container

This docker container has the latest development version of openml-python downloaded and pre-installed.
It can be used to run the unit tests or build the docs in a fresh and/or isolated unix environment.
Instructions only tested on a Windows host machine.
This docker container has the latest version of openml-python downloaded and pre-installed.
It can also be used by developers to run unit tests or build the docs in
a fresh and/or isolated unix environment.
This document contains information about:

First pull the docker image:
1. [Usage](#usage): how to use the image and its main modes.
2. [Using local or remote code](#using-local-or-remote-code): useful when testing your own latest changes.
3. [Versions](#versions): identify which image to use.
4. [Development](#for-developers): information about the Docker image for developers.

docker pull openml/openml-python
*note:* each docker image is shipped with a readme, which you can read with:
`docker run --entrypoint=/bin/cat openml/openml-python:TAG readme.md`

## Usage

There are three main ways to use the image: running a pre-installed Python environment,
running tests, and building documentation.

docker run -it openml/openml-python [DOC,TEST] [BRANCH]
### Running `Python` with pre-installed `OpenML-Python` (default):

The image is designed to work with two specified directories which may be mounted ([`docker --mount documentation`](https://docs.docker.com/storage/bind-mounts/#start-a-container-with-a-bind-mount)).
You can mount your openml-python folder to the `/code` directory to run tests or build docs on your local files.
You can mount an `/output` directory to which the container will write output (currently only used for docs).
Each can be mounted by adding a `--mount type=bind,source=SOURCE,destination=/DESTINATION` where `SOURCE` is the absolute path to your code or output directory, and `DESTINATION` is either `code` or `output`.

E.g. mounting a code directory:
To run `Python` with a pre-installed `OpenML-Python` environment run:

docker run -i --mount type=bind,source="E:\\repositories/openml-python",destination="/code" -t openml/openml-python
```text
docker run -it openml/openml-python
```

E.g. mounting an output directory:
This accepts the normal `Python` arguments, e.g.:

docker run -i --mount type=bind,source="E:\\files/output",destination="/output" -t openml/openml-python
```text
docker run openml/openml-python -c "import openml; print(openml.__version__)"
```

You can mount both at the same time.
If you want to run a local script, it needs to be mounted first. Mount it into the
`openml` folder:

### Bash (default)
By default bash is invoked, you should also use the `-i` flag when starting the container so it processes input:
```
docker run -v PATH/TO/FILE:/openml/MY_SCRIPT.py openml/openml-python MY_SCRIPT.py
```

docker run -it openml/openml-python
### Running unit tests

### Building Documentation
There are two ways to build documentation, either directly from the `HEAD` of a branch on Github or from your local directory.
You can run the unit tests by passing `test` as the first argument.
It also requires a local or remote repository to be specified, which is explained
[below](#using-local-or-remote-code). For this example, we specify to test the
`develop` branch:

#### Building from a local repository
Building from a local directory requires you to mount it to the ``/code`` directory:
```text
docker run openml/openml-python test develop
```

docker run --mount type=bind,source=PATH_TO_REPOSITORY,destination=/code -t openml/openml-python doc
### Building documentation

The produced documentation will be in your repository's ``doc/build`` folder.
If an `/output` folder is mounted, the documentation will *also* be copied there.
You can build the documentation by passing `doc` as the first argument.
You should [mount](https://docs.docker.com/storage/bind-mounts/#start-a-container-with-a-bind-mount)
an output directory in which the docs will be stored. You also need to provide a remote
or local repository as explained in [the section below](#using-local-or-remote-code).
In this example, we build documentation for the `develop` branch.
On Windows:

#### Building from an online repository
Building from a remote repository requires you to specify a branch.
The branch may be specified by name directly if it exists on the original repository (https://github.com/openml/openml-python/):
```text
docker run --mount type=bind,source="E:\\files/output",destination="/output" openml/openml-python doc develop
```

docker run --mount type=bind,source=PATH_TO_OUTPUT,destination=/output -t openml/openml-python doc BRANCH
On Linux:
```text
docker run --mount type=bind,source="./output",destination="/output" openml/openml-python doc develop
```

See [the section below](#using-local-or-remote-code) for running against local changes
or a remote branch.

Where `BRANCH` is the name of the branch for which to generate the documentation.
It is also possible to build the documentation from the branch on a fork, in this case the `BRANCH` should be specified as `GITHUB_NAME#BRANCH` (e.g. `PGijsbers#my_feature`) and the name of the forked repository should be `openml-python`.
*Note: you can forgo mounting an output directory to test if the docs build successfully,
but the result will only be available within the docker container under `/openml/doc/build`.*

### Running tests
There are two ways to run tests, either directly from the `HEAD` of a branch on Github or from your local directory.
It works similar to building docs, but should specify `test` as mode.
For example, to run tests on your local repository:
## Using local or remote code

docker run --mount type=bind,source=PATH_TO_REPOSITORY,destination=/code -t openml/openml-python test

Running tests from the state of an online repository is supported similar to building documentation (i.e. specify `BRANCH` instead of mounting `/code`).

## Troubleshooting
You can build docs or run tests against your local repository or a Github repository.
In the examples below, change the `source` to match the location of your local repository.

### Using a local repository

To use a local directory, mount it in the `/code` directory, on Windows:

```text
docker run --mount type=bind,source="E:\\repositories/openml-python",destination="/code" openml/openml-python test
```

When you are mounting a directory you can check that it is mounted correctly by running the image in bash mode.
Navigate to the `/code` and `/output` directories and see if the expected files are there.
If e.g. there is no code in your mounted `/code`, you should double-check the provided path to your host directory.
On Linux:
```text
docker run --mount type=bind,source="/Users/pietergijsbers/repositories/openml-python",destination="/code" openml/openml-python test
```

## Notes for developers
This section contains some notes about the structure of the image, intended for those who want to work on it.
When building docs, you also need to mount an output directory as shown above, so add both:

```text
docker run --mount type=bind,source="./output",destination="/output" --mount type=bind,source="/Users/pietergijsbers/repositories/openml-python",destination="/code" openml/openml-python doc
```
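
Because both mounts follow the same `type=bind,source=...,destination=...` pattern, the option can be assembled by a small helper. This is only a sketch: `bind_mount` is a hypothetical function, not something shipped with the image, and it simply prints the option text.

```shell
# Hypothetical helper: print a --mount option that binds host directory $1
# to /$2 inside the container ($2 is expected to be "code" or "output").
bind_mount() {
  printf '%s\n' "--mount type=bind,source=$1,destination=/$2"
}

# Print (not run) the command for building docs from the current checkout.
# The unquoted $(...) relies on word splitting, so this only suits paths
# without spaces.
echo docker run $(bind_mount "$PWD/output" output) $(bind_mount "$PWD" code) openml/openml-python doc
```

Dropping the leading `echo` would run the command instead of printing it.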

### Using a Github repository
Building from a remote repository requires you to specify a branch.
The branch may be specified by name directly if it exists on the original repository (https://github.com/openml/openml-python/):

docker run --mount type=bind,source=PATH_TO_OUTPUT,destination=/output openml/openml-python [test,doc] BRANCH

Where `BRANCH` is the name of the branch for which to generate the documentation.
It is also possible to build the documentation from the branch on a fork,
in this case the `BRANCH` should be specified as `GITHUB_NAME#BRANCH` (e.g.
`PGijsbers#my_feature_branch`) and the name of the forked repository should be `openml-python`.
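
How a `GITHUB_NAME#BRANCH` argument could be split into a repository and a branch is sketched below. The helper name `resolve_branch` and the exact URL layout are assumptions made for illustration; the part of `startup.sh` that builds the URL is not shown in this diff.

```shell
# Hypothetical helper: print "<repository url> <branch>" for a BRANCH argument.
# A plain name refers to the original repository; NAME#BRANCH refers to a fork
# that is assumed to be called openml-python as well.
resolve_branch() {
  rb_spec="$1"
  case "$rb_spec" in
    *"#"*)
      rb_user="${rb_spec%%"#"*}"    # text before the first '#'
      rb_branch="${rb_spec#*"#"}"   # text after the first '#'
      ;;
    *)
      rb_user="openml"
      rb_branch="$rb_spec"
      ;;
  esac
  printf '%s\n' "https://github.com/${rb_user}/openml-python.git ${rb_branch}"
}

resolve_branch "develop"
resolve_branch "PGijsbers#my_feature_branch"
```

Plain parameter expansion keeps the sketch free of external tools and avoids the trailing-newline issue that the script's own comment mentions for `<<<`.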

## For developers
This section contains some notes about the structure of the image,
intended for those who want to work on it.

### Added Directories
The `openml/openml-python` image is built on a vanilla `python:3` image.
Additionally it contains the following files and directories:

- `/omlp`: contains the openml-python repository in the state with which the image was built by default.
If working with a `BRANCH`, this repository will be set to the `HEAD` of `BRANCH`.
- `/omlp/venv/`: contains the used virtual environment for `doc` and `test`. It has `openml-python` dependencies pre-installed.
When invoked with `doc` or `test`, the dependencies will be updated based on the `setup.py` of the `BRANCH` or mounted `/code`.
Additionally, it contains the following files and directories:

- `/openml`: contains the openml-python repository in the state with which the image
was built by default. If working with a `BRANCH`, this repository will be set to
the `HEAD` of `BRANCH`.
- `/openml/venv/`: contains the used virtual environment for `doc` and `test`. It has
`openml-python` dependencies pre-installed. When invoked with `doc` or `test`, the
dependencies will be updated based on the `setup.py` of the `BRANCH` or mounted `/code`.
- `/scripts/startup.sh`: the entrypoint of the image. Takes care of the automated features (e.g. `doc` and `test`).

## Building the image
To build the image yourself, execute `docker build -f Dockerfile .` from this directory.
It will use the `startup.sh` as is, so any local changes will be present in the image.
To build the image yourself, execute `docker build -f Dockerfile .` from the `docker`
directory of the `openml-python` repository. It will use the `startup.sh` as is, so any
local changes will be present in the image.
24 changes: 14 additions & 10 deletions docker/startup.sh
@@ -1,3 +1,6 @@
# Entry script to switch between the different Docker functionalities.
# By default, execute Python with OpenML pre-installed
#
# Entry script to allow docker to be run for bash, tests and docs.
# The script assumes a code repository can be mounted to ``/code`` and an output directory to ``/output``.
# Executes ``mode`` on ``branch`` or the provided ``code`` directory.
@@ -10,10 +13,11 @@
# Can be a branch on a Github fork, specified with the USERNAME#BRANCH format.
# The test or doc build is executed on this branch.

if [ -z "$1" ]; then
echo "Executing in BASH mode."
bash
exit
if [[ ! ( $1 = "doc" || $1 = "test" ) ]]; then
cd openml
source venv/bin/activate
python "$@"
exit 0
fi

# doc and test modes require mounted directories and/or specified branches
@@ -32,8 +36,8 @@ if [ "$1" == "doc" ] && [ -n "$2" ] && ! [ -d "/output" ]; then
fi

if [ -n "$2" ]; then
# if a branch is provided, we will pull it into the `omlp` local repository that was created with the image.
cd omlp
# if a branch is provided, we will pull it into the `openml` local repository that was created with the image.
cd openml
if [[ $2 == *#* ]]; then
# If a branch is specified on a fork (with NAME#BRANCH format), we have to construct the url before pulling
# We add a trailing '#' delimiter so the second element doesn't get the trailing newline from <<<
@@ -52,12 +56,12 @@ if [ -n "$2" ]; then
exit 1
fi
git pull
code_dir="/omlp"
code_dir="/openml"
else
code_dir="/code"
fi

source /omlp/venv/bin/activate
source /openml/venv/bin/activate
cd $code_dir
# The most recent ``main`` is already installed, but we want to update any outdated dependencies
pip install -e .[test,examples,docs,examples_unix]
@@ -71,6 +75,6 @@ if [ "$1" == "doc" ]; then
make html
make linkcheck
if [ -d "/output" ]; then
cp -r /omlp/doc/build /output
cp -r /openml/doc/build /output
fi
fi
fi
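
The mode selection that this commit introduces at the top of `startup.sh` (anything other than `doc` or `test` is handed to Python) can be sketched as a standalone function. `select_mode` is a hypothetical name used only for illustration; the real script runs the selected mode directly rather than printing it.

```shell
# Hypothetical helper (not part of startup.sh): report which mode the
# entrypoint would pick for a given first argument.
select_mode() {
  case "$1" in
    doc|test) printf '%s\n' "$1" ;;    # automated doc build or test run
    *)        printf '%s\n' "python" ;; # everything else goes to Python
  esac
}

select_mode doc
select_mode -c
select_mode
```

An empty argument list also falls through to `python`, which matches the new default of launching an interactive Python session instead of Bash.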
