Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Alpine-based image with a Debian-based image #21

Merged
merged 1 commit into from
Feb 5, 2021
Merged

Conversation

huddlej
Copy link
Contributor

@huddlej huddlej commented Mar 4, 2020

Alpine Linux uses musl libc instead of glibc which means it cannot use most
pre-built Python wheels for glibc-based Linux distributions [1]. As a result, Python
packages need to be built from source, leading to longer build times and
increased complexity in the Dockerfile for more complex packages. For complex
packages like opencv-python, building from source is especially non-trivial.

This commit ports the existing Alpine-based Dockerfile to a minimal Debian-based
file with pre-installed Python 3 (the official Python Docker image) and removes
dependencies that were originally required to build Python packages from
sources (e.g., cvxopt) where installation from a wheel is now possible.

This commit also removes all explicit references to Python 2 dependencies, as
fauna is compatible enough with Python 3 for downloading to work and sacra is
not actively under development.

This new image has been tested with the zika-tutorial, zika, and seasonal-flu
repositories.

[1] docker-library/docs#904

@huddlej huddlej requested a review from tsibley March 4, 2020 21:11
Copy link
Member

@tsibley tsibley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had a chance to test this, but reading through your changes this looks like a great direction. 💯

It looks like the final, compressed image size is 5MB smaller than the current image from master! Nice!

It might also be informative to build this image locally and look at its docker image history which will show the sizes of all the intermediate layers. That can be useful for identifying large layers for further inspection.

I was slightly worried that not having a Python 2 install would break some repos/uses of the image which expected python2 to exist, but I don't actually find any of those when searching the github.com/nextstrain repos.

Dockerfile Outdated Show resolved Hide resolved
@@ -143,6 +97,13 @@ RUN pip3 install --requirement=/nextstrain/fauna/requirements.txt
# accessible and importable.
RUN pip3 install --editable /nextstrain/augur

# Install additional "full" dependencies for augur.
# TODO: these versions should be updated in augur's setup.py to work with the
# pip install nextstrain-augur[full] approach.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for updating setup.py to declare these extras and then using that extra tag here.

Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Show resolved Hide resolved
@tsibley
Copy link
Member

tsibley commented Mar 18, 2020

@huddlej I really want to merge this and leave Alpine behind, but I'm hesitant because it seems like a large underlying change that could break existing nextstrain build uses between updates. Any thoughts here? Should we just go for it and commit to quickly fixing any problems that arise?

@huddlej
Copy link
Contributor Author

huddlej commented Mar 18, 2020

@tsibley I feel exactly the same way! I would prefer for this image to get more rigorous testing before we merge.

For local builds, it should be as easy as this, right?

# Pull the new image
docker pull nextstrain/base:branch-python-base

# Run a build with the new image
nextstrain build --image nextstrain/base:branch-python-base .

Maybe @lmoncla would be able to try it out since she just got ncov builds up and running. I'll ping her about this.

@tsibley
Copy link
Member

tsibley commented Mar 18, 2020

That nextstrain command should do it. nextstrain will automatically pull down the image if needed, so not necessary to run docker pull at all.

An important caveat is that builds run via --aws-batch won't use the requested custom image because of nextstrain/cli#57.

@jstoja
Copy link

jstoja commented Apr 4, 2020

Hello @huddlej @tsibley,

I've just tested this PR's Docker Image with the the current nextstrain/cli master commit (1.16.2-6-g8c72b6b) and the zika-tutorial build (I've tried with the ncov one, but I'm missing credentials it seems).
It seemed to work as expected (I was able to run the build command, but also view the build in my browser). Although, it complained about narratives, but I think it is expected.

Please let me know if you have other things you need to test.

Thanks

@huddlej
Copy link
Contributor Author

huddlej commented Apr 4, 2020

Thank you for testing this out, @jstoja! Would you be willing to try the same type of test with our mumps build?

We think this mumps build will not work with the image defined in this PR, but it would be helpful to get confirmation about how this build is failing (assuming it does so with the example data).

@jstoja
Copy link

jstoja commented Apr 4, 2020

No problem, I'll test this one too.

Small additional comment, I can see that in the cli repository, TravisCI is using Ubuntu and this Dockerfile aims to use Debian instead. Maybe the same should be used to avoid any future issues. Ubuntu also provides slim images, so something very similar could be done.

@tsibley
Copy link
Member

tsibley commented Apr 7, 2020

Small additional comment, I can see that in the cli repository, TravisCI is using Ubuntu and this Dockerfile aims to use Debian instead. Maybe the same should be used to avoid any future issues. Ubuntu also provides slim images, so something very similar could be done.

Right, the CI tests in nextstrain/cli run in Ubuntu (e.g. the commands inside .travis.yml), but when the example build using Docker is run, it'll run inside the container, which would be Debian with this PR. I don't think this is an issue for testing, and generally we do expect that the container OS won't match the host OS. Are there particular future issues you think might crop up?

@jstoja
Copy link

jstoja commented Apr 8, 2020

The only issues that I might see is the dynamic libraries being different, it might cause issues. Albeit this should be VERY rare, so probably not a huge issue.

@jstoja
Copy link

jstoja commented Apr 9, 2020

We think this mumps build will not work with the image defined in this PR, but it would be helpful to get confirmation about how this build is failing (assuming it does so with the example data).

So I've been testing this a bit more and what fails currently is an augur filter job that complains about group-by categories.
I assume this is related to the fact that I don't have a populated fauna and therefore the fasta file is empty.

I'll try to set this up and continue the testing.

@tsibley
Copy link
Member

tsibley commented Jan 15, 2021

Rebased on top of the latest master so I could use this image for some ad-hoc builds.

@huddlej I'd still love to pick a week to make this the default image (by merging it) and be ready to quickly turnaround hot fixes if/when issues arise. We could switch the ncov build to it earlier than that, I suppose, to try to shake out any issues before pushing it out to external users.

@huddlej
Copy link
Contributor Author

huddlej commented Jan 16, 2021

Sounds great, @tsibley! Any time Wednesday-Friday of next week should work.

@tsibley tsibley force-pushed the python-base branch 2 times, most recently from 6850d98 to 71fc8bd Compare January 20, 2021 07:10
tsibley added a commit to nextstrain/ncov-ingest that referenced this pull request Jan 22, 2021
This helps us test the new image internally before making it the
default.¹

¹ nextstrain/docker-base#21
@tsibley
Copy link
Member

tsibley commented Jan 22, 2021

@huddlej

Any time Wednesday-Friday of next week should work.

This week was unexpectedly scattered for me due to childcare, so let's plan on the coming week? My thought is to:

and then watch the regular builds next week on

  • Thursday, and
  • Friday

If we don't see any problems, then

  • Merge this PR on Friday/Monday to make it the latest default base image for everyone.
  • git revert 1e88434 a456f26 54ae8da de8f8a8 so ncov tracks latest image again.

It would also still be good to follow up with:

docker-base/Dockerfile

Lines 96 to 101 in 71fc8bd

# Install additional "full" dependencies for augur.
# TODO: these versions should be updated in augur's setup.py to work with the
# pip install nextstrain-augur[full] approach.
RUN pip3 install cvxopt==1.2.4
RUN pip3 install matplotlib==2.2.2
RUN pip3 install seaborn==0.9.0

The current state of Augur has slightly different versions:

https://github.com/nextstrain/augur/blob/867c5e368e5f621528039fe5d417c85e9e66c0f0/setup.py#L59-L63

I don't think I have enough recent context here to know if bumping the Augur deps and cutting a release is all we're talking about, or if there's more to it than that. Can you weigh in?

Alpine Linux uses musl libc instead of glibc which means it cannot use most
pre-built Python wheels for glibc-based Linux distributions [1]. As a result, Python
packages need to be built from source, leading to longer build times and
increased complexity in the Dockerfile for more complex packages. For complex
packages like opencv-python, building from source is especially non-trivial.

This commit ports the existing Alpine-based Dockerfile to a minimal Debian-based
file with pre-installed Python 3 (the official Python Docker image) and removes
dependencies that were originally required to build Python packages from
sources (e.g., cvxopt) where installation from a wheel is now possible.

This commit also removes all explicit references to Python 2 dependencies, as
fauna is compatible enough with Python 3 for downloading to work and sacra was
previously removed.

[1] docker-library/docs#904
@tsibley
Copy link
Member

tsibley commented Feb 1, 2021

(Rebased onto latest master to fix merge conflict before merge.)

@huddlej
Copy link
Contributor Author

huddlej commented Feb 1, 2021

Thank you for pushing this forward with a great plan, @tsibley! The only notable version difference in this dependencies is the cvxopt version. I've tested cvxopt with versions 1.2.X and actually use versions 1.X in the Augur bioconda recipe. I can update the Augur setup.py to reflect this.

Is there a Docker-specific reason why we don't just run pip install .[full] when we install Augur in the image instead of installing each of these dependencies separately? I'd love to get as close as possible to one (or two?) sources of truth for dependencies, so we don't miss updating places like this. If there is a slight extra cost in Docker layer size/turnover from the full pip install, that might be worth the reduced complexity.

@tsibley
Copy link
Member

tsibley commented Feb 2, 2021

I've tested cvxopt with versions 1.2.X and actually use versions 1.X in the Augur bioconda recipe. I can update the Augur setup.py to reflect this.

Excellent!

Is there a Docker-specific reason why we don't just run pip install .[full] when we install Augur in the image instead of installing each of these dependencies separately? I'd love to get as close as possible to one (or two?) sources of truth for dependencies, so we don't miss updating places like this. If there is a slight extra cost in Docker layer size/turnover from the full pip install, that might be worth the reduced complexity.

With the update to Augur's setup.py and this PR for the new base image, I think we should definitely just install .[full].

That original reason Augur's deps were explicitly denormalized into the Dockerfile here was to avoid lengthy image rebuilds. That was done because several of the deps didn't have wheels for Alpine, so had to be compiled from source. With the new image base, we should be able to rely on wheels, so installing the deps every time won't be much of a burden and worth the simplicity of having a single source of truth.

@huddlej
Copy link
Contributor Author

huddlej commented Feb 2, 2021

That was done because several of the deps didn't have wheels for Alpine

Perfect! It's all coming full-circle in my mind. :)

@tsibley tsibley merged commit 65ac391 into master Feb 5, 2021
@tsibley tsibley deleted the python-base branch February 5, 2021 19:00
@tsibley
Copy link
Member

tsibley commented Feb 5, 2021

Making the switch to augur[full] in #30.

Comment on lines 103 to +109
# Install Node deps, build Auspice, and link it into the global search path. A
# fresh install is only ~40 seconds, so we're not worrying about caching these
# as we did the Python deps. Building auspice means we can run it without
# hot-reloading, which is time-consuming and generally unnecessary in the
# container image. Linking is equivalent to an editable Python install and
# used for the same reasons described above.
RUN cd /nextstrain/auspice && npm install && npm run build && npm link

RUN cd /nextstrain/auspice && npm update && npm install && npm run build && npm link
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huddlej I was trying to answer "Why do we run npm update?" (when working on #165) and traced its introduction back to here. I assume it's unlikely, but any recollection why you added it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsibley I'm sorry I don't remember. Since I know very little about the npm ecosystem, it seems likely that I saw this pattern somewhere during work on this PR and copied it while trying to debug build issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries!

tsibley added a commit that referenced this pull request Jun 23, 2023
This was introduced incidentally, and seemingly unintentionally¹, in
"Replace Alpine-based image with a Debian-based image" (ce07dad), but no
one noticed.

`npm update` is unwarranted because it's intended for maintainers of a
package, not downstream user installs, which is the role we have here.
Its effect was to bump the minimum versions in Auspice's package.json to
the latest available (while still respecting SemVer constraints) and
then install all of Auspice's deps into node_modules/.²  That's
unnecessary because we then run `npm install`, which unlike `npm
update`, also runs pre/post-installation steps Auspice includes.

¹ <#21 (comment)>

² Notably, `npm update` was added when the image had npm v5, which
  updates both package.json and package-lock.json.  We currently have npm
  v6, which does the same.  Subsequent versions, e.g. npm v8, stopped
  updating package.json and only update package-lock.json.
tsibley added a commit that referenced this pull request Jun 23, 2023
This was introduced incidentally, and seemingly unintentionally¹, in
"Replace Alpine-based image with a Debian-based image" (ce07dad), but no
one noticed.

`npm update` is unwarranted because it's intended for maintainers of a
package, not downstream user installs, which is the role we have here.
Its effect was to bump the minimum versions in Auspice's package.json to
the latest available (while still respecting SemVer constraints) and
then install all of Auspice's deps into node_modules/.²  That's
unnecessary because we then run `npm install`, which unlike `npm
update`, also runs pre/post-installation steps Auspice includes.

¹ <#21 (comment)>

² Notably, `npm update` was added when the image had npm v5, which
  updates both package.json and package-lock.json.  We currently have npm
  v6, which does the same.  Subsequent versions, e.g. npm v8, stopped
  updating package.json and only update package-lock.json.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants