Size of gcr.io/kubeflow/tensorflow-notebook-* #37

flx42 · 2017-12-18T01:14:25Z

From the README:

We also ship standard docker images that you can use for training Tensorflow models with Jupyter.
gcr.io/kubeflow/tensorflow-notebook-cpu
gcr.io/kubeflow/tensorflow-notebook-gpu
[...] Note that GPU-based image is several gigabytes in size and may take a few minutes to localize.

("localize"?)

They are both large:

$ docker images gcr.io/kubeflow/tensorflow-notebook-gpu:latest
REPOSITORY                                TAG                 IMAGE ID            CREATED             SIZE
gcr.io/kubeflow/tensorflow-notebook-gpu   latest              e68d36c67064        2 weeks ago         7.11GB

$ docker images gcr.io/kubeflow/tensorflow-notebook-cpu:latest
REPOSITORY                                TAG                 IMAGE ID            CREATED             SIZE
gcr.io/kubeflow/tensorflow-notebook-cpu   latest              9cb2a6008740        2 weeks ago         5.17GB
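
For scripting this comparison, here is a minimal sketch that is not from the thread. `docker image inspect --format '{{.Size}}'` prints an image's size in bytes. The helpers `bytes_to_gb` and `image_size_gb` are hypothetical names that convert that value to the decimal gigabytes shown by `docker images`. The sketch assumes the image has already been pulled locally.

```shell
# Hypothetical helper: convert a byte count to decimal gigabytes (1 GB = 10^9 bytes).
bytes_to_gb() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1e9 }'
}

# Hypothetical helper: print the local size of an image in GB.
# Assumes the image has been pulled and the docker CLI is on PATH.
image_size_gb() {
  bytes_to_gb "$(docker image inspect --format '{{.Size}}' "$1")"
}
```

Running `image_size_gb gcr.io/kubeflow/tensorflow-notebook-gpu:latest` should print a value close to the SIZE column above.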

Are the Dockerfiles public for these images? I can probably do a quick PR to improve the size.

You might be interested in the improvements I made to the devel-gpu Dockerfile for TensorFlow:
tensorflow/tensorflow#15355

Also, it would be helpful if you could chime in on this RFE:
tensorflow/tensorflow#15284
Maybe we can have a single image with Jupyter+TensorFlow+TensorBoard? That would shrink the other TensorFlow images that are shipped today (e.g. gpu and devel-gpu).

pineking · 2017-12-18T02:17:23Z

I also think the size can be reduced. BTW, is it possible to push the images to Docker Hub instead of gcr.io?

jlewi · 2017-12-18T14:36:15Z

@vishh Are we just using the TensorFlow Docker images? I don't see any Dockerfiles for these notebook images inside google/kubeflow.

flx42 · 2017-12-18T16:00:00Z

No, it's not the same. If you run docker history --no-trunc gcr.io/kubeflow/tensorflow-notebook-gpu, you can see it differs from both https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/docker/Dockerfile.devel-gpu and https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/docker/Dockerfile.gpu
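
The `docker history` inspection can also be used to see which layers account for most of an image's size. The following is a rough sketch rather than anything posted in the thread. `rank_layers` and `largest_layers` are hypothetical helper names, and the sketch assumes `docker history` prints human-readable sizes in decimal units (B, kB, MB, GB), which is its default behaviour.

```shell
# Hypothetical helper: read "SIZE COMMAND" lines on stdin and print the
# N largest (default 10), biggest first. SIZE uses Docker's decimal units.
rank_layers() {
  awk '{
    n = $1 + 0                      # numeric prefix of the size, e.g. 1.2
    u = $1; gsub(/[0-9.]/, "", u)   # unit suffix, e.g. GB
    if (u == "GB") f = 1e9
    else if (u == "MB") f = 1e6
    else if (u == "kB") f = 1e3
    else f = 1                      # plain bytes ("B") or unknown
    printf "%.0f\t%s\n", n * f, $0  # prefix each line with its size in bytes
  }' | sort -rn | cut -f2- | head -n "${1:-10}"
}

# Hypothetical wrapper: list the largest layers of a locally available image.
# Assumes the docker CLI is on PATH and the image has been pulled.
largest_layers() {
  docker history --no-trunc --format '{{.Size}} {{.CreatedBy}}' "$1" \
    | rank_layers "${2:-10}"
}
```

For example, `largest_layers gcr.io/kubeflow/tensorflow-notebook-gpu 5` would print the five biggest layers together with the commands that created them, which is a reasonable starting point for deciding what to trim.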

jlewi · 2017-12-18T17:21:18Z

We'd like to support a bunch of different frameworks e.g.

TensorFlow
xgboost
scikits

Some questions:

Should we provide one fat image with all these libraries or have multiple images?
Are there existing, curated images that we can reuse as opposed to building our own?

aronchick · 2017-12-18T17:25:31Z

Was there something in particular that was an issue with GCR.io?

yuvipanda · 2017-12-18T18:04:26Z

@aronchick afaict gcr.io doesn't provide a human friendly URL to pass to people, which I've always found annoying for public images.

@jlewi there's some at http://github.com/jupyter/docker-stacks/ (and PRs welcome!) that do get a fair amount of usage.

flx42 · 2017-12-18T18:25:33Z

I think it makes sense to have one "fat" image if it allows us to keep the other images lean.
This image could target being a development environment for data scientists: Jupyter, TensorFlow, TensorBoard and the usual Python dependencies.

jlewi · 2017-12-19T23:45:32Z

This is the source for our existing Docker images
https://github.com/GoogleCloudPlatform/container-engine-accelerators/tree/master/example/tensorflow-notebook-image

So everything is public and we should probably move them into Kubeflow.

Using an NVIDIA image as the base for our GPU images makes sense to me.

/cc @flx42

flx42 · 2017-12-20T00:33:02Z

OK, let's discuss the size again once it's in this repo.
But I still believe it makes sense to check with the TensorFlow team whether a single common Jupyter image can be created.

Remove a lot of bloat - install only the minimal set of packages required to get started with ML. Any packages required can be installed by the user in the notebook itself using pip install/conda install
Image size has gone down from 12GB to 3GB for cpu image
Having a lot of packages makes it very challenging to maintain them because of version conflicts
Run everything as jovyan user - this enables user to run conda install / pip install without requiring sudo
Add comments on every step

Fixes #668
Fixes #37
Fixes #472

flx42 changed the title ~~Size~~ Size of gcr.io/kubeflow/tensorflow-notebook-* Dec 18, 2017

jlewi mentioned this issue Dec 20, 2017

Proposal: Official Jupyter Images for Kubeflow #52

Closed

ankushagarwal mentioned this issue Apr 19, 2018

Refactor tensorflow-notebook-image/Dockerfile #689

Merged

k8s-ci-robot closed this as completed in #689 Apr 20, 2018

kimwnasptd pushed a commit to arrikto/kubeflow that referenced this issue Mar 5, 2019

Setup permissions on project kf-feast. (kubeflow#37)

d69a5fd

* This project will be used by the folks at GoJek and Google PSO to develop and test feast. Related to kubeflow/testing#254

yanniszark pushed a commit to arrikto/kubeflow that referenced this issue Feb 15, 2021

Dockerfile: Use alpine as base image (kubeflow#37)

20ec6a0

Signed-off-by: Ce Gao <[email protected]>
