Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize the Dockerfile to reduce the size of the image. (50.76 GB to 19.04 GB) #1976

Merged
merged 6 commits into from
Feb 18, 2024
Merged

Conversation

jim60105
Copy link
Contributor

@jim60105 jim60105 commented Feb 18, 2024

2024-02-18-085115

The current Dockerfile is excessively large, containing unnecessary files within the base image.
By substituting the base image with python:3.10-slim and implementing certain multi-stage techniques, I managed to decrease the image size.

The image has been installed with these package versions

  • torch==2.1.2
  • torchvision==0.16.2
  • xformers==0.0.23.post1
  • tensorflow[and-cuda]==2.14.0

I've tested with LoRA training with fp8 base option + xformers and works well!

Additionally, I made an attempt to write a CI image build; however, the size of the build still exceeds the capacity of GitHub's free runners, so I revert the commit. If GitHub increases runner disk size in the future or if you wish to use a Self-hosted runner, you can get the commit back.

Too big to run on GitHub free runner.
Error: no space left on device

This reverts commit fc029d7.

Signed-off-by: 陳鈞 <[email protected]>
…:3.10-slim

- Change the base image from `python:3.10` to `python:3.10-slim`
- Expose ports `7860` and `6006` in the Dockerfile

Signed-off-by: 陳鈞 <[email protected]>
- Switch to a slim variant of the Python 3.10 Docker image for the build stage.
- Introduce a conditional installation of `pillow-simd` replacing `pillow`, specific to x86 architecture.
- Add required dependencies for `pillow-simd` installation, along with cleanup commands to remove unnecessary package lists after installation.
- Update runtime dependencies by adding `libtcl8.6` and `libtk8.6` to the final Docker image.

Signed-off-by: 陳鈞 <[email protected]>
@bmaltais bmaltais changed the base branch from master to dev February 18, 2024 15:33
@bmaltais
Copy link
Owner

Look like a great commit... I hope it will not cause issues for current users... but I will accept it. I don't use docker so I can't really test it... fingers crossed.

@bmaltais bmaltais merged commit 78e2df1 into bmaltais:dev Feb 18, 2024
1 check failed
@jim60105
Copy link
Contributor Author

I think there is no breaking changes in docker side. git pull and then docker compose down && docker compose up -build -d it will works.

I have taken care of the file permission in the volume, user id are the same as the original image, so there is no need to make any manual modifications.

@bmaltais
Copy link
Owner

bmaltais commented Mar 2, 2024

@jim60105 I am seriously reqorking how kohya gui is structured. I have gotten rid of all sd-scripts file copy in my repo and I am now cloning a copy of kohya sd-scripts in the repo to run his sode... no more maintenance of his code in mine.

BUT, docker does not execure the setup_linux.py code that would clone his repo... I hope you can add the missing glue to the docker file to make it properly clone a copy of the appropriate sd-scripts release during the container build.

The refactored code is in the dev-pure branch.

Looking forward to a PR

@jim60105
Copy link
Contributor Author

jim60105 commented Mar 2, 2024

I can handle this matter.

However, have you considered using git submodule to handle this situation?
I saw that you specified the version number of kohya-ss/sd-scripts in .sd-scripts-release.
In my opinion, git submodule is a proper way to include another repository in a git repository.
Additionally, you can specify the commit of the submodule as per your requirements.

If you use the git submodule solution, users will get the submodule during the git clone process, and there is no need to modify dockerfile for this purpose.
(Please note that there is a step to copy the entire working directory during docker building.)

@bmaltais
Copy link
Owner

bmaltais commented Mar 3, 2024

Hi, I have but submodules add a level of command complexity for users when cloning and updating the main repo.

To fix that more coding and handling need to be done using code similar to what I have to use to handle but with different syntax.

overall I prefer to keep the user interaction with git as simple as possible and manage the cloning of the sub repo in case… unless you can convince doing submodules can be simpler for users and code update of as-script perspective.

My past experience with submodules has been hell and I have avoided it ever since.

@jim60105
Copy link
Contributor Author

jim60105 commented Mar 3, 2024

The "a level of command complexity for users" is simply need to do git clone with --recursive.
From as-script perspective, executing git submodule update --init --recursive in the root of the working directory is much simpler compared to manually cd then git clone or git checkout.
You're reinventing git submodule.

I think adhering to standard practices is highly crucial for open source projects.
This ensures that experienced developers can quickly adapt to the situation without any surprises.

In my experience, git submodules have not caused any particular problems for my open source projects or professional projects. It is assumed that the project manager is familiar with git and knows how to operate submodules.

I usually write this in the GitHub README, for your reference:

Important

Clone the Git repository recursively to include submodules:
git clone --recursive https://github.com/bmaltais/kohya_ss.git

@bmaltais
Copy link
Owner

bmaltais commented Mar 3, 2024

@jim60105 Let's mode the discussion to #2029

I will try using the submodule and evaluate the impact vs using a clone approach to a .gitignore folder

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants