Parallelize pull_images command #12423

Closed

mschwager
Contributor

I noticed that running pull_images takes quite a while on first run. This PR parallelizes the pull_images command in the same way as download_corpora.
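As a rough illustration of the idea (not the code in this PR), parallel pulls can be sketched with a thread pool. The function names, the `max_workers` default, and the injectable `pull` parameter are all assumptions made for this example; only the `docker pull` CLI invocation itself is real.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def docker_pull(image):
    """Pull a single image with the docker CLI; True if it exited with 0."""
    return subprocess.call(['docker', 'pull', image]) == 0


def pull_images_parallel(images, pull=docker_pull, max_workers=4):
    """Pull every image concurrently (hypothetical helper).

    `pull` is injectable so the function can be exercised without docker.
    Returns True only if every individual pull reported success.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the pulls in worker threads and preserves input order.
        results = list(pool.map(pull, images))
    return all(results)
```

Threads are enough here because each worker spends its time waiting on an external `docker` process rather than running Python code.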

@jonathanmetzman
Contributor

> I noticed that running pull_images takes quite a while on first run. This PR parallelizes the pull_images command in the same way as download_corpora.

Can you do a test on how much this improves things? And maybe hide this behind a flag?

I did what I think is a slightly better version of this for pushing (though that's unmeasured), which takes dependencies into account (e.g. it doesn't push base-image and base-clang at the same time, since base-clang inherits from base-image).
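One way to picture that dependency-aware ordering is to group images into levels, where everything in a level can be handled in parallel because its parent was handled in an earlier level. This is only a sketch of the general idea, not the pushing code referred to above; `dependency_levels` and the `parents` mapping are hypothetical, and only the base-clang to base-image relationship comes from this thread.

```python
def dependency_levels(parents):
    """Group images into levels that are safe to process in parallel.

    `parents` maps each image name to the image it inherits from, or to
    None if it has no parent within the set (hypothetical input format).
    """
    remaining = dict(parents)
    done = set()
    levels = []
    while remaining:
        # An image is ready once its parent has been handled in an earlier level.
        ready = sorted(image for image, parent in remaining.items()
                       if parent is None or parent in done)
        if not ready:
            raise ValueError('cycle or unknown parent in image dependencies')
        levels.append(ready)
        done.update(ready)
        for image in ready:
            del remaining[image]
    return levels


# Illustrative mapping; the 'example-*' entries are made up.
EXAMPLE_PARENTS = {
    'base-image': None,
    'base-clang': 'base-image',
    'example-a': 'base-clang',
    'example-b': 'base-image',
}
```

Each level could then be handed to a thread pool in turn, so that base-image finishes before base-clang starts.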

@mschwager
Contributor Author

Hey! Sorry, I won't have time to look back into this for at least another couple months.

The runtime improvement wasn't great, because docker pull mostly saturates the network anyway. There might be slight gains when one docker pull thread is unpacking an already downloaded image while another thread is actively downloading one. It was definitely a bit quicker on my machine, maybe 10-20%, but nothing substantial. And this is for an infrequently run command.

Either way, feel free to close this if it's not worth it! 👍

@oliverchang oliverchang closed this Oct 9, 2024