Add asynchronous downloading to speed up signed URL downloading. #4266

Merged: 12 commits, Oct 8, 2024

@jonathanmetzman (Collaborator) commented Sep 20, 2024

The speed ups are up to 20x on my own machine.
It's even faster with one core than 16 parallel processes.
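
For illustration, here is a minimal sketch of why a single event loop can outpace a pool of processes on I/O-bound downloads: all requests wait on the network at the same time instead of one per worker. This is not the code from this PR. The names `fake_download`, `download_all`, and `run_demo` are hypothetical, the URLs are placeholders, and the network call is simulated with `asyncio.sleep` so the example runs with the standard library only.

```python
import asyncio
import time


async def fake_download(url: str) -> bytes:
    """Hypothetical stand-in for a signed URL download (no real I/O)."""
    await asyncio.sleep(0.05)  # Simulated network latency.
    return url.encode()


async def download_all(urls, fetch=fake_download, max_concurrency=100):
    """Runs fetch(url) for every URL concurrently on one event loop."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url):
        # Cap the number of in-flight downloads.
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded_fetch(url) for url in urls))


def run_demo(count=200):
    """Downloads `count` placeholder URLs and reports the elapsed time."""
    urls = [f'https://example.invalid/object/{i}' for i in range(count)]
    start = time.monotonic()
    results = asyncio.run(download_all(urls))
    elapsed = time.monotonic() - start
    return results, elapsed


if __name__ == '__main__':
    results, elapsed = run_demo()
    print(f'{len(results)} downloads in {elapsed:.2f}s')
```

Run sequentially, 200 simulated downloads at 0.05 seconds each would need about 10 seconds. With up to 100 in flight at once, the same work finishes in a small fraction of that on one core. A real implementation would swap `fake_download` for an actual asynchronous HTTP client call.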

@jonathanmetzman (Collaborator, Author)

TODO: Do this for other http methods.

@vitorguidi (Collaborator)

> The speed ups are up to 20x on my own machine. It's even faster with one core than 16 parallel processes.

Do we have any idea of how long the downloading step takes from past profiling data, on fuzz task? (back when those profiling images were around)

],
"index": "pypi",
"markers": "python_version >= '3.8'",
"version": "==3.10.5"
Collaborator

Are these dep bumps supported on our platforms?

If not, there should be complaints from `print(f'Did not find package for platform: {pip_platform}')` during butler deploy.

Collaborator

Might also need to rebuild Docker, since aiohttp is a new dep with C extensions to access async syscalls.

Collaborator Author

I'm pretty sure we're not using the C extensions.

Collaborator Author

But I'll know what to look out for if I see a problem.

@jonathanmetzman (Collaborator, Author)

> > The speed ups are up to 20x on my own machine. It's even faster with one core than 16 parallel processes.
>
> Do we have any idea of how long the downloading step takes from past profiling data, on fuzz task? (back when those profiling images were around)

We got rid of profiling before we switched to this new method of downloading, but seeing as the new method can take up to 15 minutes, I think the savings will be substantial.

Resolved review threads: src/Pipfile (1, outdated); src/clusterfuzz/_internal/system/fast_http.py (6, of which 4 outdated).
@jonathanmetzman (Collaborator, Author)

Please take another look.

@jonathanmetzman jonathanmetzman merged commit 2ca47ed into master Oct 8, 2024
7 checks passed
@jonathanmetzman jonathanmetzman deleted the async2 branch October 8, 2024 17:59
jonathanmetzman added a commit that referenced this pull request Oct 8, 2024
…ng. (#4266)"

This reverts commit 2ca47ed.

We should test this first.
jonathanmetzman added a commit that referenced this pull request Oct 8, 2024
#4302)

…ng. (#4266)"

This reverts commit 2ca47ed.

We should test this first.

3 participants