Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit concurrency #329

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from
Draft

Limit concurrency #329

wants to merge 10 commits into from

Conversation

hayesgb
Copy link
Collaborator

@hayesgb hayesgb commented Jul 23, 2022

Picking up of #288

Tom Augspurger and others added 7 commits November 9, 2021 10:33
This adds a new keyword to AzureBlobFileSystem to limit the number of
concurrent connectiouns. See pangeo-forge/pangeo-forge-recipes#227 (comment)
for some motivation. In that situation, we had a single FileSystem
instance that was generating many concurrent write requests through
`.pipe`. So many, that we were seeing memory issues from creating all
the BlobClient connections simultaneously.

This adds an asyncio.Semaphore instance to the AzureBlobFilesSytem that
controls the number of concurrent BlobClient connections. The default of
None is backwards-compatible (no limit)
@TomAugspurger
Copy link
Contributor

Thanks for picking this up!

One thing I've learned about since starting this PR, azure-storage-blob already uses the max_concurrency keyword for something that's slightly different. The BlobClient.upload_blob and BlobClient.download_blob use max_concurrency to control the number of threads / concurrent tasks to run. It's similar, but slightly difference since it controls concurrency / parallelism on a single blob operation. This PR was targeting operations on many blobs.

We might want to consider a different keyword (max_clients? max_blob_clients?) so that we can pass through max_concurrency for parallelism on single-blob operations.

@hayesgb
Copy link
Collaborator Author

hayesgb commented Jul 25, 2022

Hey @TomAugspurger -- That sounds like a great idea.

Unit tests were passing in my local.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants