Feature request: s3_client.download_file* multipart download #131
Comments
download_file and download_fileobj pretty much read 4096 bytes, write them to the file, then read another 4096 bytes. I'm pretty sure S3 doesn't support multipart downloads, but correct me if I'm wrong. Does this do what you want?
I believe so; the docs list "Uploading/downloading a file in parallel" as a feature:
OK, have a go with download_file/download_fileobj and see if it does what you need.
Having done some reading, and looking at some work done by @thehesiod, it looks like in its current state get_object does indeed download the entire file, as it attempts to verify MD5 sums. We can improve download_file/download_fileobj by being more like s3transfer and using get_object's Range option to download multiple parts of the file. I will look into this in the coming week or so.
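To make the ranged get_object idea concrete, here is a minimal sketch of how an object could be split into byte ranges and fetched concurrently. It is not the s3transfer or aiobotocore implementation; `byte_ranges`, `ranged_download`, and the `fetch_range` callable are hypothetical names introduced for illustration, and the object size is assumed to be known up front (for example from a head_object call).

```python
import asyncio


def byte_ranges(size, part_size):
    """Split [0, size) into inclusive HTTP Range header values."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        f"bytes={start}-{min(start + part_size, size) - 1}"
        for start in range(0, size, part_size)
    ]


async def ranged_download(fetch_range, size, part_size, max_concurrency=4):
    """Fetch every range concurrently and join the parts in order.

    fetch_range is a hypothetical coroutine function that takes a Range
    header value such as "bytes=0-1023" and returns those bytes.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch(range_header):
        # Limit how many range requests are in flight at once.
        async with semaphore:
            return await fetch_range(range_header)

    # asyncio.gather returns results in the order the awaitables were given,
    # so the parts can be concatenated directly.
    parts = await asyncio.gather(
        *(fetch(r) for r in byte_ranges(size, part_size))
    )
    return b"".join(parts)
```

With aiobotocore, `fetch_range` would presumably wrap `client.get_object(Bucket=..., Key=..., Range=range_header)` and read the response body. This sketch holds every part in memory, so a real download_file would more likely write each part to its offset in the target file.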
Also, because AWS API calls are signed, I believe the only way to upload in parts would be to use multipart upload.
Yup, that's what I came to as well. I need to dig around in s3transfer, as the s3 upload_file/upload_fileobj and download_file/download_fileobj methods have some logic in them to choose between put_object and multipart upload, as well as between a single get_object and multiple get_object calls with Range.
By the way, a big issue we have with multipart uploads is that the ETag becomes rather useless unless you record the chunk sizes in the metadata.
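The ETag point can be illustrated with a short sketch. It assumes the commonly observed (but, as far as I know, not contractually guaranteed) behaviour that a multipart ETag is the MD5 of the concatenated per-part MD5 digests, followed by a dash and the part count; this may not hold for objects encrypted with SSE-KMS or SSE-C. `multipart_etag` is a hypothetical helper, not part of boto3 or aiobotocore.

```python
import hashlib


def multipart_etag(data, part_size):
    """Compute the ETag S3 is commonly observed to assign to a multipart upload.

    Assumption: ETag = md5(md5(part_1) + ... + md5(part_n)).hexdigest() + "-n".
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # Raw (binary) MD5 digest of each part, in upload order.
    digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"
```

Because the result depends on `part_size`, the same bytes uploaded with different chunk sizes get different ETags, which is why the chunk size has to be stored somewhere (such as the object metadata) for the ETag to be checkable later.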
@thehesiod I'll add that in when I come to redo this part.
Any updates or ETA for multipart downloads? Eagerly waiting for this one.
Sorry, I've been away for quite a while. No ETA on multipart downloads as of yet.
Thanks for your excellent library! The docs mention the following patches to s3transfer:
For performance reasons, it would be fantastic if the download_file* methods also did a custom multipart download, the same way the upload_file* methods do. Is this in the roadmap?