There is a bug in the multi-part upload mechanism: the entire buffer is uploaded as a 'part' every time the file is written to. I think this is because the temporary file (`self.file`) is not truncated when `_flush_write_buffer()` is called.
Once the temporary buffer is larger than `AWS_S3_FILE_BUFFER_SIZE`, every subsequent write uploads the WHOLE file as a 'part'.
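A minimal sketch of the behaviour I'd expect from the flush (the function and parameter names here are illustrative stand-ins, not the actual django-storages code): after the buffered bytes are uploaded as a part, the buffer has to be emptied so the next part starts from scratch.

```python
import io

def flush_write_buffer(buffer_file, upload_part):
    """Illustrative flush: upload everything buffered so far as one part,
    then empty the buffer so the next part starts from scratch."""
    buffer_file.seek(0)
    upload_part(buffer_file.read())   # the whole buffer becomes one part
    buffer_file.seek(0)
    buffer_file.truncate()            # the step that appears to be missing

# Without the truncate() call, bytes from earlier parts stay in the
# buffer and get re-uploaded on every subsequent flush.
parts = []
buf = io.BytesIO()
buf.write(b"a" * 10)
flush_write_buffer(buf, parts.append)
buf.write(b"b" * 10)
flush_write_buffer(buf, parts.append)
print([len(p) for p in parts])        # [10, 10], not [10, 20]
```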
Here's an example:

```
$ pipenv run python demo.py
INFO:root:Size of 'test-file.txt' is 10485760 bytes. (10 megabytes)
INFO:root:Writing to <S3Boto3StorageFile: test-file.txt> in chunks of 1048576 bytes (1 megabyte).
INFO:root:Upload complete, checking size of file in S3 Bucket
INFO:root:Size of 's3://kx-tom-misc-test-bucket/test-file.txt' is 47185920 bytes. (45 megabytes)
```
I uploaded a 10 megabyte file in 1 megabyte chunks and ended up with a 45 megabyte file in S3.
Instead of 2 parts of 5 megabytes each, I got parts sized 5 + 6 + 7 + 8 + 9 + 10 == 45 megabytes.
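That arithmetic can be reproduced with a toy model of the buffering (all names here are illustrative, not the actual django-storages internals): a 5 MB buffer written in 1 MB chunks, flushed whole each time it is full, with and without the truncation.

```python
import io

PART_SIZE = 5 * 2 ** 20   # stand-in for AWS_S3_FILE_BUFFER_SIZE (5 MB)
CHUNK = 1 * 2 ** 20       # write in 1 MB chunks, as in the demo

class BufferModel:
    """Toy model of the suspected bug: when the temp file is not truncated
    after a part is uploaded, every later flush re-uploads the whole file."""

    def __init__(self, truncate_after_flush):
        self.file = io.BytesIO()                       # stands in for the temp file
        self.truncate_after_flush = truncate_after_flush
        self.part_sizes = []                           # bytes uploaded per 'part'

    def write(self, data):
        self.file.write(data)
        if self.file.tell() >= PART_SIZE:
            self._flush_write_buffer()

    def _flush_write_buffer(self):
        self.file.seek(0)
        self.part_sizes.append(len(self.file.read()))  # whole buffer becomes the part
        if self.truncate_after_flush:
            self.file.seek(0)
            self.file.truncate()                       # the suspected missing step
        else:
            self.file.seek(0, io.SEEK_END)             # old bytes stay in place (bug)

buggy, fixed = BufferModel(False), BufferModel(True)
for model in (buggy, fixed):
    for _ in range(10):                                # 10 MB total
        model.write(b"x" * CHUNK)

print([s // 2 ** 20 for s in buggy.part_sizes])        # [5, 6, 7, 8, 9, 10] -> 45 MB
print([s // 2 ** 20 for s in fixed.part_sizes])        # [5, 5] -> 10 MB
```

The buggy variant produces exactly the 5 + 6 + 7 + 8 + 9 + 10 == 45 sequence reported above; the truncating variant produces the expected two 5 MB parts.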
Code for demo at: https://gist.github.com/tveastman/9d15076da4f4f0646c9ce4b0006be616