Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream reset on retry #284

Closed
fboismenu opened this issue Oct 26, 2021 · 1 comment · Fixed by #294
Closed

Stream reset on retry #284

fboismenu opened this issue Oct 26, 2021 · 1 comment · Fixed by #294
Assignees
Labels
api: storage Issues related to the googleapis/google-resumable-media-python API. priority: p2 Moderately-important priority. Fix may not be included in next release. status: investigating The issue is under investigation, which is determined to be non-trivial. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@fboismenu
Copy link

Hello there,

I am using the library, indirectly via python-storage, in maybe an unorthodox way: I am trying to implement a download timeout via a custom http adapter and response.
So my response object is throwing pro-actively timeouts (common.InvalidResponse with http.client.REQUEST_TIMEOUT),
which is forcibly triggering the retry logic in the Download class:

All goes well, but one thing: the final results content data from both the partial content of prematurely aborted request and the final and complete data of the second request.
It happens because both dataset are accumulated in the _stream above.

Do you agree it's bug and that on retry the _stream shall be resetted (e.g. seek(0))? Do I miss a use case where it makes sense to accumulated data, or we were just assuming data would be empty anyway on exception?

If I did not miss anything I'll submit a PR to fix it.

Many thanks in advance.
Fred

@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/google-resumable-media-python API. label Oct 26, 2021
fboismenu pushed a commit to fboismenu/google-resumable-media-python that referenced this issue Oct 26, 2021
fboismenu pushed a commit to fboismenu/google-resumable-media-python that referenced this issue Oct 26, 2021
Fixes Issue googleapis#284.

Reset Download _stream on retry to avoid accumulating data from
previous failed request.
@BenWhitehead BenWhitehead added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Oct 27, 2021
@andrewsg
Copy link
Contributor

Hi, thanks for your report.

I'm unclear on whether we can trust that _stream will always start a download at the zero position, or for that matter whether it will always be seekable in the first place. This issue hasn't come up for us before because, as you're probably aware given that you're working around it now, by default requests library timeouts only trigger before the first byte is sent, so therefore retries based on timeouts never trigger mid-stream unless they are overridden by a system like yours.

Before we can really verify that this change will be safe we'll need to update our retry conformance test suite internally, so we'll start looking at that soon. This might take a bit of time as we improve support for complex failure cases in our test emulator.

@cojenco cojenco added the status: investigating The issue is under investigation, which is determined to be non-trivial. label Nov 9, 2021
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Feb 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the googleapis/google-resumable-media-python API. priority: p2 Moderately-important priority. Fix may not be included in next release. status: investigating The issue is under investigation, which is determined to be non-trivial. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
5 participants