feat(job_attachments): enhance handling S3 timeout errors and BotoCoreError #206
Merged
+301
−8
gahyusuh force-pushed the gahyusuh/ja_s3_timeout branch from 52018e7 to 0f230bb on March 12, 2024 18:06
marofke approved these changes on Mar 13, 2024
Great work here recreating the error and making a much better UX from timeouts!
LGTM! 🚢 🦈
gahyusuh force-pushed the gahyusuh/ja_s3_timeout branch 2 times, most recently from a0eee74 to e4c309d on March 15, 2024 17:23
…eError

Improve error handling for S3 requests by
- adding "retries" configuration to the S3 client
- adding BotoCoreError handling to cover S3 timeout errors (e.g., ReadTimeoutError, ConnectTimeoutError)

Signed-off-by: Gahyun Suh <[email protected]>
gahyusuh force-pushed the gahyusuh/ja_s3_timeout branch from e4c309d to f083623 on March 18, 2024 13:58
amarsjac approved these changes on Mar 19, 2024
Merged
baxeaz pushed a commit that referenced this pull request on Mar 21, 2024

…eError (#206)

Improve error handling for S3 requests by
- adding "retries" configuration to the S3 client
- adding BotoCoreError handling to cover S3 timeout errors (e.g., ReadTimeoutError, ConnectTimeoutError)

Signed-off-by: Gahyun Suh <[email protected]>
Signed-off-by: Brian Axelson <[email protected]>
npmacl pushed a commit that referenced this pull request on Mar 21, 2024

…eError (#206)

Improve error handling for S3 requests by
- adding "retries" configuration to the S3 client
- adding BotoCoreError handling to cover S3 timeout errors (e.g., ReadTimeoutError, ConnectTimeoutError)

Signed-off-by: Gahyun Suh <[email protected]>
baxeaz added a commit that referenced this pull request on Mar 22, 2024

* Switch to running deadline_vfs as os_user
  Signed-off-by: Brian Axelson <[email protected]>
* feat(job_attachments): enhance handling S3 timeout errors and BotoCoreError (#206)
  Improve error handling for S3 requests by
  - adding "retries" configuration to the S3 client
  - adding BotoCoreError handling to cover S3 timeout errors (e.g., ReadTimeoutError, ConnectTimeoutError)
  Signed-off-by: Gahyun Suh <[email protected]>
  Signed-off-by: Brian Axelson <[email protected]>
* fix(job_attachments): Use files' last modification time to identify output files to be synced (#211)
  Signed-off-by: Gahyun Suh <[email protected]>
  Signed-off-by: Brian Axelson <[email protected]>
* chore(deps): update python-semantic-release requirement (#216)
  Updates the requirements on [python-semantic-release](https://github.com/python-semantic-release/python-semantic-release) to permit the latest version.
  - [Release notes](https://github.com/python-semantic-release/python-semantic-release/releases)
  - [Changelog](https://github.com/python-semantic-release/python-semantic-release/blob/master/CHANGELOG.md)
  - [Commits](python-semantic-release/python-semantic-release@v8.7.0...v9.2.2)
  ---
  updated-dependencies:
  - dependency-name: python-semantic-release
    dependency-type: direct:production
  ...
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Signed-off-by: Brian Axelson <[email protected]>
* chore(release): 0.41.0 (#217)
  Signed-off-by: client-software-ci <[email protected]>
  Signed-off-by: Brian Axelson <[email protected]>
* chore(deps): update coverage[toml] requirement from ~=7.2 to ~=7.4 (#156)
  Updates the requirements on [coverage[toml]](https://github.com/nedbat/coveragepy) to permit the latest version.
  - [Release notes](https://github.com/nedbat/coveragepy/releases)
  - [Changelog](https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst)
  - [Commits](nedbat/coveragepy@7.3.0...7.4.0)
  ---
  updated-dependencies:
  - dependency-name: coverage[toml]
    dependency-type: direct:production
  ...
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Signed-off-by: Brian Axelson <[email protected]>
* fix: Make StorageProfileOperatingSystemFamily enum case-insensitive
  Signed-off-by: Caden Marofke <[email protected]>
  Signed-off-by: Brian Axelson <[email protected]>
* ci: add gpg signing of build artifacts (#218)
  Signed-off-by: Charles Moore <[email protected]>
  Signed-off-by: Brian Axelson <[email protected]>
* feat!: prep for rootPathFormat becoming ALL UPPERS (#222)
  ** BREAKING CHANGE **
  * The PathFormat enum's values went from all lowercase to all uppercase
  * The source_path_root in the path mapping rules return value from sync_inputs went from all lowercase to all uppercase
  Signed-off-by: Morgan Epp <[email protected]>
  Signed-off-by: Brian Axelson <[email protected]>
* CR Feedback
  Signed-off-by: Brian Axelson <[email protected]>
* Cleaning up a few more 'executing the job' cases
  Signed-off-by: Brian Axelson <[email protected]>

---------

Signed-off-by: Brian Axelson <[email protected]>
Signed-off-by: Gahyun Suh <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: client-software-ci <[email protected]>
Signed-off-by: Caden Marofke <[email protected]>
Signed-off-by: Charles Moore <[email protected]>
Signed-off-by: Morgan Epp <[email protected]>
Co-authored-by: Gahyun Suh <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: client-software-ci <[email protected]>
Co-authored-by: Caden Marofke <[email protected]>
Co-authored-by: Charles Moore <[email protected]>
Co-authored-by: Morgan Epp <[email protected]>
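Improve error handling for S3 requests by adding "retries" configuration to the S3 client and adding BotoCoreError handling to cover S3 timeout errors (e.g., ReadTimeoutError, ConnectTimeoutError).
Although not related to the main changes here, this PR also makes an additional improvement: it updates the script upload_cancel_test.py to create large files in smaller chunks to prevent MemoryError.
What was the problem/requirement? (What/Why)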
A long, user-unfriendly stack trace was printed to the console during file uploads when a user tried to submit a job over a slow internet connection. (It was confirmed that the user could GET and DELETE objects in their bucket with their permissions, so this appears to be purely a bandwidth-related issue.) The reported timeout error was ReadTimeoutError, a botocore exception. While we were handling ClientError from botocore, we lacked handling for other errors such as ReadTimeoutError.
What was the solution? (How)
BotoCoreError: This botocore exception is a base class for many errors related to boto3 calls, which typically occur during the preparation phase of AWS service calls due to issues like network connection problems or authentication failures. I have added more user-friendly error messages to better communicate these issues to users.
What is the impact of this change?
More robust error handling for S3 requests, providing clear and concise error messages to users when S3 requests fail with BotoCoreError.
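The approach described in this PR could be sketched roughly as follows. This is a minimal illustration and not the code from the diff: `S3RequestError`, `run_s3_request`, `make_s3_client`, and the retry values in `S3_RETRIES` are hypothetical names and settings chosen for the example, since the description does not state the actual ones. Only `BotoCoreError`, `botocore.config.Config`, and `boto3.client` are real botocore/boto3 names.

```python
from typing import Callable, TypeVar

try:
    from botocore.exceptions import BotoCoreError
except ImportError:
    # Fallback only so this sketch still runs where botocore is not installed.
    class BotoCoreError(Exception):
        pass

T = TypeVar("T")

# Assumed values; the PR description does not state the actual retry settings.
S3_RETRIES = {"max_attempts": 10, "mode": "standard"}


class S3RequestError(Exception):
    """Hypothetical user-facing error raised when an S3 request fails."""


def make_s3_client():
    """Build an S3 client that uses the retry settings above (needs boto3)."""
    import boto3
    from botocore.config import Config

    return boto3.client("s3", config=Config(retries=S3_RETRIES))


def run_s3_request(request: Callable[[], T], action: str) -> T:
    """Run an S3 call and turn BotoCoreError into a short, readable message.

    ClientError does not derive from BotoCoreError, so it passes through
    this wrapper unchanged and can keep its own existing handling.
    """
    try:
        return request()
    except BotoCoreError as exc:
        # Covers ReadTimeoutError, ConnectTimeoutError, and similar errors.
        raise S3RequestError(
            f"An issue occurred while {action}: {exc} "
            "Please check your network connection and AWS credentials."
        ) from exc
```

A caller would wrap each request, for example `run_s3_request(lambda: client.head_object(Bucket=bucket, Key=key), "checking an object in S3")`, and print only the `S3RequestError` message instead of the full stack trace.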
How was this change tested?
Was this change documented?
No.
Is this a breaking change?
No.
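For the side change to upload_cancel_test.py mentioned in the description, the idea of creating a large file in smaller chunks can be sketched as below. This is an assumption about the general technique, not the script's actual code; `create_large_file` and its parameters are hypothetical names used for illustration.

```python
import os
import tempfile


def create_large_file(path: str, size_bytes: int, chunk_size: int = 1024 * 1024) -> None:
    """Write size_bytes of zero bytes to path, at most chunk_size at a time.

    Only one chunk is held in memory, so memory use stays bounded no matter
    how large the file is. Building the whole payload first (for example
    b"\\0" * size_bytes) can raise MemoryError for very large sizes.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunk = b"\0" * chunk_size
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])  # the last chunk may be shorter
            remaining -= n


# Example: a 3 MiB file written 1 MiB at a time.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "large_file.bin")
    create_large_file(demo_path, 3 * 1024 * 1024)
    demo_size = os.path.getsize(demo_path)
```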