-
Notifications
You must be signed in to change notification settings - Fork 998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cancel BigQuery job if block_until_done call times out or is interrupted #1699
Cancel BigQuery job if block_until_done call times out or is interrupted #1699
Conversation
Hi @codyjlin. Thanks for your PR. I'm waiting for a feast-dev member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Codecov Report
@@ Coverage Diff @@
## master #1699 +/- ##
==========================================
+ Coverage 83.32% 83.48% +0.16%
==========================================
Files 76 76
Lines 6794 6795 +1
==========================================
+ Hits 5661 5673 +12
+ Misses 1133 1122 -11
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
/kind bug |
if the notebook server/kernel crashes or is shutdown or is interrupted, that doesn't count as a keyboard interrupt right? |
ef1640d
to
4247379
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch about KeyboardInterrupt
🙌
LGTM
Why did you change everything ? I feel that the previous commit was much better ... 🤔 |
Signed-off-by: Cody Lin <[email protected]>
Signed-off-by: Cody Lin <[email protected]>
Signed-off-by: Cody Lin <[email protected]>
Signed-off-by: Cody Lin <[email protected]>
Signed-off-by: Cody Lin <[email protected]>
Signed-off-by: Cody Lin <[email protected]>
a475530
to
1e7d852
Compare
…el-bq-job-on-keyboard-interrupt
Signed-off-by: Cody Lin <[email protected]>
_wait_until_done(job_id=job_id) | ||
try: | ||
_wait_until_done(job_id=job_id) | ||
except KeyboardInterrupt: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering, would it make more sense to have this in a finally
block instead of a except KeyboardInterrupt
? that may also handle sigterm/sigkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. We could have a finally statement that send an asynchronous call to cancel the BQ job in case something goes wrong 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the finally
clause to catch other needs for cancellation (including timeout and KeyboardInterrupt
).
…ellation cases Signed-off-by: Cody Lin <[email protected]>
Signed-off-by: Cody Lin <[email protected]>
Signed-off-by: Cody Lin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. Thanks for putting the effort 🙏
@codyjlin just need to fix lint, and then we should be god to go! |
Signed-off-by: Cody Lin <[email protected]>
Ah sorry, should have ran lint locally first. Thanks @achals and @MattDelac for all the comments! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: achals, codyjlin, MattDelac The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…ted (feast-dev#1699) * Cancel job if to_bigquery is cancelled by user Signed-off-by: Cody Lin <[email protected]> * cancel job in _upload_entity_df_into_bq as well Signed-off-by: Cody Lin <[email protected]> * Fix _is_done logic? Signed-off-by: Cody Lin <[email protected]> * make cancel job code more readable Signed-off-by: Cody Lin <[email protected]> * move KeyboardInterrupt catch outside retry logic; fix retry logic Signed-off-by: Cody Lin <[email protected]> * make block_until_done public; add custom exception for BQJobStillRunning Signed-off-by: Cody Lin <[email protected]> * fix retry logic to catch specific exception Signed-off-by: Cody Lin <[email protected]> * Make retry params configurable; use finally clause to catch more cancellation cases Signed-off-by: Cody Lin <[email protected]> * Modify docstring Signed-off-by: Cody Lin <[email protected]> * Typo in docstring Signed-off-by: Cody Lin <[email protected]> * Fix lint Signed-off-by: Cody Lin <[email protected]> Signed-off-by: CS <[email protected]>
Signed-off-by: Cody Lin [email protected]
What this PR does / why we need it:
If a user decides to abort a
to_bigquery
run (like via actrl-c
), the job should also be cancelled. The PR also fixes the retry logic of getting the bq_job state, and raises a better exception on timeout.Which issue(s) this PR fixes:
(Not large enough for issue)
Does this PR introduce a user-facing change?: