Description
There are many scenarios in which a user may clear running TaskInstances. The current behavior is to set the TaskInstance state to SHUTDOWN and then send SIGTERM to the task, which causes it to fail and its on_failure_callback to be called. This can be noisy. It usually makes more sense to silently clear and terminate the running instance, then retry. Here are some example scenarios:
A long running task needs to pick up a code change. The user makes the change and clears the task. The user probably doesn't want on_failure_callback to be called when he clears it. He just wants the task to be restarted with his code change, gracefully.
A task failed for some external reason. The user fixed the underlying issue and cleared the failed task to retry. Soon after clearing it, he realizes that the fix is not good enough, so he introduces another fix and clears the task again while it's still running. This kills the task and makes it fail. A better behavior is to silently kill the task and retry gracefully.
The point I'm making is that clearing a running TaskInstance is different from marking a running TaskInstance failed. At the moment, both operations do the same thing: the TaskInstance is first set to SHUTDOWN and then FAILED.
One suggestion is to introduce a new State called CLEARED_WHEN_RUNNING. As the name suggests, a TaskInstance should be set to this state when it's cleared while running. Most of the places can handle this state the same way SHUTDOWN is handled, except in TaskInstance.is_eligible_to_retry, where it should always be treated as eligible for retry.
Are you willing to submit a PR?
Yes!
closes: #16680
This PR makes sure that when a user clears a running task, the task does not fail. Instead it is killed and retried gracefully.
This is done by introducing a new State called RESTARTING. As the name suggests, a TaskInstance is set to this state when it's cleared while running. Most places handle RESTARTING the same way SHUTDOWN is handled, except in TaskInstance.is_eligible_to_retry, where it is always treated as eligible for retry.
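To make the intended behavior concrete, here is a minimal standalone sketch of the retry decision. It does not import Airflow and is not the code in this PR: `TaskInstanceSketch`, `handle_termination`, and the simplified `State` enum are hypothetical stand-ins, and the non-RESTARTING retry check is an assumption about how retries are normally counted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class State(str, Enum):
    # Simplified, hypothetical subset of task states used for illustration only.
    RUNNING = "running"
    SHUTDOWN = "shutdown"
    RESTARTING = "restarting"
    UP_FOR_RETRY = "up_for_retry"
    FAILED = "failed"


@dataclass
class TaskInstanceSketch:
    # Hypothetical stand-in for a TaskInstance; field names are assumptions.
    state: State
    retries: int = 0
    try_number: int = 1
    max_tries: int = 0

    def is_eligible_to_retry(self) -> bool:
        # A task cleared while running is always retried,
        # regardless of how many retries remain.
        if self.state == State.RESTARTING:
            return True
        # Assumed regular rule: retries configured and not yet exhausted.
        return bool(self.retries) and self.try_number <= self.max_tries

    def handle_termination(
        self, on_failure_callback: Optional[Callable[["TaskInstanceSketch"], None]] = None
    ) -> State:
        # Called after the running process has been terminated.
        if self.is_eligible_to_retry():
            # Retry quietly: the failure callback is not invoked.
            self.state = State.UP_FOR_RETRY
        else:
            self.state = State.FAILED
            if on_failure_callback is not None:
                on_failure_callback(self)
        return self.state
```

In this sketch, an instance in RESTARTING with no retries left still ends up in UP_FOR_RETRY and never triggers the callback, while the same instance in SHUTDOWN ends up FAILED and triggers it once.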