Per Task Retry-Policy #1791
… this patch, luigi is able to support defining retry-policy per task. The disable-num-failures config name is deprecated; it is retry_count now.
| disable_hard_timeout   | scheduler |
+------------------------+-----------+
| disable_window_seconds | scheduler |
+------------------------+-----------+
Actually we shouldn't have this information here as it's already somewhere else in the documentation.
I think we should include a part about what is in the retry policy and what can be used per task. What do you think?
Can you also add a test case with dynamic dependencies? It should be super easy, and it immediately shows if it's necessary to thread along the … Edit: Oh, I see that you actually have no tests that set up a task, set the configuration and then check that it behaves as expected. You should add this for sure. You basically just need to copy-paste your included example but make it into a test case. And while at it make one version with …
…mple. PerTaskRetryPolicyBehaviorTest has been added in worker_test. unittest.main() part is removed.
@Tarrasch, I added multiple test cases in
This is because of the order in which luigi adds tasks. Luigi adds a dependency task indirectly with its parent task first. So we need to send …
    def setUp(self):
        self.per_task_retry_count = 2
        self.default_retry_count = 1
        self.sch = Scheduler(retry_delay=0.1, retry_count=self.default_retry_count, prune_on_get_work=True, record_task_history=False)
You don't need to be so explicit and say record_task_history=False. But it's ok. :)
@javrasya, I still think …
Implementation without deps_retry_policy_dicts
…class is now extended from LuigiTestCase. DbTaskHistoryTest is fixed after the changes, including the addition of retry_policy, which is not an optional parameter for the scheduler Task.
And this is now finally in!! :)
Description
Defining retry-policy per task as it is defined in #1073
Motivation and Context
Different tasks may need different retry policies depending on their priority and resource usage. Suppose one task retrieves data from Hive, which responds slowly, and another retrieves data from an RDBMS, which responds quickly. I may not want the Hive task to retry many times and drain all my resources again and again, while retrying the RDBMS task many times may not be a problem for me. If I have a network issue for a while, I should be able to give the RDBMS task a higher retry count than the Hive task. So I need to be able to define the retry policy at task level.
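To make the idea concrete, here is a rough, standalone sketch of how a per-task value could fall back to the scheduler-wide default. It does not use luigi itself and is not the code in this patch: `RetryPolicy`, `SCHEDULER_DEFAULTS`, `resolve_retry_policy` and the task classes are hypothetical names for illustration, and the default numbers are made up. Only the field names (`retry_count`, `disable_hard_timeout`, `disable_window_seconds`) come from this PR.

```python
from collections import namedtuple

# Hypothetical container; the field names mirror the config names in this PR.
RetryPolicy = namedtuple(
    "RetryPolicy",
    ["retry_count", "disable_hard_timeout", "disable_window_seconds"],
)

# Illustrative scheduler-wide defaults (values are assumptions, not luigi's).
SCHEDULER_DEFAULTS = RetryPolicy(
    retry_count=1, disable_hard_timeout=3600, disable_window_seconds=3600
)


def resolve_retry_policy(task, defaults=SCHEDULER_DEFAULTS):
    """Build the effective policy: task attribute if set, else the default."""
    values = {}
    for field in RetryPolicy._fields:
        override = getattr(task, field, None)
        values[field] = getattr(defaults, field) if override is None else override
    return RetryPolicy(**values)


class HiveTask:
    # Slow and resource-hungry: do not retry many times.
    retry_count = 1


class RdbmsTask:
    # Fast and cheap: tolerate a flaky network with more retries.
    retry_count = 5


class PlainTask:
    # Defines nothing, so it gets the scheduler defaults.
    pass


hive_policy = resolve_retry_policy(HiveTask())
rdbms_policy = resolve_retry_policy(RdbmsTask())
plain_policy = resolve_retry_policy(PlainTask())
```

With this shape, a task only declares the fields it wants to override, and everything else keeps following the scheduler configuration.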
This fixes the issue defined in #1073
Have you tested this? If so, how?
What is added?