
Add worker config check_unfulfilled_deps. #2189

Merged: 5 commits into spotify:master on Jul 26, 2017

Conversation

@riga (Contributor) commented on Jul 21, 2017

Description

When a worker is about to run a task, it checks whether its dependencies are complete. This PR adds a configuration parameter which makes that behavior optional.

Motivation and Context

In our workflow, we use remote targets that are stored in various locations on the WLCG. Some of our tasks have O(1000) dependencies with O(10) output targets each. When running these tasks, the dependency tree is built by checking if the targets exist, which results in network load. But before a worker actually runs a task, it checks its dependencies again. By skipping this, we could significantly reduce the network load. Therefore, I added a [worker] config parameter check_unfulfilled_deps to alter this behavior.
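For illustration, here is a minimal sketch of how the new option could be switched off. The [worker] section and key are the ones described above; the programmatic variant via luigi.configuration is just one possible way to set it.

    # Disable the pre-run dependency re-check via luigi.cfg:
    #
    #   [worker]
    #   check_unfulfilled_deps = false
    #
    # or programmatically (illustrative; uses luigi's config parser directly):
    import luigi.configuration

    cfg = luigi.configuration.get_config()
    if not cfg.has_section('worker'):
        cfg.add_section('worker')
    cfg.set('worker', 'check_unfulfilled_deps', 'false')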

@mention-bot commented:

@riga, thanks for your PR! By analyzing the history of the files in this pull request, we identified @ChrisBeaumont, @steenzout and @daveFNbuck to be potential reviewers.

@Tarrasch (Contributor) left a review comment:

This makes sense to approve. Just curious, when do you need this? Are your complete() checks really slow?

luigi/worker.py (Outdated)
@@ -910,7 +914,8 @@ def _create_task_process(self, task):
         return TaskProcess(
             task, self._id, self._task_result_queue, reporter,
             use_multiprocessing=bool(self.worker_processes > 1),
-            worker_timeout=self._config.timeout
+            worker_timeout=self._config.timeout,
+            check_unfulfilled_deps=self._config.check_unfulfilled_deps
@Tarrasch (Contributor) commented:

Can you add a trailing comma here? Next line addition will look nicer in the diffs. :)

@riga (Contributor, Author) replied:

Yep, done.

@riga (Contributor, Author) commented on Jul 24, 2017

Kind of. Depending on the current load at some computing and storage facilities, the completeness checks can be slow. But usually, the number of requests is the bottleneck.

A typical use case for this feature is a tree of M tasks which all have the same N dependencies. When the completeness check of those N deps is load-intensive (say 1k network requests per dep), these N × 1k requests are executed once for building the tree and then repeated M times; disabling the check lets us skip the repetition. Of course, we have to make sure that the targets of the deps don't change while the tree is being processed.
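To make the shape concrete, here is a minimal sketch (task and target names are made up, and LocalTarget stands in for our remote targets):

    import luigi


    class RemoteDep(luigi.ExternalTask):
        """One of the N shared dependencies; its completeness check is expensive."""
        index = luigi.IntParameter()

        def output(self):
            # Stand-in for a remote target whose exists() triggers network requests.
            return luigi.LocalTarget('/data/dep_%d' % self.index)


    class AnalysisTask(luigi.Task):
        """One of the M tasks; every instance requires the same N dependencies."""
        branch = luigi.IntParameter()

        def requires(self):
            return [RemoteDep(index=i) for i in range(1000)]  # the N deps

        def output(self):
            return luigi.LocalTarget('/results/branch_%d' % self.branch)

        def run(self):
            with self.output().open('w') as f:
                f.write('done')

    # With check_unfulfilled_deps disabled, the N completeness checks happen only
    # once while the dependency tree is built, instead of again right before each
    # of the M AnalysisTask instances runs.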

Btw, do you know what's wrong with Travis?

@Tarrasch (Contributor) commented:

I think this patch looks reasonable. Two thoughts:

  • Would it make sense to include a warning in the docs saying that one should only use this when the bottleneck is known to be the complete() checks?
  • Would it make sense to enable this per task rather than per worker?

> Btw, do you know what's wrong with Travis?

I think they changed Travis, and the magic we have in our setup caused the failures.

@riga (Contributor, Author) commented on Jul 25, 2017

> Would it make sense to include a warning in the docs saying that one should only use this when the bottleneck is known to be the complete() checks?

Yes, that makes sense. The more docs the better ;) I'll add one or two sentences.

> Would it make sense to enable this per task rather than per worker?

In our use case we would use the worker config, but other users might prefer a per-task setting. What about supporting both? Of course, the task setting would take precedence.

@Tarrasch (Contributor) left a review comment:

Ok. This all looks good to me. There's no need to worry about the per-task use case.

However, I do kindly request to add at least one test case before we merge this. :)

@Tarrasch (Contributor) commented:

By the way, since I merged #2190, maybe the build issues are resolved. :)

@dlstadther (Collaborator) commented:

@Tarrasch even after manually restarting Travis, the build continues to fail.

@Tarrasch (Contributor) commented:

> @Tarrasch even after manually restarting Travis, the build continues to fail.

@dlstadther, that's expected if you didn't rebase onto master, as Travis reads the config from the checked-in .travis.yml file.

@riga (Contributor, Author) commented on Jul 26, 2017

> However, I do kindly request to add at least one test case before we merge this. :)

Done :)
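The actual test is in the commits; roughly, the behaviour to cover can be sketched like this (task names and structure are illustrative, not the code from the PR): a dependency that looks complete while the dependency graph is built, but incomplete afterwards, should no longer block the downstream task once the pre-run re-check is disabled.

    import unittest

    import luigi
    import luigi.configuration


    class FlakyDep(luigi.Task):
        """Reports complete once (during scheduling), then incomplete."""
        _seen = False

        def complete(self):
            if not FlakyDep._seen:
                FlakyDep._seen = True
                return True
            return False


    class Downstream(luigi.Task):
        ran = False

        def requires(self):
            return FlakyDep()

        def complete(self):
            return Downstream.ran

        def run(self):
            Downstream.ran = True


    class CheckUnfulfilledDepsTest(unittest.TestCase):
        def test_runs_without_pre_run_dep_check(self):
            cfg = luigi.configuration.get_config()
            if not cfg.has_section('worker'):
                cfg.add_section('worker')
            cfg.set('worker', 'check_unfulfilled_deps', 'false')

            luigi.build([Downstream()], local_scheduler=True)

            # With the re-check disabled, the dependency flipping to incomplete
            # after scheduling no longer prevents Downstream from running.
            self.assertTrue(Downstream.ran)


    if __name__ == '__main__':
        unittest.main()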

There's still one failing Travis job, but that's unrelated to my code, right?

@Tarrasch merged commit 4fb0886 into spotify:master on Jul 26, 2017
@Tarrasch (Contributor) commented:

Thanks @riga :)
