Add pluggable DependencyResolvers #3111
… when trying to render a DependencyException to a str, with tid = None not a str - the .join fails. this should be a separate tested/bugfix PR.
Thank you for taking an initial stab at this, @benclifford! As always, you've done a great job here thus far. (Linking this to #3108). For

import parsl
from parsl import python_app
from pathlib import Path

parsl.load()

@python_app
def job1():
    return {"directory": Path.cwd()}

@python_app
def job2(t: tuple[Path, str]):
    return Path(t[0], t[1])

job2((job1()["directory"], "hello")).result()  # PosixPath('/home/rosen/test/hello')

In order for this to be practically useful, it would be ideal to have essentially the same support for other Python data structures. Naturally, the procedure for a |
I don't think there's any deep problem with values vs keys here: they're pairs of values-that-could-be-Futures that only get different meaning because things like the |
Good point. So, with that being said, what should be the next course of action here to get it to the finish line? Is this something that you would like a hand on, or will you see it through when you get some time to do so? |
@Andrew-S-Rosen there's a bunch of grunt work left: for each data type that this behaviour should work on, implementing the two singledispatch methods and a test case. That's not really high priority for me to work on, so implementing some of those would be a good thing to do.
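For readers without the diff open, the "two singledispatch methods" per data type can be pictured roughly as below. This is a standalone sketch, not the code in this PR: the names `gather` and `unwrap` are hypothetical, and it uses plain `concurrent.futures.Future` objects with no Parsl import. One function collects the futures to wait on, the other rebuilds the argument with results substituted.

```python
from concurrent.futures import Future
from functools import singledispatch


@singledispatch
def gather(obj):
    """Return the futures found inside obj (hypothetical helper name)."""
    return []  # default: plain values contain no futures


@gather.register
def _(fut: Future):
    return [fut]


@gather.register(tuple)
@gather.register(list)
@gather.register(set)
def _(iterable):
    # Recurse into each element and flatten the results.
    return [f for item in iterable for f in gather(item)]


@gather.register
def _(d: dict):
    # Keys and values are treated alike: either may hold a future.
    return [f for k, v in d.items() for f in gather(k) + gather(v)]


@singledispatch
def unwrap(obj):
    """Return obj with every completed future replaced by its result."""
    return obj  # default: pass plain values through unchanged


@unwrap.register
def _(fut: Future):
    return fut.result()


@unwrap.register(tuple)
@unwrap.register(list)
@unwrap.register(set)
def _(iterable):
    # Rebuild a container of the same type from unwrapped elements.
    return type(iterable)(map(unwrap, iterable))


@unwrap.register
def _(d: dict):
    return {unwrap(k): unwrap(v) for k, v in d.items()}


# Usage: one completed future shared by two nested containers.
fut = Future()
fut.set_result(2)
args = {"a": (1, fut), "b": [fut, "x"]}
futures_found = gather(args)
resolved = unwrap(args)
```

Supporting a further data type in this style means registering one more handler on each of the two functions, plus a test case for it.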
Makes sense. Happy to take a stab at it when I get a spare moment. |
i realised that my implementation of tuple unwrap recreates every tuple, not only tuples that have futures in them - |
I noticed this as well but wasn't immediately sure how to avoid that. |
Note to self: There are two main aspects left here to address.
|
@benclifford --- what should be done to continue this PR? Is it a need to find out how to not have it call the resolution every time? |
that, and this coming near the top of my attention stack... |
note to self, if this demo doesn't already do this: after some tossing round of ideas with @svandenhaute, I think possibly also join app end-results could be resolved this way. I suspect this PR doesn't actually do that but I think it's more consistent to do so and opens up some return value possibilities with less concurrency but less boilerplate. |
Is this good to go, @benclifford? :) |
I want to do a bit more tidyup around |
see #3445 for:
|
Nothing large jumps out at me, but I also note that we appear to be duplicating or recreating objects in the deep path. I wonder if that will prove to be a memory (if not strictly performance) concern for heavier use-cases. Specifically, this seems like an issue:
type_ = type(iterable)
return type_(map(deep_traverse_to_unwrap, iterable))
But I also don't think this is something to tackle until it becomes an issue. "YAGNI" and lazy evaluation being top of mind.
So, looks good, with some inline comments and suggestions if you'd like to follow up on them. (In particular, I do think the tests should be fleshed out, but I'll leave that to y'all's discretion.)
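One possible way to avoid recreating containers that hold no futures, sketched under assumptions: the helper name `unwrap_preserving` is hypothetical, only tuples and lists are handled, and this is not what the PR implements. The idea is to hand back the original object whenever no element changed during unwrapping.

```python
from concurrent.futures import Future


def unwrap_preserving(obj):
    """Replace futures with results, reusing containers that hold none."""
    if isinstance(obj, Future):
        return obj.result()
    if isinstance(obj, (tuple, list)):
        new_items = [unwrap_preserving(item) for item in obj]
        # If every element came back as the identical object, nothing
        # inside was a future, so return the original container.
        if all(new is old for new, old in zip(new_items, obj)):
            return obj
        return type(obj)(new_items)
    return obj


# Usage: the future-free inner tuple is reused, not copied.
plain = (1, ("a", "b"))
fut = Future()
fut.set_result(5)
nested = (plain, [fut])
result = unwrap_preserving(nested)
```

This still walks the whole structure and builds a temporary list at each level, so it only avoids keeping duplicate containers alive; it does not avoid the traversal cost.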
When Parsl examines the arguments to an app, it uses a `DependencyResolver`.
The default `DependencyResolver` will cause Parsl to wait for
``concurrent.futures.Future`` instances (including `AppFuture` and
`DataFuture`), and pass through other arguments without waiting.

This behaviour is pluggable: Parsl comes with another dependency resolver,
`DEEP_DEPENDENCY_RESOLVER`, which knows about futures contained within structures
such as tuples, lists, sets and dicts.

This plugin interface might be used to interface other task-like or future-like
objects to the Parsl dependency mechanism, by describing how they can be
interpreted as a Future.
This text is accurate, and points to the right place, but as a "documentation consumer," I find myself without a proper mental model for what this looks like. Would an example implementation be an undue burden to place here? Or at the end of one of the links?
I'm working on a presentation for this for next Tuesday so I'll try to use the preparation for that as a way to get my head around more introductory material.
self.dependency_resolver = self.config.dependency_resolver if self.config.dependency_resolver is not None \
    else SHALLOW_DEPENDENCY_RESOLVER
This implies that self.dependency_resolver is a required attribute. Is there utility in making it required at the configuration as well, rather than an implied requirement? That is, moving this conditional into config, and instead either trusting the config object, or asserting here? Perhaps something like:
class Config(...):
    def __init__(
        ...
        dependency_resolver: Optional[DependencyResolver],
        ...
    ):
        if dependency_resolver is None:
            dependency_resolver = SHALLOW_DEPENDENCY_RESOLVER
        self.dependency_resolver = dependency_resolver
Mypy may complain with that particular construction and not-None-ness (so fiddle!), but the point is that the config object is explicit as to what dependency resolver is in use.
Functionally a wash, I think (so I won't be fussed about this), but thinking in terms of overall clarity for when someone is poking at the REPL or CLI.
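A self-contained version of that suggestion might look like the following. Everything here is a stand-in for illustration: `Resolver`, `SHALLOW` and `DEEP` are hypothetical placeholders for `DependencyResolver` and the two resolver constants, and this `Config` is not Parsl's real class. Annotating the attribute as non-optional is one way to tell a type checker that the stored value is never `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resolver:
    """Hypothetical stand-in for DependencyResolver."""
    name: str


SHALLOW = Resolver("shallow")  # stand-in for SHALLOW_DEPENDENCY_RESOLVER
DEEP = Resolver("deep")        # stand-in for DEEP_DEPENDENCY_RESOLVER


class Config:
    """Illustrative config that resolves the default at construction."""

    def __init__(self, dependency_resolver: Optional[Resolver] = None) -> None:
        # The attribute is typed as Resolver, never Optional, so any
        # consumer can use it without a None check of its own.
        self.dependency_resolver: Resolver = (
            dependency_resolver if dependency_resolver is not None else SHALLOW
        )


# Usage: the config always reports which resolver is in effect.
default_config = Config()
deep_config = Config(dependency_resolver=DEEP)
```

With this shape, inspecting the config at a REPL shows the resolver actually in use, and the consumer of the config no longer needs its own fallback.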
def local_config():
    return Config(dependency_resolver=DEEP_DEPENDENCY_RESOLVER)
Good deal; this appears to test the majority of pathways for the deep resolver. But I think we should implement a similar set of tests for the shallow variant.
The current tests in this PR go through the whole Parsl machinery -- good! But an ancillary set of unit tests that verify strictly this class in isolation would be a good value-add. That is, this is a decently isolated class that doesn't depend on Parsl, so its functionality could be verified independently of the rest of the infrastructure.
(Note that we've only recently added the tests/unit/ directory, so this would be a good second addition to that.)
[...]
This was one of the things delaying me wanting to merge this, but in some other conversations with @Andrew-S-Rosen I decided that for this PR: i) this is an opt-in feature, and I'm fairly comfortable in this context with a "you can opt into a worse-on-one-axis, better-on-another-axis" feature. More concerning is how this affects performance in the default case, but I think (without measuring) that it is noise around the function call stack, so I'm not super concerned. ii) many execution paths already have quite heavy object-recreation behaviour that looks kinda like this: for example, for any remote executor like HighThroughputExecutor, parameters go through a serialization/deserialization reconstruction. So I expect there to be a measurable performance change here, and I expect there are much nicer ways to do it, but users aren't subject to it initially, and if it becomes a problem, someone can pay more attention later on.
Merging what is here now. I'm working on this a bit more now, so hopefully there will be a follow-up PR addressing some more of @khk-globus 's comments. |
Wonderful. Thank you both!! |
…hich hangs - rather than even giving an error directly
## Summary of Changes Closes #1776. Requires: - Parsl/parsl#3111 --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Description
This PR allows users to do things like store futures inside data structures such as dictionaries, which is a style of workflow that @Andrew-S-Rosen is especially enthusiastic about.
Changed Behaviour
By default nothing. This new behaviour will be (I think) somewhat slower when enabled, as a tradeoff for that newer functionality, but I have not quantified that.
Type of change