-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[data] Refactor planner.py #45706
[data] Refactor planner.py #45706
Conversation
Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: Hao Chen <[email protected]>
def plan_input_data_op( | ||
logical_op: InputData, physical_children: List[PhysicalOperator] | ||
) -> PhysicalOperator: | ||
"""Get the corresponding DAG of physical operators for InputData.""" | ||
assert len(physical_children) == 0 | ||
|
||
return InputDataBuffer( | ||
input_data=logical_op.input_data, | ||
input_data_factory=logical_op.input_data_factory, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: IMO this'll be easier to read if we consistently define plan functions in separate modules. I think mixing the two approaches (function definitions in separate modules and in this function) hurts readability.
Not heavily opinionated on this, so am okay to keep as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some plan functions are just one line. I feel it's simpler and cleaner to just put them here. For those complex plan functions, it's better to put them in separate files.
PLAN_LOGICAL_OP_FNS.append((logical_op_type, plan_fn)) | ||
|
||
|
||
def register_plan_logical_op_fns(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe something like this to disambiguate from register_plan_logical_op_fn
?
def register_plan_logical_op_fns(): | |
def register_default_plan_logical_op_fns(): |
From the name, I'd expect something like this:
register_plan_logical_op_fns(plan_fns: List[PlanLogicalOpFn])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense
from ray.data._internal.planner.plan_write_op import plan_write_op | ||
from ray.util.annotations import DeveloperAPI | ||
|
||
LogicalOperatorType = TypeVar("LogicalOperatorType", bound=LogicalOperator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What're the advantages of using this over Type[LogicalOperator]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function argument should be a subclass of LogicalOperator. If using Type[LogicalOperator]
, it requires exactly LogicalOperator
.
Signed-off-by: Hao Chen <[email protected]>
`planner.py` has too many if-else branches. refactor code to make the code cleaner and easier to extend in the future. --------- Signed-off-by: Hao Chen <[email protected]> Signed-off-by: Richard Liu <[email protected]>
Why are these changes needed?
planner.py
has too many if-else branches. refactor code to make the code cleaner and easier to extend in the future.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.