[Serve][Master Task] Cleanup ray DAG -> serve DAG node types with clear layering #24061

Closed
jiaodong opened this issue Apr 20, 2022 · 0 comments
Labels: enhancement (Request for new feature and/or capability), P1 (Issue that should be fixed within a few weeks), serve (Ray Serve Related Issue)

jiaodong commented Apr 20, 2022

Description

In the latest discussion about DAG layering, we agreed on the following:

  1. We allow stacked decorators, with '@ray.remote' applied first and '@serve.options' on top, for Ray 2.0, so that users can set both ray & serve options on a single node or DAG. The same pattern applies to other libraries.

  2. To make migration smooth for users, we keep the existing single @serve.deployment decorator as syntactic sugar and build a path to carry serve options as "metadata" at the ray level, without disrupting existing user behavior or emitting warnings. This gives us both the migration path and the buffer needed for Ray 2.0.

  3. Library-specific nodes shouldn't be public API: we don't want users to be exposed to them or to modify them directly. They should only be generated by .bind() on a decorated body.

  4. The serve DAG body is executable by ray core via ray.dag.execute(). We have serve.run(), which executes a serve DAG by deploying all nodes. But while authoring and iterating on the DAG, without serve-specific pieces like DAGDriver that bring JSON serde & HTTP, the "body" of the DAG should be testable with ray.dag.execute(dag) using the same code, even if it's only decorated by @serve.deployment.

    • It will execute by dropping serve configs or falling back to their defaults, which facilitates local testing.
    • This is somewhat true today; Simon and I have verified this behavior while developing the demo DAG.
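
A toy sketch of points 1 to 4 in plain Python, with no Ray import. Every name here (`FunctionNode`, `Bindable`, `remote`, `serve_options`) is a hypothetical stand-in, not the Ray or Serve API. It only illustrates the layering: library options ride along as metadata, `.bind()` returns a generic node, and core-level execution ignores the serve metadata.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class FunctionNode:
    """Generic DAG node: a callable, its bound args, and free-form metadata."""
    func: Callable[..., Any]
    args: Tuple[Any, ...]
    metadata: Dict[str, Any] = field(default_factory=dict)

    def execute(self) -> Any:
        # Core-level execution resolves upstream nodes first and never
        # reads library metadata such as serve options.
        resolved = [a.execute() if isinstance(a, FunctionNode) else a
                    for a in self.args]
        return self.func(*resolved)


class Bindable:
    """What a decorator returns: the body plus the metadata gathered so far."""

    def __init__(self, func, metadata=None):
        self.func = func
        self.metadata = dict(metadata or {})

    def bind(self, *args) -> FunctionNode:
        return FunctionNode(self.func, args, dict(self.metadata))


def remote(**core_options):
    """Hypothetical core decorator, applied first (innermost)."""
    def wrap(func):
        return Bindable(func, {"core": core_options})
    return wrap


def serve_options(**options):
    """Hypothetical library decorator stacked on top of `remote`."""
    def wrap(bindable: Bindable) -> Bindable:
        return Bindable(bindable.func, {**bindable.metadata, "serve": options})
    return wrap


@serve_options(num_replicas=2)
@remote(num_cpus=1)
def preprocess(x):
    return x + 1


@remote()
def model(x):
    return x * 10


dag = model.bind(preprocess.bind(4))
result = dag.execute()  # runs the body locally; serve metadata is untouched
```

In this sketch the serve options stay attached to the node, so a later serve-specific pass could still read them, while local execution works without any serve machinery.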

serve.run(ray_dag) means:

  • It will start by transforming all functions into function deployments, all classes into class deployments, and class method calls into calls to that method on a deployment class (potentially multiple times, but the result should be the same in the 1 replica case).
  • It will use default values for all serve options, since none are specified. Later we can have it take the @ray.remote options as the ray_actor_options field of each deployment.
  • It will then deploy all deployment nodes and return a handle to the root on which .remote() can be called.
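
The three steps above can be sketched with a toy model in plain Python. This is not the Serve implementation: `FunctionNode`, `ClassNode`, `MethodNode`, `Deployment`, `Handle` and `run` are hypothetical names, the default of one replica is an assumption, and the toy `remote()` returns the result directly instead of a reference.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class FunctionNode:
    func: Callable[..., Any]
    args: Tuple[Any, ...] = ()
    remote_options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ClassNode:
    cls: type
    args: Tuple[Any, ...] = ()
    remote_options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class MethodNode:
    target: ClassNode
    method: str
    args: Tuple[Any, ...] = ()


@dataclass
class Deployment:
    name: str
    kind: str                       # "function" or "class"
    num_replicas: int = 1           # assumed default serve option
    ray_actor_options: Dict[str, Any] = field(default_factory=dict)


def collect(node, deployments):
    """Step 1 and 2: one deployment per function or class, default options."""
    if isinstance(node, MethodNode):
        collect(node.target, deployments)
    elif isinstance(node, (FunctionNode, ClassNode)):
        is_class = isinstance(node, ClassNode)
        body = node.cls if is_class else node.func
        deployments[body.__name__] = Deployment(
            body.__name__, "class" if is_class else "function",
            ray_actor_options=dict(node.remote_options))
    else:
        return  # plain value, nothing to deploy
    for arg in node.args:
        collect(arg, deployments)


def evaluate(node, instances):
    """Evaluate the DAG, creating each class instance once (1 replica case)."""
    if isinstance(node, FunctionNode):
        return node.func(*[evaluate(a, instances) for a in node.args])
    if isinstance(node, ClassNode):
        if id(node) not in instances:
            instances[id(node)] = node.cls(
                *[evaluate(a, instances) for a in node.args])
        return instances[id(node)]
    if isinstance(node, MethodNode):
        replica = evaluate(node.target, instances)
        args = [evaluate(a, instances) for a in node.args]
        return getattr(replica, node.method)(*args)
    return node


class Handle:
    def __init__(self, root):
        self._root = root

    def remote(self):
        return evaluate(self._root, {})


def run(root):
    """Step 3: 'deploy' everything and return a handle to the root."""
    deployments: Dict[str, Deployment] = {}
    collect(root, deployments)
    return Handle(root), deployments


def double(x):
    return 2 * x


class Adder:
    def __init__(self, inc):
        self.inc = inc

    def add(self, x):
        return x + self.inc


adder = ClassNode(Adder, (3,), {"num_cpus": 2})
dag = MethodNode(adder, "add", (FunctionNode(double, (5,)),))
handle, deployments = run(dag)
result = handle.remote()
```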

=================================

Since the last branch cut, we have introduced a few things that don't match the consensus above, including:
1) We added a few internal DeploymentNode / UserDeploymentNode classes that subclass ClassNode / FunctionNode, and registered them a bit differently in the DAG PyObjScanner
2) These DeploymentNode types should no longer be user facing
3) The DeploymentNode implementation now contains quite a lot of magic that handles placement, deployment config, JSON serde and DAGHandle
4) There are no tests or documentation that clearly cover the ray dag -> serve dag development path

Use case

.

jiaodong added the enhancement, P1, serve and platform labels Apr 20, 2022
jiaodong added this to the Serve Application Polish milestone Apr 20, 2022
jiaodong self-assigned this Apr 20, 2022
jiaodong changed the title from "[Serve] Cleanup ray DAG -> serve DAG node types with clear layering" to "[Serve][Master Task] Cleanup ray DAG -> serve DAG node types with clear layering" Apr 20, 2022
ericl pushed a commit that referenced this issue Apr 21, 2022
…ve exposing DeploymentNode as public (#24065)

See dag layering summary in #24061

We need to clean up and set right the ray dag -> serve dag layering, where `.bind()` can be called on a `@serve.deployment` decorated class or func but only returns a raw Ray DAGNode type that is executable by ray core; the serve_dag is only available after serve-specific transformations.

Thus this PR removes exposed serve DAGNode types such as DeploymentNode.

It also removes the `class.bind().bind()` syntax, which returned a `DeploymentMethodNode` defaulting to `__call__`, to match the behavior of ray dag building.
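
A toy illustration of that behavior in plain Python. The names `deployment`, `Decorated`, `ClassNode` and `MethodNode` are hypothetical stand-ins, not the Ray classes. It assumes the intended shape is that `.bind()` on a decorated class yields a generic class node, and that a method must be named explicitly before it can be bound.

```python
class MethodNode:
    """Generic node for one method call on a class node."""

    def __init__(self, target, method, args):
        self.target, self.method, self.args = target, method, args

    def execute(self):
        instance = self.target.cls(*self.target.args)
        return getattr(instance, self.method)(*self.args)


class _MethodBinder:
    def __init__(self, target, method):
        self.target, self.method = target, method

    def bind(self, *args):
        return MethodNode(self.target, self.method, args)


class ClassNode:
    """Generic class node. It has no `.bind()` of its own."""

    def __init__(self, cls, args):
        self.cls, self.args = cls, args

    def __getattr__(self, name):
        # Only reached for attributes not found normally. Expose public
        # methods of the wrapped class as bindable; reject everything else.
        if name.startswith("_") or not callable(getattr(self.cls, name, None)):
            raise AttributeError(name)
        return _MethodBinder(self, name)


class Decorated:
    def __init__(self, cls):
        self.cls = cls

    def bind(self, *args):
        return ClassNode(self.cls, args)


def deployment(cls):
    """Hypothetical library decorator: it only adds `.bind()`."""
    return Decorated(cls)


@deployment
class Model:
    def __init__(self, weight):
        self.weight = weight

    def predict(self, x):
        return self.weight * x


node = Model.bind(3)             # a plain ClassNode, not a library-specific type
dag = node.predict.bind(7)       # the method is named explicitly
result = dag.execute()
has_chained_bind = hasattr(node, "bind")  # no implicit `.bind().bind()`
```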
edoakes removed the platform label Apr 25, 2022
hora-anyscale assigned sihanwang41 and unassigned jiaodong Dec 7, 2022
edoakes closed this as completed Jul 10, 2023
4 participants