[KubeRay] [Serve] RayServices on KubeRay don't use async handles #28908

Closed
shrekris-anyscale opened this issue Sep 29, 2022 · 3 comments
Labels
bug: Something that is supposed to be working; but isn't
kuberay: Issues for the Ray/Kuberay integration that are tracked on the Ray side
P1: Issue that should be fixed within a few weeks
serve: Ray Serve Related Issue

Comments

@shrekris-anyscale
Contributor

shrekris-anyscale commented Sep 29, 2022

What happened + What you expected to happen

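Serve deployment graphs use async handles, so a call through a handle needs two await statements: the first await resolves the .remote() call to a reference, and the second resolves that reference to the result. However, RayServices on KubeRay raise an error when the deployment code uses this double await.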
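For readers unfamiliar with the pattern, here is a minimal sketch of the double await using only asyncio. `FakeAsyncHandle` is a hypothetical stand-in, not a Ray class, and the hard-coded price of 3 is an assumption for illustration; the sketch only mimics the shape of the call, not how Serve routes requests.

```python
import asyncio


class FakeAsyncHandle:
    """Hypothetical stand-in for an async handle; not part of Ray."""

    async def remote(self, amount: float) -> "asyncio.Future":
        # Awaiting remote() yields a reference to the pending result,
        # not the result itself.
        loop = asyncio.get_running_loop()
        ref = loop.create_future()
        # Assumed price of 3 per unit, filled in on the next loop iteration.
        loop.call_soon(ref.set_result, 3 * amount)
        return ref


async def check_price(handle: FakeAsyncHandle, amount: float) -> float:
    ref = await handle.remote(amount)  # first await: get the reference
    return await ref                   # second await: get the value


result = asyncio.run(check_price(FakeAsyncHandle(), 2))
```

The nested form in the reproduction script, `await (await handle.remote(amount))`, is the same two steps written on one line.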

Versions / Dependencies

Ray 2.0.0

Reproduction script

Example 1: This script runs locally on Ray 2.0.0 but fails on KubeRay:

import ray
from ray import serve
from ray.serve.drivers import DAGDriver
from ray.serve.deployment_graph import InputNode

# These imports are used only for type hints:
from typing import Dict, List
from starlette.requests import Request
from ray.serve.deployment_graph import ClassNode
from ray.serve.handle import RayServeDeploymentHandle


@serve.deployment(num_replicas=2)
class FruitMarket:
    def __init__(
        self,
        mango_stand: RayServeDeploymentHandle,
        orange_stand: RayServeDeploymentHandle,
        pear_stand: RayServeDeploymentHandle,
    ):
        self.directory = {
            "MANGO": mango_stand,
            "ORANGE": orange_stand,
            "PEAR": pear_stand,
        }

    async def check_price(self, fruit: str, amount: float) -> float:
        if fruit not in self.directory:
            return -1
        else:
            fruit_stand = self.directory[fruit]
            return await (await fruit_stand.check_price.remote(amount))


@serve.deployment(user_config={"price": 3})
class MangoStand:

    DEFAULT_PRICE = 1

    def __init__(self):
        # This default price is overwritten by the one specified in the
        # user_config through the reconfigure() method.
        self.price = self.DEFAULT_PRICE

    def reconfigure(self, config: Dict):
        self.price = config.get("price", self.DEFAULT_PRICE)

    def check_price(self, amount: float) -> float:
        return self.price * amount


@serve.deployment(user_config={"price": 2})
class OrangeStand:

    DEFAULT_PRICE = 0.5

    def __init__(self):
        # This default price is overwritten by the one specified in the
        # user_config through the reconfigure() method.
        self.price = self.DEFAULT_PRICE

    def reconfigure(self, config: Dict):
        self.price = config.get("price", self.DEFAULT_PRICE)

    def check_price(self, amount: float) -> float:
        return self.price * amount


@serve.deployment(user_config={"price": 4})
class PearStand:

    DEFAULT_PRICE = 0.75

    def __init__(self):
        # This default price is overwritten by the one specified in the
        # user_config through the reconfigure() method.
        self.price = self.DEFAULT_PRICE

    def reconfigure(self, config: Dict):
        self.price = config.get("price", self.DEFAULT_PRICE)

    def check_price(self, amount: float) -> float:
        return self.price * amount


async def json_resolver(request: Request) -> List:
    return await request.json()


with InputNode() as query:
    fruit, amount = query[0], query[1]

    mango_stand = MangoStand.bind()
    orange_stand = OrangeStand.bind()
    pear_stand = PearStand.bind()

    fruit_market = FruitMarket.bind(mango_stand, orange_stand, pear_stand)

    net_price = fruit_market.check_price.bind(fruit, amount)

deployment_graph = DAGDriver.bind(net_price, http_adapter=json_resolver)

Example 2:

This KubeRay config runs as expected. You can test it with

curl -X POST -H 'Content-Type: application/json' localhost:8000 -d '["MANGO", 2]'

However, it fails if you replace the working_dir with this URL:

"https://github.com/ray-project/test_dag/archive/40d61c141b9c37853a7014b8659fc7f23c1d04f6.zip"

The original working_dir points to this file, which uses a single await. The above working_dir URL points to this file, which uses two awaits.
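If you would rather send the test request from Python than from curl, something like the sketch below should be equivalent. It assumes the service is reachable over plain HTTP at localhost:8000 (the curl command does not state a scheme), and `build_price_request` and `query_price` are hypothetical helper names.

```python
import json
import urllib.request


def build_price_request(
    fruit: str, amount: float, url: str = "http://localhost:8000"
) -> urllib.request.Request:
    # Same JSON list body as the curl command, e.g. ["MANGO", 2].
    body = json.dumps([fruit, amount]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_price(fruit: str, amount: float, url: str = "http://localhost:8000"):
    # Sends the request; needs a running Serve application at `url`.
    with urllib.request.urlopen(build_price_request(fruit, amount, url)) as resp:
        return json.loads(resp.read())
```

Calling `query_price("MANGO", 2)` against a running deployment sends the same payload as the curl command.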

Issue Severity

Medium: It is a significant difficulty but I can work around it.

shrekris-anyscale added the bug, triage, serve, kuberay, and P1 labels and removed the triage label on Sep 29, 2022
@shrekris-anyscale
Contributor Author

This is the PR that added the double await to Ray 2.0. I exec’d into a worker pod on K8s, and I used the Python interpreter to successfully import some functions/classes that the PR added. So it looks like the functionality that requires the double await does exist in the KubeRay pods, but for some reason it’s not being used.

@simon-mo
Contributor

Abstraction-wise, the RayService operator should have nothing to do with this. Do we know the exact Ray image used, and the Ray version on the pod?

@shrekris-anyscale
Contributor Author

Debugged with @simon-mo. We discovered that the KubeRay operator turns off the async handle feature by default using an env var.

You can turn it back on by setting the env var explicitly:

# In both the headGroupSpec and the workerGroupSpecs,
# add this env var:

env:
    - name: SERVE_DEPLOYMENT_HANDLE_IS_SYNC
      value: "0"

After that, the double await should work as expected.

We’ll also enable this by default on KubeRay, so future releases won’t need the env var set explicitly.
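To confirm the setting actually reached a pod, you could exec into it and run a small check like the sketch below. `describe_handle_mode` is a hypothetical helper, and it only reports what the environment contains; this thread confirms only that "0" turns the async handle back on, so how Ray treats any other value is not assumed here.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "SERVE_DEPLOYMENT_HANDLE_IS_SYNC"


def describe_handle_mode(env: Optional[Mapping[str, str]] = None) -> str:
    # Defaults to the current process environment; pass a dict to test.
    env = os.environ if env is None else env
    value = env.get(ENV_VAR)
    if value is None:
        return "unset"
    if value == "0":
        return "async requested"
    # Other values are reported verbatim rather than interpreted.
    return f"set to {value!r}"


print(f"{ENV_VAR}: {describe_handle_mode()}")
```

Run it on both the head pod and a worker pod, since the env var has to be added to both group specs.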
