diff --git a/doc/source/ray-core/configure.rst b/doc/source/ray-core/configure.rst
index d204ddb6fb78..46c8d38f9421 100644
--- a/doc/source/ray-core/configure.rst
+++ b/doc/source/ray-core/configure.rst
@@ -12,6 +12,8 @@ and from the command line. Take a look at the ``ray.init`` `documentation
 
 .. important:: For the multi-node setting, you must first run ``ray start`` on the command line to start the Ray cluster services on the machine before ``ray.init`` in Python to connect to the cluster services. On a single machine, you can run ``ray.init()`` without ``ray start``, which will both start the Ray cluster services and connect to them.
 
+.. _cluster-resources:
+
 Cluster Resources
 -----------------
 
diff --git a/doc/source/ray-core/miscellaneous.rst b/doc/source/ray-core/miscellaneous.rst
index c75777ea86e9..9c20ce31282c 100644
--- a/doc/source/ray-core/miscellaneous.rst
+++ b/doc/source/ray-core/miscellaneous.rst
@@ -75,6 +75,8 @@ appear as the task name in the logs.
 
 .. image:: images/task_name_dashboard.png
 
+.. _accelerator-types:
+
 Accelerator Types
 ------------------
 
diff --git a/doc/source/serve/scaling-and-resource-allocation.md b/doc/source/serve/scaling-and-resource-allocation.md
index a7d6b79d444f..47148b8213aa 100644
--- a/doc/source/serve/scaling-and-resource-allocation.md
+++ b/doc/source/serve/scaling-and-resource-allocation.md
@@ -123,6 +123,29 @@ def func_2(*args):
 
 In this example, each replica of each deployment will be allocated 0.5 GPUs.
 The same can be done to multiplex over CPUs, using `"num_cpus"`.
 
+### Custom Resources, Accelerator Types, and More
+
+You can also specify {ref}`custom resources <cluster-resources>` in `ray_actor_options`, for example to ensure that a deployment is scheduled on a specific node.
+For example, if you have a deployment that requires 2 units of the `"custom_resource"` resource, you can specify it like this:
+
+```python
+@serve.deployment(ray_actor_options={"resources": {"custom_resource": 2}})
+def func(*args):
+    return do_something_with_my_custom_resource()
+```
+
+You can also specify {ref}`accelerator types <accelerator-types>` via the `accelerator_type` parameter in `ray_actor_options`.
+
+Below is the full list of supported options in `ray_actor_options`; please see the relevant Ray Core documentation for more details about each option:
+
+- `accelerator_type`
+- `memory`
+- `num_cpus`
+- `num_gpus`
+- `object_store_memory`
+- `resources`
+- `runtime_env`
+
 (serve-omp-num-threads)=
 ## Configuring Parallelism with OMP_NUM_THREADS
diff --git a/python/ray/serve/api.py b/python/ray/serve/api.py
index 611b665a408c..7a2754954c93 100644
--- a/python/ray/serve/api.py
+++ b/python/ray/serve/api.py
@@ -317,7 +317,9 @@ def deployment(
             a '/' unless they're the root (just '/'), which acts as a
             catch-all.
         ray_actor_options: Options to be passed to the Ray actor
-            constructor such as resource requirements.
+            constructor such as resource requirements. Valid options are
+            `accelerator_type`, `memory`, `num_cpus`, `num_gpus`,
+            `object_store_memory`, `resources`, and `runtime_env`.
         user_config (Optional[Any]): Config to pass to the reconfigure
             method of the deployment. This can be updated dynamically
             without changing the version of the deployment and
diff --git a/python/ray/serve/config.py b/python/ray/serve/config.py
index c11d233d17d7..34187b99e338 100644
--- a/python/ray/serve/config.py
+++ b/python/ray/serve/config.py
@@ -390,7 +390,8 @@ def _validate_ray_actor_options(self) -> None:
                 f'Got invalid type "{type(self.ray_actor_options)}" for '
                 "ray_actor_options. Expected a dictionary."
             )
-
+        # Please keep this in sync with the docstring for the ray_actor_options
+        # kwarg in api.py.
         allowed_ray_actor_options = {
             # Resource options
             "accelerator_type",
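The `config.py` change above adds a sync-reminder comment next to an allowlist that the validator checks `ray_actor_options` keys against. As a standalone illustration of that validation pattern, here is a minimal sketch; the function name and error messages are hypothetical simplifications rather than Serve's actual implementation, but the set of allowed keys matches the list documented in this PR:

```python
# Hypothetical sketch of allowlist-style validation for ray_actor_options.
# The key set mirrors the documented options; names and messages are illustrative.
ALLOWED_RAY_ACTOR_OPTIONS = {
    "accelerator_type",
    "memory",
    "num_cpus",
    "num_gpus",
    "object_store_memory",
    "resources",
    "runtime_env",
}


def validate_ray_actor_options(options):
    """Reject non-dict inputs and any key outside the allowlist."""
    if not isinstance(options, dict):
        raise TypeError(
            f'Got invalid type "{type(options)}" for ray_actor_options. '
            "Expected a dictionary."
        )
    for key in options:
        if key not in ALLOWED_RAY_ACTOR_OPTIONS:
            raise ValueError(
                f'Got invalid key "{key}" in ray_actor_options. '
                f"Valid keys: {sorted(ALLOWED_RAY_ACTOR_OPTIONS)}"
            )


# Accepted: only documented keys are present.
validate_ray_actor_options({"num_gpus": 0.5, "resources": {"custom_resource": 2}})
```

Keeping the allowlist as a module-level set makes the docstring in `api.py` and the validator in `config.py` easy to compare at review time, which is exactly what the new comment asks future editors to do.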