[Feature] Users should be able to define custom resources in worker groups #167
Comments
@ebr Are you using an old |
That is possible - we've been using |
@ebr The guide at https://github.com/ray-project/kuberay/blob/master/docs/best-practice/worker-head-reconnection.md shows how to set 0 CPUs by setting the startup parameter num-cpus. We have also been trying to configure custom resources for our worker groups, but we could not get the pod to launch. If we configure the worker group with:
But the pod creation fails with:
The weird thing is that when we change it slightly, a different error appears:
Error:
In the commit d54ea70 there is the following comment:
But there is no "demostration below". |
@juangtato-ds Thank you for pointing me at this! I figured it out: the "unfortunate format" is that you must escape the double quotes. So when deploying Ray clusters using the Helm chart, this worked:
resulting in the following command in the pod spec:
|
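For readers following along, here is a minimal sketch of the escaping described above: it serializes a resource map to JSON, backslash-escapes the inner double quotes, and wraps the result in double quotes. The helper name `escape_resources` and the resource name `custom_tool` are hypothetical, and whether a further layer of YAML quoting is needed depends on how the chart templates the value, so treat the output as something to verify against your own pod spec rather than a definitive format.

```python
import json


def escape_resources(resources: dict) -> str:
    """Return a JSON resource map with its double quotes backslash-escaped,
    wrapped in one outer pair of double quotes (hypothetical helper)."""
    payload = json.dumps(resources)        # e.g. {"custom_tool": 1}
    escaped = payload.replace('"', '\\"')  # e.g. {\"custom_tool\": 1}
    return f'"{escaped}"'


# "custom_tool" is an illustrative resource name, not one from the thread.
value = escape_resources({"custom_tool": 1})
print(value)  # prints "{\"custom_tool\": 1}"
```

Generating the string with `json.dumps` rather than typing it by hand avoids the mismatched-quote mistakes that are easy to make in this format.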
@ebr thanks! Didn't try out that one. It also worked for us. For this scenario, maybe
|
The current way of specifying resources in Ray start params is pretty painful; this should definitely be fixed. |
Starting to work on this now. |
^ Ray scheduler will skip the head node when scheduling workloads. |
We decided to stick with rayStartParams["resources"] as the way to do this: |
Description
When using the original ray-operator, the cluster.ray.io/v1 API included a spec for rayResource, which could be used for tagging worker groups as providers of custom, user-defined resources. This seems to be missing from the ray.io/v1alpha1 API, and it would be useful to have it back.
Use case
A use case for this might be to deploy a heterogeneous cluster with multiple worker groups, where each worker group uses a different image packaged with different third-party utilities. Tasks that need a specific utility could then be marked as requiring the corresponding resource, so that they execute only on the workers that provide it.
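The placement idea in this use case can be illustrated with a toy model in plain Python. This is not Ray's scheduler, only a sketch of the matching rule: a task may run on a worker group only if that group advertises every resource the task requires. The group names and resource names below are hypothetical. In Ray itself, a task declares such a requirement through the `resources` argument of `@ray.remote`.

```python
from typing import Dict, List

# Hypothetical worker groups and the custom resources each one advertises.
WORKER_GROUPS: Dict[str, Dict[str, float]] = {
    "gpu-tools": {"cuda_toolkit": 1},
    "geo-tools": {"gdal": 1},
}


def eligible_groups(
    required: Dict[str, float],
    groups: Dict[str, Dict[str, float]] = WORKER_GROUPS,
) -> List[str]:
    """Return the names of groups that offer at least the required amount
    of every requested resource."""
    return [
        name
        for name, offered in groups.items()
        if all(offered.get(res, 0) >= amount for res, amount in required.items())
    ]


# A task that needs the "gdal" utility can only land on the group providing it.
print(eligible_groups({"gdal": 1}))  # prints ['geo-tools']
```

A task with no custom requirements matches every group, which mirrors the default behaviour where untagged tasks can run anywhere.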
Related issues
No response
Are you willing to submit a PR?