Support fractional resource scheduling #258
@carsonwang & team, please kindly let me know if you want a call to discuss this proposal.
Use a mock cluster based on the doc here: https://docs.ray.io/en/latest/ray-core/examples/testing-tips.html#tip-4-create-a-mini-cluster-with-ray-cluster-utils-cluster
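For readers unfamiliar with that testing tip, here is a minimal sketch of a mini cluster built with `ray.cluster_utils.Cluster`, used to run tasks that each request half a CPU. The function name `run_on_mini_cluster`, the node sizes, and the `probe` task are illustrative assumptions and are not taken from this PR's tests.

```python
def run_on_mini_cluster():
    """Start a two-node local Ray cluster and run fractional-CPU tasks on it."""
    # Imported lazily so this module can be loaded without Ray installed.
    import ray
    from ray.cluster_utils import Cluster

    # Head node with 2 CPUs (sizes are arbitrary, chosen for illustration).
    cluster = Cluster(initialize_head=True, head_node_args={"num_cpus": 2})
    ray.init(address=cluster.address)
    # Add one worker node with 2 more CPUs.
    cluster.add_node(num_cpus=2)
    try:
        # Each task asks Ray for half a CPU, so two can share one core.
        @ray.remote(num_cpus=0.5)
        def probe():
            return 1

        return sum(ray.get([probe.remote() for _ in range(4)]))
    finally:
        ray.shutdown()
        cluster.shutdown()
```

Calling `run_on_mini_cluster()` requires a local Ray installation; the mini cluster runs entirely on one machine, which is what makes it suitable for unit tests.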
Thanks @pang-wu for the work! How will the GPU config be used, given that Spark itself is not aware of the GPU resource?
@carsonwang To my understanding, the GPU-based actor scheduling/allocation is done by Ray, and Spark's executor runs inside the actor. Whether the code inside Spark actually uses the GPU is up to the user. But we also want to solve the other side of the problem: if a cluster has GPUs, Spark can still launch executors on the worker nodes for CPU-only tasks using this config. Right now developers have to use a mixed-node cluster to run Spark jobs if they also want to run GPU workloads in the same cluster. In most of our use cases the Spark processing job is small, so the current setup adds unnecessary complexity.
The GPU autoscaling problem is a bug on the Ray side. For more details, please see [this issue](ray-project/ray#20476).
LGTM
Support fractional CPU and GPU resource scheduling. This PR achieves three goals:
spark.ray.actor.resource.cpu
config. For more details, please refer to this RFC.
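As a rough illustration of how the `spark.ray.actor.resource.cpu` config might be supplied, the sketch below builds a Spark config dictionary with a fractional CPU value. The helper name `fractional_actor_configs` and the validation rule are hypothetical; only the config key comes from this PR, and the exact semantics are defined by the RFC rather than by this example.

```python
def fractional_actor_configs(cpu_per_actor: float) -> dict:
    """Build Spark configs requesting a fractional CPU per executor actor.

    Hypothetical helper: only the config key is taken from the PR text.
    """
    if cpu_per_actor <= 0:
        raise ValueError("cpu_per_actor must be positive")
    # Spark config values are strings, so the number is serialized here.
    return {"spark.ray.actor.resource.cpu": str(cpu_per_actor)}


# Half a CPU per executor actor.
configs = fractional_actor_configs(0.5)

# Assumed usage with RayDP (not executed here, requires raydp and ray):
# import raydp
# spark = raydp.init_spark(
#     app_name="fractional-demo",
#     num_executors=2,
#     executor_cores=1,
#     executor_memory="1GB",
#     configs=configs,
# )
```

Keeping the config in a plain dictionary makes it easy to check in a test before any cluster is started.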