[data] Should warn if no CPU resources are available at the start of execution. #31507

Closed
ericl opened this issue Jan 7, 2023 · 0 comments · Fixed by #31574
Labels: data (Ray Data-related issues), data-observability, enhancement (Request for new feature and/or capability), P1 (Issue that should be fixed within a few weeks)

ericl commented Jan 7, 2023

A common anti-pattern for Dataset users is to try using Datasets when the cluster is completely full of Actors. For example, this can easily happen with Ray Tune, or using multiprocessing.Pool.

While Ray / the Ray autoscaler does print a warning in this case, it's pretty generic and hard to understand (e.g., "{cpu: 1} task cannot be scheduled, please check if your cluster is full of actors").

Specifically for Datasets, we could check at the start of execution whether ray.available_resources() reports no available CPUs (the "CPU" entry is zero or missing), and if so print out a nicer warning with a link to documentation. The message could read something like "Warning: The Ray cluster currently does not have any available CPUs. The Dataset job will hang unless more CPUs are freed up. A common reason is that cluster resources are used by Actors or Tune trials, see link for more details."
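A minimal sketch of what such a check might look like, not the actual fix. The function name `warn_if_no_cpus` and the constant `NO_CPU_WARNING` are hypothetical, and the sketch assumes the resource mapping uses the key `"CPU"` and may omit it entirely when nothing is free. It takes the mapping as an argument so it can be exercised without a running cluster:

```python
import logging
from typing import Mapping

logger = logging.getLogger(__name__)

# Message text taken from the proposal above; "see link" is left as a
# placeholder because no documentation URL is given in this issue.
NO_CPU_WARNING = (
    "Warning: The Ray cluster currently does not have any available CPUs. "
    "The Dataset job will hang unless more CPUs are freed up. A common reason "
    "is that cluster resources are used by Actors or Tune trials, see link "
    "for more details."
)


def warn_if_no_cpus(available: Mapping[str, float]) -> bool:
    """Log a warning and return True if the mapping shows no free CPUs.

    Hypothetical helper: `available` is assumed to look like the dict
    returned by ray.available_resources().
    """
    # Use .get() rather than indexing: a resource with nothing free may be
    # missing from the mapping, which would otherwise raise KeyError.
    if available.get("CPU", 0) <= 0:
        logger.warning(NO_CPU_WARNING)
        return True
    return False
```

At the start of Dataset execution this could be called as `warn_if_no_cpus(ray.available_resources())`. Since availability can change between the check and the first scheduled task, the warning is only a hint, not a guarantee that the job will hang.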

cc @dmatrix

ericl added the enhancement, P1, data, and data-observability labels Jan 7, 2023
c21 added the Ray 2.3 label Jan 7, 2023