[data] Should warn if no CPU resources are available at the start of execution. #31507
Labels: data, data-observability, enhancement, P1
A common anti-pattern for Dataset users is trying to use Datasets when the cluster is completely full of Actors. For example, this can easily happen with Ray Tune, or when using multiprocessing.Pool.
While Ray / the Ray autoscaler does print a warning in this case, it's pretty generic and hard to understand (e.g., "{cpu: 1} task cannot be scheduled, please check if your cluster is full of actors").
Specifically for Datasets, we could check at the start of execution whether ray.available_resources() reports no available CPUs (the "CPU" entry is missing or zero), and if so print a nicer warning with a link to documentation. The message could read something like: "Warning: The Ray cluster currently does not have any available CPUs. The Dataset job will hang unless more CPUs are freed up. A common reason is that cluster resources are used by Actors or Tune trials, see link for more details."

cc @dmatrix
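A rough sketch of what such a check might look like, not an actual Ray Data implementation. The helper name `warn_if_no_cpus` and the constant `NO_CPU_WARNING` are hypothetical. The sketch assumes that CPU availability is reported under the `"CPU"` key and that a missing key means no CPUs are free. The documentation link is left as in the proposed message.

```python
import warnings
from typing import Mapping

# Message text taken from the proposal above; the documentation link is
# intentionally left unspecified.
NO_CPU_WARNING = (
    "The Ray cluster currently does not have any available CPUs. "
    "The Dataset job will hang unless more CPUs are freed up. "
    "A common reason is that cluster resources are used by Actors or "
    "Tune trials, see link for more details."
)


def warn_if_no_cpus(available: Mapping[str, float]) -> bool:
    """Emit a warning and return True if no CPU is reported as available.

    `available` is expected to be a resource mapping such as the one
    returned by ray.available_resources().
    """
    # Assumption: a missing "CPU" key is treated the same as zero CPUs.
    if available.get("CPU", 0) <= 0:
        warnings.warn(NO_CPU_WARNING, stacklevel=2)
        return True
    return False
```

At the start of Dataset execution this could be called as `warn_if_no_cpus(ray.available_resources())`. Taking the resource mapping as an argument keeps the check testable without a running cluster.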