[Bug] Job Sample YAML ray_v1alpha1_rayjob.yaml fails with empty node-ip-address $MY_POD_IP
#805
Closed
Labels: bug
Search before asking
KubeRay Component
ray-operator
What happened + What you expected to happen
Following the documentation at https://ray-project.github.io/kuberay/guidance/rayjob/ with KubeRay master on a kind cluster, the worker pod starts to crashloop. Inspecting the logs, we see a traceback.
Reproduction script
Run kind create cluster, then follow the docs page at https://ray-project.github.io/kuberay/guidance/rayjob/
Anything else
From tracing through the Ray code, the error is happening because an empty node-ip-address was passed to ray start. There is a field node-ip-address: $MY_POD_IP in the sample Job YAML, so this environment variable must not have been set. I assume the Ray operator is supposed to set this environment variable, but I'm not sure where it gets set.
Are you willing to submit a PR?
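On the open question of where MY_POD_IP gets set: one way a pod can receive its own IP as an environment variable is the Kubernetes downward API. The fragment below is a minimal sketch of that mechanism, not a confirmed description of what the KubeRay operator does or of the eventual fix. The container name and image are hypothetical placeholders; only the env entry is the point.

```yaml
# Hypothetical excerpt of a worker pod template.
# Container name and image are placeholders, not taken from the sample YAML.
containers:
  - name: ray-worker
    image: rayproject/ray:latest
    env:
      # Expose the pod's own IP so that $MY_POD_IP is non-empty
      # when it is expanded in the ray start parameters.
      - name: MY_POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
```

If the variable is defined this way, node-ip-address: $MY_POD_IP should resolve to the pod IP rather than an empty string, assuming the start command is run through a shell that expands it.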