RayCluster created by RayService set death info env for ray container #419
@@ -513,6 +513,13 @@ func setContainerEnvVars(pod *v1.Pod, rayContainerIndex int, rayNodeType rayiov1
portEnv := v1.EnvVar{Name: RAY_PORT, Value: headPort}
container.Env = append(container.Env, portEnv)
}
if createdByRayService { |
I'm wondering whether there's a better way to indicate that a cluster was created by the RayService controller.
Can we leverage
KubernetesCreatedByLabelKey = "app.kubernetes.io/created-by"
This would let us avoid expanding the method arguments if we need some customization for clusters created by RayJob, etc.
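As a rough illustration of the label-based approach suggested here, the check could look something like the sketch below. It is not the code from this PR: the helper name `createdByRayService` and the label value `"rayservice"` are assumptions made for the example, and only the label key is taken from the comment above.

```go
package main

import (
	"fmt"
	"strings"
)

// Label key quoted in the review comment.
const KubernetesCreatedByLabelKey = "app.kubernetes.io/created-by"

// Hypothetical label value; the real constant's value is not shown in this thread.
const RayServiceCreatorLabelValue = "rayservice"

// createdByRayService (hypothetical helper) reports whether the given labels
// mark a cluster as created by the RayService controller.
func createdByRayService(labels map[string]string) bool {
	creator, ok := labels[KubernetesCreatedByLabelKey]
	if !ok {
		return false
	}
	// Compare case-insensitively, mirroring the strings.ToLower check in the diff.
	return strings.ToLower(creator) == RayServiceCreatorLabelValue
}

func main() {
	fromService := map[string]string{KubernetesCreatedByLabelKey: "RayService"}
	standalone := map[string]string{"app": "ray"}

	fmt.Println(createdByRayService(fromService))
	fmt.Println(createdByRayService(standalone))
}
```

Reading the creator from a label keeps the function signature stable: a future RayJob-specific tweak would only need another label value, not another boolean argument.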
sg
The change looks good to me. Let's merge it after the tests complete.
@@ -513,6 +513,13 @@ func setContainerEnvVars(pod *v1.Pod, rayContainerIndex int, rayNodeType rayiov1
portEnv := v1.EnvVar{Name: RAY_PORT, Value: headPort}
container.Env = append(container.Env, portEnv)
}
if strings.ToLower(creator) == RayServiceCreatorLabelValue {
// Only add this env for Ray Service cluster to improve service SLA. |
Do we only need this in RayService? What about other scenarios?
@iycheng mentioned that this will improve the SLA for RayService.
For RayJob, however, it would cause the error info to be ignored, which is not good for observability. So for now, we keep it only for RayService.
…ray-project#419) * RayCluster created by RayService set death info env for ray container * update
Why are these changes needed?
For Ray Serve, setting the death info time to 0 improves the service SLA. This PR sets the corresponding env var on the Ray containers of RayClusters created by RayService.
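A minimal sketch of the behavior described here, assuming a simplified stand-in for `v1.EnvVar` so that the example runs without Kubernetes dependencies. The env var name `RAY_TIMEOUT_MS_TASK_WAIT_FOR_DEATH_INFO`, the creator value `"rayservice"`, and the helper `withDeathInfoEnv` are assumptions for illustration, not quoted from this PR. Skipping the append when the variable is already present is a design choice of the sketch, not something stated in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// EnvVar is a simplified stand-in for the Kubernetes v1.EnvVar type.
type EnvVar struct {
	Name  string
	Value string
}

// Assumed env var name; the conversation only calls it the "death info" env.
const deathInfoEnvName = "RAY_TIMEOUT_MS_TASK_WAIT_FOR_DEATH_INFO"

// Assumed creator label value for clusters created by RayService.
const rayServiceCreator = "rayservice"

// withDeathInfoEnv (hypothetical helper) appends the death info env with
// value "0" only for clusters created by RayService, and only when the
// variable is not already set on the container.
func withDeathInfoEnv(env []EnvVar, creator string) []EnvVar {
	if strings.ToLower(creator) != rayServiceCreator {
		return env // e.g. RayJob keeps the default so error info is preserved
	}
	for _, e := range env {
		if e.Name == deathInfoEnvName {
			return env // an existing value wins in this sketch
		}
	}
	return append(env, EnvVar{Name: deathInfoEnvName, Value: "0"})
}

func main() {
	base := []EnvVar{{Name: "RAY_PORT", Value: "6379"}}

	fmt.Println(withDeathInfoEnv(base, "RayService"))
	fmt.Println(withDeathInfoEnv(base, "RayJob"))
}
```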
Related issue number
Checks