Submit ray job after cluster is ready #405
Signed-off-by: Kevin Su <[email protected]>
Actually, this one looks straightforward. I can merge it and then update #398.
The RayJob controller on master needs a lot of improvement. I think it should be in good shape after this one and #398.
if rayClusterInstance.Status.State != rayv1alpha1.Ready {
	r.Log.Info("waiting for the cluster to be ready", "rayCluster", rayClusterInstance.Name)
	err = r.updateState(ctx, rayJobInstance, rayv1alpha1.JobDeploymentStatusInitializing, nil)
	return ctrl.Result{}, err
}
minor: We can requeue it after x seconds, since the cluster takes a while to become ready. Without RequeueAfter we will see lots of duplicated logs.
This is minor; I can update it later in another PR.
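For reference, a minimal sketch of the requeue suggestion above. It is not the code from this PR: `ClusterState`, `Ready`, and `Result` are local stand-ins for the `rayv1alpha1` and `ctrl.Result` types so the snippet compiles on its own, `reconcileSubmit` is a hypothetical helper, and the 10-second delay is an assumed value for the "x seconds" mentioned in the review.

```go
package main

import (
	"fmt"
	"time"
)

// ClusterState and Ready are stand-ins for the rayv1alpha1 types (assumption:
// only the "ready vs. not ready" distinction matters for this sketch).
type ClusterState string

const Ready ClusterState = "ready"

// Result is a stand-in for controller-runtime's ctrl.Result, reduced to the
// RequeueAfter field discussed in the review comment.
type Result struct {
	RequeueAfter time.Duration
}

// clusterNotReadyRequeue is an assumed delay; the review only says "x seconds".
const clusterNotReadyRequeue = 10 * time.Second

// reconcileSubmit (hypothetical helper) reports whether the job may be
// submitted now. While the cluster is not ready, it asks to be requeued after
// a fixed delay instead of returning an empty result.
func reconcileSubmit(state ClusterState) (Result, bool) {
	if state != Ready {
		return Result{RequeueAfter: clusterNotReadyRequeue}, false
	}
	return Result{}, true
}

func main() {
	for _, s := range []ClusterState{"", Ready} {
		res, submit := reconcileSubmit(s)
		fmt.Printf("state=%q submit=%v requeueAfter=%s\n", s, submit, res.RequeueAfter)
	}
}
```

In the real controller, the not-ready branch would return `ctrl.Result{RequeueAfter: ...}` in place of the empty `ctrl.Result{}`; the right delay depends on how long the cluster usually takes to come up.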
* Submit ray job after cluster is ready
Signed-off-by: Kevin Su <[email protected]>
* Fix test errors
Signed-off-by: Kevin Su <[email protected]>
* Fix test errors
Signed-off-by: Kevin Su <[email protected]>
* Fix test errors
Signed-off-by: Kevin Su <[email protected]>
* Fix test errors
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su [email protected]
Why are these changes needed?
The Ray operator tries to submit the Ray job before the Ray cluster is ready. We can check the cluster state first and submit the job only once the cluster is ready.
Before
After
Related issue number
Checks