-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] Pod updates should trigger RayCluster reconciliation #872
Comments
architkulkarni
added
bug
Something isn't working
P2
Important issue, but not time critical
labels
Jan 18, 2023
4 tasks
davidxia
added a commit
to davidxia/kuberay
that referenced
this issue
Jan 23, 2023
ray-project#639 accidentally applied event filters for child resources Pods and Services. This change does not filter Pod or Service related events. This means Pod updates will trigger RayCluster reconciliation. closes ray-project#872
4 tasks
davidxia
added a commit
to davidxia/kuberay
that referenced
this issue
Jan 23, 2023
ray-project#639 accidentally applied event filters for child resources Pods and Services. This change does not filter Pod or Service related events. This means Pod updates will trigger RayCluster reconciliation. closes ray-project#872
@architkulkarni @DmitriGekhtman thanks for noticing this! I just proposed a fix in #882. I don't know how to test it manually though. Lmk if it looks good. |
davidxia
added a commit
to davidxia/kuberay
that referenced
this issue
Jan 25, 2023
ray-project#639 accidentally applied event filters for child resources Pods and Services. This change does not filter Pod or Service related events. This means Pod updates will trigger RayCluster reconciliation. closes ray-project#872
davidxia
added a commit
to davidxia/kuberay
that referenced
this issue
Jan 25, 2023
ray-project#639 accidentally applied event filters for child resources Pods and Services. This change does not filter Pod or Service related events. This means Pod updates will trigger RayCluster reconciliation. closes ray-project#872
davidxia
added a commit
to davidxia/kuberay
that referenced
this issue
Jan 25, 2023
ray-project#639 accidentally applied event filters for child resources Pods and Services. This change does not filter Pod or Service related events. This means Pod updates will trigger RayCluster reconciliation. closes ray-project#872
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Search before asking
KubeRay Component
ray-operator
What happened + What you expected to happen
See #850 (comment) and related comments on that PR.
Currently, pod state changes don't trigger cluster reconciliation. This is due to an event filter:
kuberay/ray-operator/controllers/ray/raycluster_controller.go
Line 803 in 633ff63
We should not filter out pod status changes, but continue to filter out RayCluster status changes. We should take care to avoid reconciling in a tight loop, because
instance.Status.LastUpdateTime
is updated at each reconciliation.Reproduction script
Same reproduction as in #845. That issue will soon be closed by #850 using a separate fix (always reconciling every 60 seconds), but the present issue is still valid. This issue will fix the up to 60s delay between the cluster being ready and the Ray Job being submitted (this is triggered by the next periodic reconcile).
Anything else
No response
Are you willing to submit a PR?
The text was updated successfully, but these errors were encountered: