fix: worker node can't connect to head node service #445
Merged
Signed-off-by: Kevin Su <[email protected]>
Jeffwan reviewed on Aug 9, 2022
@@ -43,7 +43,7 @@ func BuildIngressForHeadService(cluster rayiov1alpha1.RayCluster) (*networkingv1
         PathType: &pathType,
         Backend: networkingv1.IngressBackend{
             Service: &networkingv1.IngressServiceBackend{
-                Name: utils.GenerateServiceName(cluster.Name),
+                Name: utils.CheckName(utils.GenerateServiceName(cluster.Name)),
Thanks. We internally use short names to overcome such issues but agree that this is indeed a problem worth revisiting.
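The change in the hunk above routes the generated service name through utils.CheckName before it is used. A minimal sketch of why that matters follows. It assumes the head service name is the cluster name plus the -head-svc suffix seen in the PR description and that over-long names keep only their last 50 characters. The functions headServiceName, capName and resolvedHeadService, and the cluster name itself, are hypothetical stand-ins for illustration, not the KubeRay implementation.

```go
package main

import "fmt"

// nameBudget is the 50-character limit mentioned in the CheckName diff comment.
const nameBudget = 50

// headServiceName is a hypothetical stand-in for utils.GenerateServiceName.
// Assumption: the service name is the cluster name plus "-head-svc".
func headServiceName(cluster string) string {
	return cluster + "-head-svc"
}

// capName is a hypothetical stand-in for utils.CheckName.
// Assumption: names over the budget keep only their last nameBudget characters.
func capName(s string) string {
	if len(s) > nameBudget {
		return s[len(s)-nameBudget:]
	}
	return s
}

// resolvedHeadService is the single derivation that both the code creating
// the Service and the code building the worker's address would call.
func resolvedHeadService(cluster string) string {
	return capName(headServiceName(cluster))
}

func main() {
	// Assumed cluster name, inferred from the address in the PR description.
	cluster := "al94lnxh5ktptbgzsfnj-n0-0-raycluster-xg6hj"

	created := resolvedHeadService(cluster)         // name given to the Service
	stale := headServiceName(cluster) + ":6379"     // address built without the cap
	fixed := resolvedHeadService(cluster) + ":6379" // address built with the cap

	fmt.Println("service:", created)
	fmt.Println("before: ", stale)
	fmt.Println("after:  ", fixed)
}
```

Under these assumptions the uncapped address keeps the leading "a" and no longer matches the created service, while the capped address does. Deriving the name in one place keeps every caller in agreement.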
Jeffwan approved these changes on Aug 9, 2022
@@ -47,11 +47,11 @@ func IsRunningAndReady(pod *corev1.Pod) bool {

 // CheckName makes sure the name does not start with a numeric value and the total length is < 63 char
 func CheckName(s string) string {
     maxLenght := 50 // 63 - (max(8,6) + 5 ) // 6 to 8 char are consumed at the end with "-head-" or -worker- + 5 generated.
nice catch on the typo
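The comment on CheckName in the hunk above states two rules: the name must not start with a numeric value, and it must leave room within the 63-character limit. A minimal sketch of those two rules follows. The function shortenName is hypothetical and is not the KubeRay code. Keeping the tail of an over-long name and replacing a leading digit with "r" are both assumptions made for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// maxNameLength mirrors the budget in the diff comment: 63 minus up to 8
// characters for "-head-" or "-worker-" and 5 generated characters.
const maxNameLength = 63 - (8 + 5)

// shortenName is a hypothetical stand-in for CheckName.
// Assumption 1: an over-long name keeps its last maxNameLength characters.
// Assumption 2: a leading digit is replaced with the letter "r".
func shortenName(s string) string {
	if len(s) > maxNameLength {
		s = s[len(s)-maxNameLength:]
	}
	if s != "" && unicode.IsDigit(rune(s[0])) {
		s = "r" + s[1:]
	}
	return s
}

func main() {
	// The 51-character name from the PR description loses its first character.
	fmt.Println(shortenName("al94lnxh5ktptbgzsfnj-n0-0-raycluster-xg6hj-head-svc"))
	// A name that starts with a digit gets a letter in its place.
	fmt.Println(shortenName("9lives-head-svc"))
}
```

Keeping the tail rather than the head preserves suffixes such as -head-svc, at the cost of changing the prefix that users chose.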
Jeffwan pushed a commit to Jeffwan/kuberay that referenced this pull request on Aug 9, 2022
Jeffwan added a commit that referenced this pull request on Aug 10, 2022
* Fix nil pointer dereference (#429) Signed-off-by: Kevin Su <[email protected]>
* Fix wrong ray start command (#431) Signed-off-by: Kevin Su <[email protected]>
* Add ray state api doc link in ray service doc (#428)
* Add ray state api doc link in ray service doc
* Update doc
* update
* [doc] Fix config typos Signed-off-by: Dmitri Gekhtman <[email protected]> Fixes a couple of typos in recently introduced sample configs.
* Add http resp code check for kuberay (#435)
* Clean up example samples (#434) This PR cleans up the "complete" and "autoscaler" sample yamls a bit. Unnecessary pod spec fields are removed without sacrificing the completeness of the examples. The idea is to make the configuration look less intimidating. Signed-off-by: Dmitri Gekhtman <[email protected]>
* Add more env for RayService head or worker pods (#439)
* fix: worker node can't connect to head node service (#445) Signed-off-by: Kevin Su <[email protected]>
* helm-chart/ray-cluster: allow head autoscaling (#443) Also allow setting rayVersion Signed-off-by: Christos Kotsis <[email protected]>
* Disable async serve handler in Ray Service cluster (#447)
* Add wget timeout to probes (#448)
* Enable tests against release-0.3 branch
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Dmitri Gekhtman <[email protected]>
Signed-off-by: Christos Kotsis <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Co-authored-by: bruce <[email protected]>
Co-authored-by: Dmitri Gekhtman <[email protected]>
Co-authored-by: Christos Kotsis <[email protected]>
Co-authored-by: Yi Cheng <[email protected]>
Co-authored-by: Wilson Wang <[email protected]>
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request on Sep 24, 2023
Why are these changes needed?
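If the Ray job name is too long, the worker node tries to connect to the wrong head node service.
The service name is l94lnxh5ktptbgzsfnj-n0-0-raycluster-xg6hj-head-svc, but the worker node tries to connect to al94lnxh5ktptbgzsfnj-n0-0-raycluster-xg6hj-head-svc:6379. The two names differ only in the leading "a", which is consistent with the service name having been shortened to the 50-character limit in CheckName while the address used by the worker was not.
Related issue number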
Checks