
[Bug] Head Node service is not getting exposed on Internal Load Balancer despite proper annotations #637

Closed
1 of 2 tasks
snow01 opened this issue Oct 16, 2022 · 10 comments
Assignees
Labels
bug Something isn't working P2 Important issue, but not time critical

Comments

snow01 commented Oct 16, 2022

Search before asking

  • I searched the issues and found no similar issues.

KubeRay Component

ray-operator

What happened + What you expected to happen

I am trying to expose the head node service as a LoadBalancer backed by an internal load balancer, but KubeRay still provisions a public LoadBalancer despite the annotations.

I have all the necessary components installed in my EKS cluster, including the AWS Load Balancer Controller, and I have a sample service that successfully exposes an internal LoadBalancer, so I know my annotations are correct.

For the Ray cluster setup, I have set the annotations both at the top level and within the pod template, hoping they will be picked up from one of them.

Please see the config snippet below.

Reproduction script

apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: ${RAY_CLUSTER}
  namespace: ${RAY_CLUSTER_NS}
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
    service.beta.kubernetes.io/aws-load-balancer-internal: "0.0.0.0/0"
    service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
spec:
  rayVersion: '2.0.0'
  enableInTreeAutoscaling: true

  autoscalerOptions:
    upscalingMode: Default
    idleTimeoutSeconds: 60
    imagePullPolicy: Always
    resources:
      limits:
        cpu: "1"
        memory: "2Gi"
      requests:
        cpu: "1"
        memory: "2Gi"
  ######################headGroupSpec#################################
  headGroupSpec:
    serviceType: LoadBalancer
    enableIngress: true

    rayStartParams:
      # Flag "no-monitor" will be automatically set when autoscaling is enabled.
      dashboard-host: '0.0.0.0'
      block: 'true'
      num-cpus: '0' # can be auto-completed from the limits
    template:
      metadata:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
          service.beta.kubernetes.io/aws-load-balancer-scheme: internal
          service.beta.kubernetes.io/aws-load-balancer-internal: "0.0.0.0/0"
          service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip
...

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@snow01 snow01 added the bug Something isn't working label Oct 16, 2022
@kevin85421 kevin85421 self-assigned this Oct 18, 2022
snow01 (Author) commented Oct 19, 2022

Any updates on this, please?

kevin85421 (Member) commented

Any updates on this, please?

Thank you for the follow-up! I started to look at the ingress support in KubeRay and ran into some questions. Currently, I am working on #646. That PR is a blocker for #637. Thank you for your patience.

kevin85421 (Member) commented

Hi @snow01,

I started to work on this issue by getting some context on the aws-load-balancer-controller, ALB, and NLB. I will try to reproduce this issue on my EKS cluster.

I have several questions about this issue.

I am trying to set up the head node service exposed as LoadBalancer but on an internal one, but KubeRay is still using public LoadBalancer despite giving proper annotations.

Q1: What is the definition of "internal one"? Do you mean aws-load-balancer-scheme: internal?
Q2: What is the definition of "public LoadBalancer"? Do you mean aws-load-balancer-scheme: internet-facing?

service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal
service.beta.kubernetes.io/aws-load-balancer-internal: "0.0.0.0/0"
service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip

Q3: Why is the value of aws-load-balancer-internal "0.0.0.0/0"? This doc indicates that the value should be a boolean. In addition, it was deprecated in favor of aws-load-balancer-scheme. The following is the deprecation note in the doc. Maybe this annotation is unnecessary?

  • This annotation is deprecated starting v2.2.0 release in favor of the new aws-load-balancer-scheme annotation. It will be supported, but in case of ties, the aws-load-balancer-scheme gets precedence.

Q4: The doc indicates that nlb-ip is deprecated. In addition, the backwards-compatibility note for the nlb-ip type in the doc says that nlb-ip is equivalent to aws-load-balancer-type: external plus aws-load-balancer-nlb-target-type: ip. Although there is no conflict, it may be better to use external instead.

  • For backwards compatibility, both the in-tree and AWS Load Balancer controllers support nlb-ip as a value of the service.beta.kubernetes.io/aws-load-balancer-type annotation. The controller treats it as if you specified both annotations below:

    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    

Q5: Why do we need enableIngress: true?

Q6: What's the output of kubectl describe service $YOUR_HEAD_SERVICE?
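
For reference, the head service name is typically derived from the cluster name (e.g. <cluster-name>-head-svc), so something like the following should show whether the annotations made it onto the service (substitute your actual service name and namespace):

    kubectl describe service ${RAY_CLUSTER}-head-svc -n ${RAY_CLUSTER_NS}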

Thank you!

kevin85421 (Member) commented

I took a look at the source code. The annotations will not be passed to the head service. We can double-check this with the answer to Q6 mentioned above. If so, I will update KubeRay to enable annotation propagation.

func BuildServiceForHeadPod(cluster rayiov1alpha1.RayCluster, labels map[string]string) (*corev1.Service, error) {
    if labels == nil {
        labels = make(map[string]string)
    }
    // Default labels identify the head service's owning cluster and role.
    default_labels := map[string]string{
        RayClusterLabelKey:                cluster.Name,
        RayNodeTypeLabelKey:               string(rayiov1alpha1.HeadNode),
        RayIDLabelKey:                     utils.CheckLabel(utils.GenerateIdentifier(cluster.Name, rayiov1alpha1.HeadNode)),
        KubernetesApplicationNameLabelKey: ApplicationName,
        KubernetesCreatedByLabelKey:       ComponentName,
    }
    for k, v := range default_labels {
        if _, ok := labels[k]; !ok {
            labels[k] = v
        }
    }
    // Note: the ObjectMeta below carries only a name, namespace, and labels;
    // no annotations from the RayCluster CR are copied onto the service.
    service := &corev1.Service{
        ObjectMeta: metav1.ObjectMeta{
            Name:      utils.GenerateServiceName(cluster.Name),
            Namespace: cluster.Namespace,
            Labels:    labels,
        },
        Spec: corev1.ServiceSpec{
            Selector: labels,
            Ports:    []corev1.ServicePort{},
            Type:     cluster.Spec.HeadGroupSpec.ServiceType,
        },
    }
    ports := getServicePorts(cluster)
    for name, port := range ports {
        svcPort := corev1.ServicePort{Name: name, Port: port}
        service.Spec.Ports = append(service.Spec.Ports, svcPort)
    }
    return service, nil
}
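
For illustration, here is a minimal sketch of what the propagation could look like. The helper below is hypothetical, not the actual KubeRay patch; it merges the RayCluster's metadata annotations with the head group pod template's annotations (the template wins ties), and the result would be assigned to the service's ObjectMeta in BuildServiceForHeadPod above:

// buildHeadServiceAnnotations is a hypothetical helper sketching how the
// operator could propagate annotations onto the head service.
func buildHeadServiceAnnotations(cluster rayiov1alpha1.RayCluster) map[string]string {
    annotations := make(map[string]string)
    // Start with RayCluster-level annotations (metadata.annotations in the CR).
    for k, v := range cluster.Annotations {
        annotations[k] = v
    }
    // Annotations on the head group's pod template override cluster-level ones.
    for k, v := range cluster.Spec.HeadGroupSpec.Template.Annotations {
        annotations[k] = v
    }
    return annotations
}

The service construction would then set Annotations: buildHeadServiceAnnotations(cluster) in its ObjectMeta.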

kevin85421 (Member) commented

Gentle ping @snow01

snow01 (Author) commented Oct 25, 2022

Yes, the main issue is that annotations are not getting passed to the head service.

snow01 (Author) commented Oct 25, 2022

Q1: What is the definition of "internal one"? Do you mean aws-load-balancer-scheme: internal?

Yes.

Q2: What is the definition of "public LoadBalancer"? Do you mean aws-load-balancer-scheme: internet-facing?

Yes.

Q3: Why the value of aws-load-balancer-internal is "0.0.0.0/0"? This doc indicates that the value should be boolean. In addition, it was deprecated in favor of aws-load-balancer-scheme. The following is the deprecation note in doc. Maybe this annotation is unnecessary?

You are right, this should not matter.

Q4: The doc indicate that nlb-ip is deprecated. In addition, the backwards compatibility note for nlb-ip type in the doc said that nlb-ip is equivalent to aws-load-balancer-type: external and aws-load-balancer-nlb-target-type: ip. Although there is no conflict, maybe it is better to use external instead.

This is also deprecated, so it doesn't matter.

The following annotations are the ones that should matter:

service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal
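
For reference, a standalone Service shaped roughly like the following (the name, selector, and ports are hypothetical) is the kind of sample service that successfully gets an internal NLB once the AWS Load Balancer Controller is installed:

apiVersion: v1
kind: Service
metadata:
  name: sample-internal-svc   # hypothetical name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
spec:
  type: LoadBalancer
  selector:
    app: sample-app           # hypothetical selector
  ports:
    - port: 80
      targetPort: 8080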

kevin85421 (Member) commented

Thanks @snow01 for your reply!

I have added this issue to today's KubeRay community sync agenda. If you are interested in the discussion, you are welcome to join the community sync!

(1) Agenda
(2) Google Meet

snow01 (Author) commented Oct 29, 2022

Let me try this as per the doc and confirm back. Thanks!

@DmitriGekhtman DmitriGekhtman added the P2 Important issue, but not time critical label Nov 4, 2022
@DmitriGekhtman DmitriGekhtman added this to the v0.5.0 release milestone Nov 4, 2022
kevin85421 (Member) commented

KubeRay now exposes the entire head service to users (#1040). Everything should be configurable now.
