
Fix Helm chart default configuration #530

Merged · 5 commits · Sep 2, 2022

Conversation


@kevin85421 kevin85421 commented Sep 2, 2022

Why are these changes needed?

There are three problems in the current Helm chart default configuration.

  • [Q1] There is a mismatch between the labels on the ray-head pod and the selector in the ray-head ClusterIP service. See [Bug] The mismatched labels configured in RayCluster Helm chart #495 for more details.

    • [Solution]: I updated the value of nameOverride from ray to kuberay, as suggested by @jeethridge.
  • [Q2] After solving Q1, I found that the dashboard could not be launched successfully. The root cause is that dashboard_agent rounds the CPU request (200m) down to 0.

    • [Solution]: Increase the CPU request from 200m to 1.
  • [Q3] After solving Q2, the dashboard still could not be rendered in my browser. The root cause is that the image rayproject/ray:latest points to a very old version. Thanks to @hckuo for pointing this out!

    • [Solution]: Updated the image.
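Q2's root cause can be illustrated with a small sketch. The function below is hypothetical (it is not KubeRay or Ray code); it only demonstrates how truncating a fractional Kubernetes CPU quantity to an integer turns 200m into 0 CPUs:

```python
def millicores_to_cpus(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity (e.g. '200m' or '1') to whole
    CPUs, truncating toward zero -- the problematic rounding from Q2."""
    if quantity.endswith("m"):
        # 200m -> 200 / 1000 = 0.2 -> int(0.2) = 0
        return int(int(quantity[:-1]) / 1000)
    return int(quantity)

print(millicores_to_cpus("200m"))  # 0 -> a component that truncates sees 0 CPUs
print(millicores_to_cpus("1"))     # 1 -> requesting a full CPU avoids the truncation
```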

Related issue number

#495

Checks

# Step 1: Install the operator
kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/crd?ref=v0.3.0&timeout=90s"
helm install kuberay-operator --namespace ray-system --create-namespace $(curl -s https://api.github.com/repos/ray-project/kuberay/releases/latest | grep '"browser_download_url":' | sort | grep -om1 'https.*helm-chart-kuberay-operator.*tgz')

# Step 2: Install ray-cluster
cd helm-chart/ray-cluster
helm install ray-cluster --namespace ray-system --create-namespace .

# Step 3: Port-forward the head service
kubectl port-forward -n ray-system services/ray-cluster-kuberay-head-svc 8265:8265

# Step 4: Open 127.0.0.1:8265 in your browser to check the dashboard. I added a screenshot of the dashboard below.

# Step 5: Submit a job to your Ray cluster. The result is shown in the second screenshot below.
ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
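The three fixes verified by the steps above correspond, roughly, to the following values.yaml fields. This is a sketch only: the field names and the pinned image tag are assumptions based on common Helm chart conventions, not the exact diff of this PR.

```yaml
# Sketch of the relevant ray-cluster values.yaml fields (names and tag
# assumed, not the exact diff of this PR).
nameOverride: kuberay      # Q1: was "ray"; must match the hardcoded service selector
image:
  repository: rayproject/ray
  tag: 2.0.0               # Q3: pin a recent release instead of "latest" (tag assumed)
head:
  resources:
    requests:
      cpu: "1"             # Q2: was 200m; fractional CPUs were rounded down to 0
```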

[Screenshot: Ray dashboard, 2022-09-02]

[Screenshot: job submission result, 2022-09-01]

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 changed the title from "(WIP) Helm" to "Fix Helm chart default configuration" on Sep 2, 2022
@kevin85421 (Member Author) commented:

cc @DmitriGekhtman

@kevin85421 kevin85421 marked this pull request as ready for review September 2, 2022 23:04
@DmitriGekhtman (Collaborator) left a comment:


This is great. Thanks for fixing this!

@DmitriGekhtman DmitriGekhtman merged commit ad104f9 into ray-project:master Sep 2, 2022
DmitriGekhtman pushed a commit that referenced this pull request Sep 16, 2022
… for head node (#572)

As mentioned in #495 and #530, the value of nameOverride needs to be hardcoded to "kuberay" to avoid a mismatch between the head pod and the head service.

The value of app.kubernetes.io/name in the head pod's labels is the same as the app.kubernetes.io/name specified in RayCluster.Spec.HeadGroupSpec.Template.ObjectMeta.Labels of the RayCluster custom resource YAML file. See the function labelPod in pod.go for more details.

RayCluster.Spec.HeadGroupSpec.Template.ObjectMeta.Labels["app.kubernetes.io/name"] is controlled by the value of nameOverride in the Helm chart.

However, the value of app.kubernetes.io/name in the head pod service's selector is hardcoded to the constant ApplicationName (= "kuberay").
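The mismatch described above can be sketched as follows. The manifests are illustrative only, showing what happens when nameOverride is left at "ray":

```yaml
# Head pod labels, driven by the chart's nameOverride:
metadata:
  labels:
    app.kubernetes.io/name: ray        # from nameOverride: ray
---
# Head service selector, hardcoded in the operator (ApplicationName):
spec:
  selector:
    app.kubernetes.io/name: kuberay    # never matches "ray", so the service has no endpoints
```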
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023