[Feature] Support for overwriting the generated ray start command with a user-specified container command #1704

Merged: 3 commits merged into ray-project:master on Dec 4, 2023

@kevin85421 (Member) commented Dec 2, 2023

Why are these changes needed?

  • Add a new environment variable KUBERAY_GEN_RAY_START_CMD for both head and worker Pods to store the ray start command generated by KubeRay.
  • If users add the annotation ray.io/overwrite-container-cmd: "true" to a RayCluster, KubeRay will respect the container command/args provided by users.
    • In that case, users must specify pieces such as ulimit -n 65536 and ["/bin/bash", "-lc", "--"] themselves.
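
To make the two bullets above concrete, here is a minimal sketch of a RayCluster that opts in to the annotation. The annotation key, the KUBERAY_GEN_RAY_START_CMD variable, the ulimit call, and the bash command come from this PR's description. The apiVersion, resource names, image tag, rayVersion, and replica count are illustrative assumptions and may differ from the ray-cluster.overwrite-command.yaml sample added by the PR.

```yaml
# Illustrative sketch only; not the exact sample shipped with this PR.
apiVersion: ray.io/v1                    # assumption: match your installed CRD version
kind: RayCluster
metadata:
  name: raycluster-overwrite-command     # hypothetical name
  annotations:
    # Ask KubeRay to respect the container command/args given below.
    ray.io/overwrite-container-cmd: "true"
spec:
  rayVersion: "2.8.0"                    # assumption
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.8.0  # assumption
            # With the annotation set, the shell wrapper and the ulimit
            # call have to be written out by the user.
            command: ["/bin/bash", "-lc", "--"]
            # KUBERAY_GEN_RAY_START_CMD holds the ray start command that
            # KubeRay generated for this Pod; the shell expands it at runtime.
            args: ["ulimit -n 65536; $KUBERAY_GEN_RAY_START_CMD"]
  workerGroupSpecs:
    - groupName: small-group             # hypothetical name
      replicas: 1
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.8.0  # assumption
              command: ["/bin/bash", "-lc", "--"]
              args: ["ulimit -n 65536; $KUBERAY_GEN_RAY_START_CMD"]
```

Because the variable is expanded by the shell rather than by Kubernetes, any extra setup steps can presumably be placed before it in the same args string.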

Related issue number

Closes #1560

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(
  • Build this PR, deploy the KubeRay operator, and create a CR from ray-cluster.overwrite-command.yaml.
    • head Pod: check Command, Args, and environment variable KUBERAY_GEN_RAY_START_CMD.
      Screen Shot 2023-12-01 at 10 16 13 PM
    • worker Pod: check Command, Args, and environment variable KUBERAY_GEN_RAY_START_CMD.
      Screen Shot 2023-12-01 at 10 16 41 PM

@kevin85421 kevin85421 changed the title WIP [Feature] Support for overwriting the generated ray start command with a user-specified container command Dec 2, 2023
@kevin85421 kevin85421 marked this pull request as ready for review December 2, 2023 06:25
@kevin85421 (Member, Author) commented:

cc @kevingreer @findmyway

@architkulkarni (Contributor) left a comment:

Code and tests look good!

@@ -0,0 +1,60 @@
# This example config is used to describe how does the annotation "ray.io/overwrite-container-cmd" work.
# See kuberay/docs/guidance/pod-command.md for more details.

@architkulkarni (Contributor) commented:

Suggested change
# See kuberay/docs/guidance/pod-command.md for more details.
# See kuberay/docs/guidance/pod-command.md for more details.

Is this link still accurate? More generally, what's the plan to document this feature? (I guess we'll add a section in the doc in the Ray repo?)

@kevin85421 (Member, Author) replied:

Good catch!

@kevin85421 (Member, Author) replied:

> More generally, what's the plan to document this feature? (I guess we'll add a section in the doc in the Ray repo?)

Yes, I will update the doc later.

@kevin85421 (Member, Author) replied:

Updated 5ad6b08.

# Because the annotation "ray.io/overwrite-container-cmd" is set to "true",
# KubeRay will overwrite the generated container command with `command` and
# `args` in the following. Hence, you need to specify the `ulimit` command
# by yourself to avoid Ray scalability issues.

@architkulkarni (Contributor) commented:

I think it will be helpful for the user if we remind them what $KUBERAY_GEN_RAY_START_CMD means here, even though you already described it in the code.

@kevin85421 (Member, Author) replied:

Updated 5ad6b08.

@kevin85421 kevin85421 merged commit a45e4ab into ray-project:master Dec 4, 2023
25 checks passed
architkulkarni added a commit to ray-project/ray that referenced this pull request Dec 8, 2023
Update doc for ray-project/kuberay#1704.

---------

Signed-off-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
Co-authored-by: Archit Kulkarni <[email protected]>
Development

Successfully merging this pull request may close these issues.

[Feature] Expose generated ray start command for worker and head nodes.
2 participants