[runtime env] [Doc] Add concepts and basic workflows #20222
Conversation
Hey @architkulkarni can you add a PR description?

Also, can I push directly?

Adding one now, and yeah feel free to push.
doc/source/handling-dependencies.rst
Outdated
- for running jobs, tasks and actors with different dependencies, all on the same Ray cluster.

**Option 2.** Alternatively, you can prepare your Ray cluster's environment when your cluster nodes start up, and modify it later from the command line.
Packages can be installed using ``setup_commands`` in the Ray Cluster configuration file (:ref:`docs<cluster-configuration-setup-commands>`) and files can be pushed to the cluster using ``ray rsync_up`` (:ref:`docs<ray-rsync>`).
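(Editorial aside, not part of the diff: to make **Option 1** vs. **Option 2** concrete, here is a minimal hedged sketch of the runtime environments approach in Python. The package name and version are placeholders, not from the PR.)

```python
import ray

# Option 1: declare the job's environment at connection time.
# Ray installs these dependencies on the cluster for this job only,
# so other jobs on the same cluster are unaffected.
ray.init(runtime_env={
    "working_dir": ".",           # ship the current directory's files to the cluster
    "pip": ["requests==2.26.0"],  # placeholder package and version
})

@ray.remote
def check_env():
    import requests  # available because of the runtime_env above
    return requests.__version__

print(ray.get(check_env.remote()))
```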
@rkooo567 I know we still need more here, but I'm not quite sure what to put. Do you have any ideas?
I think we should ask the autoscaler team to fill it in.
I think common problems are:

Manual:
- Link to autoscaler section that describes how to set up deps
- Env variables (setup commands)
- System deps (setup commands)
- Files (rsync up, or manually copy and paste; make sure they are all synced)
- Python packages (setup commands)

Container:
- Same things (link to container deployment)
doc/source/handling-dependencies.rst
Outdated
=====================

Your Ray application may depend on environment variables, files, and Python packages.
Ray provides two features to specify these dependencies when working with a remote cluster: Runtime environments, and the Ray cluster launcher commands
Suggested change:
Old: Ray provides two features to specify these dependencies when working with a remote cluster: Runtime environments, and the Ray cluster launcher commands
New: Ray provides two features to specify these dependencies when working with a Ray cluster: :ref:`runtime Environments<runtime-environments>`, and the :ref:`Ray cluster launcher commands <INSERT THE RIGHT LINK>`.
doc/source/handling-dependencies.rst
Outdated
Your Ray application may depend on environment variables, files, and Python packages.
Ray provides two features to specify these dependencies when working with a remote cluster: Runtime environments, and the Ray cluster launcher commands
With these features, you no longer need to manually SSH into your cluster and set up your environment.
A little confused by this sentence. We don't need to manually SSH into the cluster with the existing solution now, right? (It is handled by the setup commands.)
Yeah, I meant to include setup commands in "Ray cluster launcher commands", which this doc describes as an existing feature. Let me make this clearer.
Your Ray application may depend on environment variables, files, and Python packages.
Ray provides two features to specify these dependencies when working with a remote cluster: Runtime environments, and the Ray cluster launcher commands
With these features, you no longer need to manually SSH into your cluster and set up your environment.
I think we should start from the highest-level problem and work down to low-level options here. Maybe we can describe it in this way instead?

- What's the environment in Ray?
- Why does the environment matter in Ray?

And then we can say: there are 2 ways to set up your Ray environment (e.g., files, environment variables, Python package dependencies, system dependencies, etc.):
1. Set up the same environment across machines. This is the most common way to configure environments in Ray. You can use the autoscaler's setup commands or docker container deployment. Blah blah... All Ray tasks and actors will use the same environment, as all machines are configured with the same environment. Pro is X, con is Y (e.g., all jobs have to use the same environment).

2. Set up a per-job/task/actor environment. This is useful when X (e.g., Serve or a multi-tenant cluster). In this case you can use the runtime environment API, blah blah... Pro is X, con is Y.
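(Editorial aside, not from the thread: a minimal sketch of the second option above, per-task/actor environments via the runtime environment API. The package and environment variable are placeholders.)

```python
import ray

ray.init()  # the cluster-wide environment comes from setup commands or the container image

# Per-task environment: this task gets its own pip package,
# isolated from other tasks and actors on the same cluster.
@ray.remote(runtime_env={"pip": ["emoji"]})  # placeholder package
def hello():
    import emoji
    return emoji.emojize("Ray :thumbs_up:")

# Per-actor environments work the same way, e.g. with environment variables.
@ray.remote(runtime_env={"env_vars": {"MY_FLAG": "1"}})  # placeholder variable
class Worker:
    def flag(self):
        import os
        return os.environ["MY_FLAG"]

print(ray.get(hello.remote()))
w = Worker.remote()
print(ray.get(w.flag.remote()))
```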
I think we might need a section about how to set up the environment when Ray Client is used; runtime environments can be a good solution there as well (or you should mention that the local machine and the remote cluster should have the same environment).
I agree this is useful to have in the docs. Maybe we can put them in a top-level page under "Multi-Node Ray" which then links to this runtime env page.
doc/source/handling-dependencies.rst
Outdated
Concepts
--------

- **Local machine** and **Cluster**. The recommended way to connect to a remote Ray cluster is to use :ref:`Ray Client<ray-client>`, and we will call the machine running Ray Client your *local machine*. Note: you can also start a single-node Ray cluster on your local machine---in this case your Ray cluster is not really “remote”, but any comments in this documentation referring to a “remote cluster” will also apply to this setup.
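(Editorial aside, not part of the diff: a minimal sketch of the two setups this bullet distinguishes. The head node address is a placeholder; 10001 is Ray Client's default port.)

```python
import ray

# Case 1: a single-node cluster on your local machine.
# Here the "local machine" and the "cluster" are the same machine.
ray.init()

# Case 2 (run instead of Case 1): a remote cluster via Ray Client.
# Your local machine runs this script, while tasks and actors
# execute on the remote cluster.
# ray.init("ray://head-node-address:10001")  # placeholder address
```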
Is it true that Ray Client is the recommended way? AFAIK, it is a lot less stable to use Ray Client now than directly submitting the driver.
I got that from https://docs.ray.io/en/latest/cluster/guide.html#deploying-an-application: "The recommended way of connecting to a Ray cluster is to use the ray.init("ray://:") API and connect via the Ray Client."
I'm not sure which is more stable, but you're right that we should be clear about which one is recommended.
doc/source/handling-dependencies.rst
Outdated
- ``my_module # Assumes my_module has already been imported, e.g. via 'import my_module'``

Note: Setting options (1) and (3) per-task or per-actor is currently unsupported.
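(Editorial aside, not part of the diff: a minimal sketch of the ``py_modules`` usage the bullet above describes, passing an already-imported module object at the job level. ``my_module`` stands in for any local Python module, as in the diff.)

```python
import ray
import my_module  # placeholder: any importable local Python module

# Job-level only: per the note above, this form of py_modules
# is not supported per-task or per-actor.
ray.init(runtime_env={"py_modules": [my_module]})

@ray.remote
def use_module():
    import my_module  # importable on the cluster because it was shipped above
    return my_module.__name__

print(ray.get(use_module.remote()))
```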
Maybe have a separate section to explain which APIs are supported per job vs. per task/actor? Like:

Supported APIs:

Jobs:
- working dir
- conda env
- py_modules...

Per task/actor:
- conda env
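(Editorial aside, not from the thread: a minimal sketch of the per-task/actor row in that breakdown, assuming the ``conda`` field of the runtime_env API. The environment name is a placeholder and must already exist on every node.)

```python
import ray

ray.init()

# Per-task conda environment: "my-existing-env" is a placeholder for a
# conda environment already present on every node of the cluster.
@ray.remote(runtime_env={"conda": "my-existing-env"})
def which_env():
    import sys
    return sys.prefix  # reveals which environment the task ran in

print(ray.get(which_env.remote()))
```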
@@ -13,7 +13,7 @@ Finally, we've also included some content on using core Ray APIs with `Tensorflow`
   starting-ray.rst
   actors.rst
   namespaces.rst
-  dependency-management.rst
+  handling-dependencies.rst
What's the motivation for the name change here?
It was suggested in #19863 (comment). @richardliaw, is it because "Dependency Management" already has a meaning that's too specific?
I think the only remaining open question is how much to include about the cluster launcher approach (setup_commands, rsync, directly submitting the driver script with address=auto, etc.). There seem to be different opinions here, and it probably depends on what we want to promote as a best practice.

The current iteration of the PR doesn't mention the cluster launcher at all, but links to the Runtime Environments page from within "Multi-Node Ray > Ray Deployment Guide". I added some words in the cluster launcher section about environment variables and package installation.
That’s fine, I think we should limit this to mostly runtime envs as you have done so far, and start moving the rest of the docs towards this as the golden path.
Address followup comments from #19863:
- Add short "Concepts" section
- Add more section headings to break up the text
- Add "Workflow: Local Files" example
- Add "Workflow: Library development" example
Why are these changes needed?
Renaming the file made the diff hard to check. It might be easier to review by scanning through the Buildkite docs build, or you can just check this commit: 26676de.
TODO: Move new code samples to files that are tested in CI
Related issue number
Checks
- I've run `scripts/format.sh` to lint the changes in this PR.