[runtime env] [Doc] Add concepts and basic workflows #20222

Merged 23 commits on Nov 19, 2021.
12 changes: 10 additions & 2 deletions doc/examples/doc_code/runtime_env_example.py
@@ -49,10 +49,18 @@ def f():
     pass
 
 @ray.remote
-class Actor:
+class A:
     pass
 
 # __per_task_per_actor_start__
 f.options(runtime_env=runtime_env).remote()
-Actor.options(runtime_env=runtime_env).remote()
+actor = A.options(runtime_env=runtime_env).remote()
+
+@ray.remote(runtime_env=runtime_env)
+def g():
+    pass
+
+@ray.remote(runtime_env=runtime_env)
+class B:
+    pass
 # __per_task_per_actor_end__
2 changes: 1 addition & 1 deletion doc/source/actors.rst
@@ -313,7 +313,7 @@ sufficient CPU resources and the relevant custom resources.
 .. tip::
 
     Besides compute resources, you can also specify an environment for an actor to run in,
-    which can include Python packages, local files, environment variables, and more--see :ref:`Runtime Environments <runtime-environments>` for details.
+    which can include Python packages, local files, environment variables, and more---see :ref:`Runtime Environments <runtime-environments>` for details.
 
 
 Terminating Actors
1 change: 1 addition & 0 deletions doc/source/cluster/commands.rst
@@ -155,6 +155,7 @@ run ``ray attach --help``.
     # Attach to tmux session on cluster (creates a new one if none available)
     $ ray attach cluster.yaml --tmux
 
+.. _ray-rsync:
 
 Synchronizing files from the cluster (``ray rsync-up/down``)
 ------------------------------------------------------------
127 changes: 0 additions & 127 deletions doc/source/dependency-management.rst

This file was deleted.

216 changes: 216 additions & 0 deletions doc/source/handling-dependencies.rst
@@ -0,0 +1,216 @@
.. _handling_dependencies:

Handling Dependencies
=====================

Your Ray application may depend on environment variables, files, and Python packages.
Ray provides two features to specify these dependencies when working with a remote cluster: runtime environments, and the Ray cluster launcher commands.
[Review comment from @richardliaw, Nov 13, 2021] Suggested change:

- Ray provides two features to specify these dependencies when working with a remote cluster: Runtime environments, and the Ray cluster launcher commands
+ Ray provides two features to specify these dependencies when working with a Ray cluster: :ref:`runtime Environments<runtime-environments>`, and the :ref:`Ray cluster launcher commands <INSERT THE RIGHT LINK>`.
With these features, you no longer need to manually SSH into your cluster and set up your environment.
[Review comment from a Contributor] A little confused by this sentence. We don't need to manually SSH into the cluster for the existing solution now, right? (It is handled by the setup commands.)

[Reply from the author] Yeah, I meant to include setup commands in "Ray cluster launcher commands", which this doc describes as an existing feature. Let me make this more clear.

[Review comment from a Contributor] I think we should start from the highest-level problem and work down to low-level options here. Maybe we can describe it this way instead:

1. What's the environment in Ray?
2. Why does the environment matter in Ray?

And then we can say: there are 2 ways to set up your Ray environment (e.g., files, environment variables, Python package dependencies, system dependencies, etc.):

1. Set up the same environment across machines. This is the most common way to configure environments in Ray. You can use the autoscaler's setup commands or Docker container deployment. All Ray tasks and actors will use the same environment, since all machines are configured identically. Pro is X, con is Y (e.g., all jobs have to use the same environment).

2. Set up a per-job/task/actor environment. This is useful when X (e.g., Serve or a multi-tenant cluster). In this case you can use the runtime environment API. Pro is X, con is Y.

[Review comment from a Contributor] I think we might need a section on how to set up the environment when Ray Client is used; a runtime environment can be a good solution there as well (or you should mention that the local machine and the remote cluster should have the same environment).

[Reply from the author] I agree this is useful to have in the docs. Maybe we can put them in a top-level page under "Multi-Node Ray" which then links to this runtime env page.

**Option 1.** You can specify dependencies dynamically at runtime in Python using :ref:`Runtime Environments<runtime-environments>`, described below.
This can be useful for

- quickly iterating on a project with changing dependencies and files, and

- running jobs, tasks, and actors with different dependencies, all on the same Ray cluster.

**Option 2.** Alternatively, you can prepare your Ray cluster's environment when your cluster nodes start up, and modify it later from the command line.
Packages can be installed using ``setup_commands`` in the Ray Cluster configuration file (:ref:`docs<cluster-configuration-setup-commands>`) and files can be pushed to the cluster using ``ray rsync_up`` (:ref:`docs<ray-rsync>`).
[Review comment from the author] @rkooo567 I know we still need more here but I'm not quite sure what to put, do you have any ideas?

[Reply from a Contributor] I think we should ask the autoscaler team to fill it up.

[Review comment from a Contributor] I think the common problems are:

Manual:

- Link to the autoscaler section that describes how to set up deps
- env variables (setup commands)
- System deps (setup commands)
- Files (rsync up, or manually copy and paste; make sure they are all synced)
- Python packages (setup commands)

Container:

- Same things (link to container deployment)
Concepts
--------

- **Local machine** and **Cluster**. The recommended way to connect to a remote Ray cluster is to use :ref:`Ray Client<ray-client>`, and we will call the machine running Ray Client your *local machine*. Note: you can also start a single-node Ray cluster on your local machine---in this case your Ray cluster is not really “remote”, but any comments in this documentation referring to a “remote cluster” will also apply to this setup.
[Review comment from a Contributor] Is it true that Ray Client is the recommended way? AFAIK, it is a lot less stable to use Ray Client now than directly submitting the driver.

[Reply from the author] I got that from https://docs.ray.io/en/latest/cluster/guide.html#deploying-an-application: "The recommended way of connecting to a Ray cluster is to use the ray.init("ray://:") API and connect via the Ray Client." I'm not sure which is more stable, but you're right that we should be clear about which one is recommended.
- **Files**: These are the files that your Ray application needs to run. These can include code files or data files. For a development workflow, these might live on your local machine, but when it comes time to run things at scale, you will need to get them to your remote cluster. For how to do this, see :ref:`Workflow: Local Files<workflow-local-files>` below.

- **Packages**: These are external libraries or executables required by your Ray application, often installed via ``pip`` or ``conda``.

.. _runtime-environments:

Runtime Environments
--------------------

.. note::

    This feature requires a full installation of Ray using ``pip install "ray[default]>=1.4"``, and is currently only supported on macOS and Linux.

A **runtime environment** describes the dependencies your Ray application needs to run, including files, packages, environment variables, and more. It is specified by a Python ``dict``.

By using runtime environments to specify all your dependencies, you can seamlessly move from running your Ray application on your local machine to running it on a remote cluster, without any code changes or manual setup.
[Review comment from a Contributor] Maybe this is due to my lack of understanding of runtime environments, but if you run the driver on a head node, isn't it going to be the same (like specifying the runtime env on the job config)?

[Reply from the author] You're right, it's the same, but you would need to manually set up the files and dependencies on all the worker nodes. I'll make this more clear.
Here are a few examples of runtime environments (for full details, see the :ref:`API Reference<runtime-environments-api-ref>` below):

..
  TODO(architkulkarni): run working_dir doc example in CI

.. code-block:: python

    runtime_env = {"working_dir": "/data/my_files", "pip": ["requests", "pendulum==2.1.2"]}

.. literalinclude:: ../examples/doc_code/runtime_env_example.py
  :language: python
  :start-after: __runtime_env_conda_def_start__
  :end-before: __runtime_env_conda_def_end__

Specifying a Runtime Environment Per-Job
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can specify a runtime environment for your whole job, whether running a script directly on the cluster or using :ref:`Ray Client<ray-client>`:

.. literalinclude:: ../examples/doc_code/runtime_env_example.py
  :language: python
  :start-after: __ray_init_start__
  :end-before: __ray_init_end__

..
  TODO(architkulkarni): run Ray Client doc example in CI

.. code-block:: python

    # Running on a local machine, connecting to remote cluster using Ray Client
    ray.init("ray://123.456.7.89:10001", runtime_env=runtime_env)

.. note::

    This will eagerly install the environment when ``ray.init()`` is called. To disable this behavior, add ``"eager_install": False`` to the ``runtime_env``; the environment will then only be installed when a task is invoked or an actor is created.
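For example (a sketch: the pip package and cluster address are illustrative):

```python
# A runtime_env that defers installation until first use.
runtime_env = {
    "pip": ["requests"],
    # Install only when a task is invoked or an actor is created,
    # instead of eagerly at ray.init() time.
    "eager_install": False,
}
# ray.init("ray://123.456.7.89:10001", runtime_env=runtime_env)  # hypothetical address
```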

Specifying a Runtime Environment Per-Task or Per-Actor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can specify different runtime environments per-actor or per-task using ``.options()`` or the ``@ray.remote()`` decorator:

.. literalinclude:: ../examples/doc_code/runtime_env_example.py
  :language: python
  :start-after: __per_task_per_actor_start__
  :end-before: __per_task_per_actor_end__

This allows you to have actors and tasks running in their own environments, independent of the surrounding environment. (The surrounding environment could be the job's runtime environment, or the base environment of the cluster.)

.. _workflow-local-files:

Workflow: Local files
^^^^^^^^^^^^^^^^^^^^^
Your Ray application might depend on source files or data files.
The following simple example explains how to get your local files on the cluster.

.. code-block:: python

    import ray

    # /path/to/files is a directory on the local machine.
    # /path/to/files/hello.txt contains the string "Hello World!"
    ray.init(runtime_env={"working_dir": "/path/to/files"})

    @ray.remote
    def f():
        return open("hello.txt").read()

    print(ray.get(f.remote()))  # Hello World!

The example above starts a single-node Ray cluster on your local machine, but by specifying an address (e.g. ``ray.init("ray://123.456.7.89:10001", runtime_env=...)``) to connect to a remote cluster using Ray Client, the same code will still work, and ``f`` will run on the remote cluster.

The specified local directory will automatically be pushed to the cluster nodes when ``ray.init()`` is called.

Workflow: Ray Library development
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Suppose you are developing a library ``my_module`` on Ray.
A typical iteration cycle will involve making some changes to the source code of ``my_module`` and then running a Ray script to test the changes. Perhaps your test requires a remote cluster. To ensure your local changes show up on the cluster, you can use the ``py_modules`` field of runtime environments.

.. code-block:: python

    import ray
    import my_module

    ray.init("ray://123.456.7.89:10001", runtime_env={"py_modules": [my_module]})

    @ray.remote
    def test_my_module():
        # No need to import my_module inside this function.
        my_module.test()

    ray.get(test_my_module.remote())

.. _runtime-environments-api-ref:

API Reference
^^^^^^^^^^^^^

The ``runtime_env`` is a Python dictionary including one or more of the following keys:

- ``working_dir`` (str): Specifies the working directory for the Ray workers. This must either be an existing directory on the local machine with total size at most 100 MiB, or a path to a zip file stored in Amazon S3.
  If a local directory is specified, it will be uploaded to each node on the cluster.
  Ray workers will be started in their node's copy of this directory.

  - Examples

    - ``"."  # cwd``

    - ``"/src/my_project"``

    - ``"s3://path/to/my_dir.zip"``

  Note: Setting a local directory per-task or per-actor is currently unsupported.

  Note: If your local directory contains a ``.gitignore`` file, the files and paths specified therein will not be uploaded to the cluster.

- ``py_modules`` (List[str|module]): Specifies Python modules to import in the Ray workers. (For more ways to specify packages, see also the ``pip`` and ``conda`` fields below.)
  Each entry must be either (1) a path to a local directory, (2) a path to a zip file stored in Amazon S3, or (3) a Python module object.

  - Examples

    - ``"."``

    - ``"/src/my_module"``

    - ``"s3://path/to/my_module.zip"``

    - ``my_module  # Assumes my_module has already been imported, e.g. via 'import my_module'``

  Note: Setting options (1) and (3) per-task or per-actor is currently unsupported.
[Review comment from @rkooo567, Nov 14, 2021] Maybe have a separate section to explain which APIs are supported per-job and which per-task/actor? Like:

Supported APIs:

Jobs:

- working dir
- conda env
- py_modules...

Per-task/actor:

- conda env
  Note: For option (1), if your local directory contains a ``.gitignore`` file, the files and paths specified therein will not be uploaded to the cluster.

- ``excludes`` (List[str]): When used with ``working_dir`` or ``py_modules``, specifies a list of files or paths to exclude from being uploaded to the cluster.
  This field also supports the pattern-matching syntax used by ``.gitignore`` files: see `<https://git-scm.com/docs/gitignore>`_ for details.

  - Example: ``["my_file.txt", "path/to/dir", "*.log"]``
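A sketch combining ``working_dir`` with ``excludes`` (the paths and patterns here are illustrative, not from the source):

```python
# Upload the current directory, but skip log files and a local data folder.
runtime_env = {
    "working_dir": ".",
    "excludes": ["*.log", "data/", "tests/my_file.txt"],
}
```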

- ``pip`` (List[str] | str): Either a list of pip `requirements specifiers <https://pip.pypa.io/en/stable/cli/pip_install/#requirement-specifiers>`_, or a string containing the path to a pip
  `"requirements.txt" <https://pip.pypa.io/en/stable/user_guide/#requirements-files>`_ file.
  This will be installed in the Ray workers at runtime.
  To use a library like Ray Serve or Ray Tune, you will need to include ``"ray[serve]"`` or ``"ray[tune]"`` here.

  - Example: ``["requests==1.0.0", "aiohttp", "ray[serve]"]``

  - Example: ``"./requirements.txt"``

- ``conda`` (dict | str): Either (1) a dict representing the conda environment YAML, (2) a string containing the path to a
  `conda "environment.yml" <https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#create-env-file-manually>`_ file,
  or (3) the name of a local conda environment already installed on each node in your cluster (e.g., ``"pytorch_p36"``).
  In the first two cases, the Ray and Python dependencies will be automatically injected into the environment to ensure compatibility, so there is no need to manually include them.
  Note that the ``conda`` and ``pip`` keys of ``runtime_env`` cannot both be specified at the same time---to use them together, please use ``conda`` and add your pip dependencies in the ``"pip"`` field in your conda ``environment.yaml``.

  - Example: ``{"dependencies": ["pytorch", "torchvision", "pip", {"pip": ["pendulum"]}]}``

  - Example: ``"./environment.yml"``

  - Example: ``"pytorch_p36"``


- ``env_vars`` (Dict[str, str]): Environment variables to set.

  - Example: ``{"OMP_NUM_THREADS": "32", "TF_WARNINGS": "none"}``

- ``eager_install`` (bool): Indicates whether to install the runtime env at ``ray.init()`` time, before the workers are leased. This flag is set to ``True`` by default.
  Currently, specifying this option per-actor or per-task is not supported.

  - Example: ``{"eager_install": False}``

Once set, the runtime environment is inheritable: it applies to all tasks/actors within a job and to all child tasks/actors of a task or actor.

If a child actor or task specifies a new ``runtime_env``, it will be merged with the parent's ``runtime_env`` via a simple dict update.
For example, if ``runtime_env["pip"]`` is specified, it will override the ``runtime_env["pip"]`` field of the parent.
The one exception is the field ``runtime_env["env_vars"]``, which will be `merged` with the ``runtime_env["env_vars"]`` dict of the parent.
This allows environment variables set in the parent's runtime environment to be automatically propagated to the child, even if new environment variables are set in the child's runtime environment.
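These merging semantics can be modeled in plain Python (a simplified sketch of the behavior described above, not Ray's actual implementation):

```python
def merge_runtime_envs(parent, child):
    """Simplified model: child fields override parent fields wholesale,
    except env_vars, which are merged key-by-key (child wins on conflicts)."""
    result = {**parent, **child}
    env_vars = {**parent.get("env_vars", {}), **child.get("env_vars", {})}
    if env_vars:
        result["env_vars"] = env_vars
    return result

parent = {"pip": ["requests"], "env_vars": {"A": "1", "B": "2"}}
child = {"pip": ["pendulum"], "env_vars": {"B": "3"}}
merged = merge_runtime_envs(parent, child)
# The child's pip list replaces the parent's entirely,
# while env_vars are combined: {"A": "1", "B": "3"}
```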

Here are some examples of runtime environments combining multiple options:
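For instance, a runtime environment combining files, packages, and environment variables might look like the following (a sketch: the paths and package versions are placeholders):

```python
# Combine a working directory, pip packages, and environment variables.
runtime_env = {
    "working_dir": "/data/my_files",
    "pip": ["requests", "pendulum==2.1.2"],
    "env_vars": {"OMP_NUM_THREADS": "32"},
}
```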

2 changes: 1 addition & 1 deletion doc/source/serve/core-apis.rst
@@ -279,7 +279,7 @@ is set. In particular, it's also called when new replicas are created in the
 future if scale up your deployment later. The `reconfigure` method is also called
 each time `user_config` is updated.
 
-Dependency Management
+Handling Dependencies
 =====================
 
 Ray Serve supports serving deployments with different (possibly conflicting)