[RLlib] Add separate learning rates for policy and `alpha` to SAC. #47078

…his improves learning a bit. Signed-off-by: simonsays1980 <[email protected]>

@sven1977

…d set 'lr' to 'None' as requested in @sven1977's review. Furthermore, changed all examples and tuning scripts. Signed-off-by: simonsays1980 <[email protected]>

Signed-off-by: Sven Mika <[email protected]>

…#47105)

…ses. (ray-project#47057) They were used to fetch and publish logs and errors, but they are now replaced by the Cython-bound `PythonGcsSubscriber` classes. Signed-off-by: Ruiyang Wang <[email protected]> Signed-off-by: Ruiyang Wang <[email protected]>

… submitter (ray-project#47109) Signed-off-by: Jiajun Yao <[email protected]>

…t#47115) This ensures that the source files that doc files depend on always get rebuilt correctly. --------- Signed-off-by: khluu <[email protected]>

) Split out `TestHTTPProxy` and `TestgRPCProxy` into a unit test file. Signed-off-by: Cindy Zhang <[email protected]>

…ay-project#47117) The current codebase includes `env_bool` and `env_integer` functions that directly convert environment variable strings into their respective types. To extend this functionality, we also need an `env_float` function to safely convert strings representing floating-point numbers into the `float` type. Signed-off-by: Hongpeng Guo <[email protected]>
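As an illustration of what such a helper might look like, here is a minimal sketch. The signature `env_float(key, default)` and the error message are assumptions made for this example; the actual function added in the PR may differ.

```python
import os


def env_float(key: str, default: float) -> float:
    """Return the environment variable `key` parsed as a float.

    Hypothetical sketch: falls back to `default` when the variable is unset
    and raises ValueError when the value cannot be parsed as a float.
    """
    raw = os.environ.get(key)
    if raw is None:
        return default
    try:
        return float(raw)
    except ValueError as exc:
        raise ValueError(
            f"Found {key} in environment, but value must be a float. Got: {raw!r}."
        ) from exc
```

Raising on a malformed value, rather than silently returning the default, makes a misconfigured environment visible at startup.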

## Why are these changes needed?  Fix a wrong variable name for a feature introduced in ray-project#46699, which caused progress bars to not show % progress / render the bar itself. After the changes in this PR, the progress bar shows % progress as desired: ![Screenshot at Aug 13 14-48-08](https://github.com/user-attachments/assets/f5fc5188-f33e-468c-a460-d3f115293e36) Signed-off-by: Scott Lee <[email protected]>

meaning it is tracking the latest version, so that we do not need to update the names of this one when we want to update the pyarrow version we are using. Signed-off-by: Lonnie Liu <[email protected]>

…ct#47121) Following up from ray-project#47082, we actually have 6 different data builds, with this matrix:

```
                python 3.9    python 3.12
arrow 6             X             X
arrow 17            X             X
arrow nightly       X             X
```

They all share the same build environment (https://github.com/ray-project/ray/blob/master/ci/docker/data.build.Dockerfile), but we have 6 configurations of these build environments given the above matrix.

This PR updates other flavors to use arrow 17 as well.

Test:
- CI

Signed-off-by: can <[email protected]>
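To make the count of 6 concrete, the matrix can be enumerated as a cross product. This is only an illustrative sketch; the names `PYTHON_VERSIONS`, `ARROW_VERSIONS`, and `data_build_configs` are hypothetical and do not come from the CI configuration.

```python
from itertools import product

# Axes of the build matrix described above.
PYTHON_VERSIONS = ["3.9", "3.12"]
ARROW_VERSIONS = ["6", "17", "nightly"]


def data_build_configs():
    """Return one dict per (arrow, python) combination in the matrix."""
    return [
        {"python": py, "arrow": arrow}
        for arrow, py in product(ARROW_VERSIONS, PYTHON_VERSIONS)
    ]


configs = data_build_configs()
print(len(configs))  # 3 arrow versions x 2 python versions
```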

…ide a task or actor Signed-off-by: Peter Nguyen <[email protected]>

…roject#47114) Add the remaining missing API references for RLlib. Now that all missing references are documented, we can also enable the API policy lint checker for RLlib. Test: - CI <img width="1351" alt="Screenshot 2024-08-13 at 12 15 08 PM" src="https://github.com/user-attachments/assets/cc1d1c8e-763e-4d2e-a7d1-28243a7fdbab"> Signed-off-by: can <[email protected]>

## Why are these changes needed?

Lights up the Ask AI button and adds a bounce animation to it when the search input is open.
- Used a CSS sibling selector to target the Ask AI button when the search input and overlay are open, adding a bounce, brightening the button by 50%, and raising its z-index so it is visible over the overlay.
- Added JS to ensure that search is closed when Ask AI is clicked.

https://www.loom.com/share/c84653be0b4a42debaa71f361c11dc42?sid=58e13baf-30bc-4524-a569-9d26af2d7175

Signed-off-by: cristianjd <[email protected]>

ray-project#47053) When a new set of `RunningReplicaInfos` is broadcast to a router, the nested actor handles are "empty" and don't hold the actor info (e.g. the actor address) needed to send a request to that replica. Upon the first request, the handle fetches that info from the GCS. If the GCS goes down immediately after a replica set change is broadcast to a router, all requests are blocked until the GCS recovers.

Fix:
- Upon receiving a new replica set, the router actively probes the queue lengths for each replica.
- On proxies, also push the proxy's own actor handle to replicas upon a replica set change; otherwise proxy requests to new replicas hang when the GCS is down.

Signed-off-by: Cindy Zhang <[email protected]>

…t#47095)

## Why are these changes needed?

During the execution of an operator, display whether or not an operator is backpressured. This will more easily surface to the viewer if there are portions of their dataset execution that are bottlenecking the other operations. When backpressured the display will change from `Map(g): 3 active, 22 queued, [cpu: 0.3, objects: 768.0MB]: : 0.00 row [02:14, ? row/s` to `Map(g): 3 active, 22 queued, BACKPRESSURED, [cpu: 0.3, objects: 768.0MB]: : 0.00 row [02:14, ? row/s`.

### Examples:

#### Typical example

```python
import ray
import time


def f(x):
    # Short per-row delay so both map stages keep making progress.
    time.sleep(0.1)
    return x


ray.data.range(1000).map(f).map(f, num_cpus=0.1).materialize()
```

https://github.com/user-attachments/assets/8283fdff-d94f-43c3-a9f0-c929924c441f

#### Backpressure example

```python
import ray
import time


def f(x):
    time.sleep(0.1)
    return x


def g(x):
    # Effectively never finishes, so the downstream stage stalls.
    time.sleep(10000000)
    return x


ray.data.range(1000).map(f).map(g, num_cpus=0.1).materialize()
```

https://github.com/user-attachments/assets/d263b8ee-8dd2-4ec6-bd45-c55941c45253

---------

Signed-off-by: Matthew Owen <[email protected]>
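The display change described in this commit can be sketched as a small formatting function. This is not Ray Data's implementation; `format_op_progress` and its parameters are hypothetical names chosen only to reproduce the prefix of the two progress strings quoted in the commit message.

```python
def format_op_progress(name, active, queued, cpu, object_store_mb, backpressured):
    """Build the operator summary shown before the progress bar.

    Hypothetical helper: inserts a BACKPRESSURED marker between the queue
    count and the resource usage when the operator is backpressured.
    """
    parts = [f"{active} active", f"{queued} queued"]
    if backpressured:
        parts.append("BACKPRESSURED")
    parts.append(f"[cpu: {cpu}, objects: {object_store_mb}MB]")
    return f"{name}: " + ", ".join(parts)


print(format_op_progress("Map(g)", 3, 22, 0.3, 768.0, backpressured=True))
```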

Add ray[adag] option to pip install

Currently we have no linting on any part of the docs code. This PR runs pre-commit on the train docs. This PR fixes the following issues:

```
trim trailing whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing doc/source/tune/api/doc/ray.air.integrations.mlflow.MLflowLoggerCallback.rst
Fixing doc/source/tune/api/suggestion.rst
Fixing doc/source/tune/faq.rst
Fixing doc/source/tune/tutorials/tune-resources.rst
Fixing doc/source/tune/api/doc/ray.air.integrations.comet.CometLoggerCallback.rst
Fixing doc/source/tune/tutorials/tune-lifecycle.rst
Fixing doc/source/tune/api/doc/ray.air.integrations.wandb.WandbLoggerCallback.rst

fix end of files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing doc/source/tune/api/doc/ray.air.integrations.mlflow.MLflowLoggerCallback.rst
Fixing doc/source/tune/examples/includes/pb2_example.rst
Fixing doc/source/tune/examples/includes/async_hyperband_example.rst
Fixing doc/source/tune/tutorials/tune-trial-checkpoints.rst
Fixing doc/source/tune/api/doc/ray.air.integrations.wandb.setup_wandb.rst
Fixing doc/source/tune/api/reporters.rst
Fixing doc/source/tune/examples/includes/mnist_pytorch.rst
Fixing doc/source/tune/examples/includes/hyperband_example.rst
Fixing doc/source/tune/api/trainable.rst
Fixing doc/source/tune/examples/includes/nevergrad_example.rst
Fixing doc/source/tune/api/doc/ray.air.integrations.comet.CometLoggerCallback.rst
Fixing doc/source/tune/examples/includes/logging_example.rst
Fixing doc/source/tune/api/doc/ray.air.integrations.mlflow.setup_mlflow.rst
Fixing doc/source/tune/examples/includes/pbt_tune_cifar10_with_keras.rst
Fixing doc/source/tune/examples/ml-frameworks.rst
Fixing doc/source/tune/examples/includes/xgboost_dynamic_resources_example.rst
Fixing doc/source/tune/api/doc/ray.air.integrations.wandb.WandbLoggerCallback.rst

check for added large files..............................................Passed
check python ast.........................................................Passed
check json...............................................................Passed
check toml...........................................(no files to check)Skipped
black....................................................................Passed
flake8...................................................................Passed
prettier.............................................(no files to check)Skipped
mypy.................................................(no files to check)Skipped
isort (python)...........................................................Passed
rst directives end with two colons.......................................Passed
rst ``inline code`` next to normal text..................................Passed
use logger.warning(......................................................Passed
check for not-real mock methods..........................................Passed
ShellCheck v0.9.0....................................(no files to check)Skipped
clang-format.........................................(no files to check)Skipped
Google Java Formatter................................(no files to check)Skipped
Check for Ray docstyle violations........................................Passed
Check for Ray import order violations....................................Passed
```

Signed-off-by: pdmurray <[email protected]>

Update `start_service` to allow configuration of `image_uri` instead of `cluster_env`. Also allow configuring `working_dir`. Signed-off-by: Cindy Zhang <[email protected]>

…each actor to avoid deadlocks caused by NCCL operations (ray-project#46911)

Generate an execution schedule for each actor. The schedule is a list of DAGNodeOperation.

Step 1: Generate a graph based on the following rules:
- Divide a DAG node into three GraphNodes: READ, COMPUTE, and WRITE. Each GraphNode has a DAGNodeOperation.
- Add edges between READ and COMPUTE, and between COMPUTE and WRITE, which belong to the same task.
- Add an edge between COMPUTE with bind_index i and COMPUTE with bind_index i+1 if they belong to the same actor.
- Add an edge between WRITE of the writer task and READ of the reader task.

Step 2: Topological sort: If there are multiple GraphNodes with zero in-degree, select one based on the following rules:
- (1) If the nodes are not NCCL write nodes, select the one with the smallest bind_index. If there are multiple candidate nodes with the smallest bind_index of the actors that they belong to, any one of them is acceptable. For the implementation details, we maintain a priority queue for each actor, where the peek of the priority queue is the node with the smallest bind_index.
- (2) If the node is an NCCL write node, select it only if all of its downstream nodes are also the peeks of their priority queues.
- (3) If (1) and (2) cannot be satisfied, it means that all candidate nodes are NCCL write nodes. In this case, select the one that is the peek of the priority queue and its downstream nodes, regardless of whether the downstream nodes are peeks of their priority queues or not.

Then, put the selected nodes into the corresponding actors' schedules.
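The graph construction and the basic selection rule can be sketched in plain Python. This is a simplified illustration, not the code from the PR: it implements Step 1 and only rule (1) of Step 2, so the NCCL-specific rules (2) and (3) are deliberately left out. The function name `build_schedules` and its input shapes are hypothetical, and bind_index values are assumed to be unique per actor.

```python
import heapq
from collections import defaultdict

READ, COMPUTE, WRITE = 0, 1, 2
OP_NAMES = {READ: "READ", COMPUTE: "COMPUTE", WRITE: "WRITE"}


def build_schedules(tasks, data_edges):
    """tasks: {task_id: (actor, bind_index)}; data_edges: [(writer, reader)].

    Returns {actor: [(task_id, op_name), ...]} in execution order.
    """
    succ = defaultdict(list)
    indeg = {(t, op): 0 for t in tasks for op in (READ, COMPUTE, WRITE)}

    def add_edge(src, dst):
        succ[src].append(dst)
        indeg[dst] += 1

    # Step 1a: READ -> COMPUTE -> WRITE within each task.
    for t in tasks:
        add_edge((t, READ), (t, COMPUTE))
        add_edge((t, COMPUTE), (t, WRITE))

    # Step 1b: COMPUTE(i) -> COMPUTE(i+1) on the same actor.
    by_actor = defaultdict(list)
    for t, (actor, bind_index) in tasks.items():
        by_actor[actor].append((bind_index, t))
    for entries in by_actor.values():
        entries.sort()
        for (_, prev), (_, nxt) in zip(entries, entries[1:]):
            add_edge((prev, COMPUTE), (nxt, COMPUTE))

    # Step 1c: WRITE of the writer task -> READ of the reader task.
    for writer, reader in data_edges:
        add_edge((writer, WRITE), (reader, READ))

    # Step 2, rule (1): one priority queue per actor, smallest bind_index first.
    ready = defaultdict(list)

    def push(node):
        t, op = node
        actor, bind_index = tasks[t]
        heapq.heappush(ready[actor], (bind_index, op, t))

    for node, degree in indeg.items():
        if degree == 0:
            push(node)

    schedules = {actor: [] for actor in by_actor}
    remaining = len(indeg)
    while remaining:
        progressed = False
        for actor in schedules:
            if not ready[actor]:
                continue
            _, op, t = heapq.heappop(ready[actor])
            schedules[actor].append((t, OP_NAMES[op]))
            remaining -= 1
            progressed = True
            for nxt in succ[(t, op)]:
                indeg[nxt] -= 1
                if indeg[nxt] == 0:
                    push(nxt)
        if not progressed:
            raise ValueError("graph contains a cycle; no valid schedule")
    return schedules
```

For example, with `tasks = {"a0": ("A", 0), "a1": ("A", 1), "b0": ("B", 0)}` and `data_edges = [("a0", "b0"), ("b0", "a1")]`, actor `A` runs all three operations of `a0` before those of `a1`, and `a1`'s READ only becomes ready after `b0`'s WRITE has been scheduled on actor `B`.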

…oject#47085) The defaults were changed in 2.32, and the warnings have been in place since then. Now that they have been around for 4 releases, we can remove them. Signed-off-by: Cindy Zhang <[email protected]>

even for serve deployments Signed-off-by: Lonnie Liu <[email protected]>

… constantly with TF in CI tests. This is old stack. Signed-off-by: simonsays1980 <[email protected]>

…brid stack is no longer supported. Signed-off-by: simonsays1980 <[email protected]>

…nently. Signed-off-by: simonsays1980 <[email protected]>

Signed-off-by: simonsays1980 <[email protected]>

…and set 'lr' to 'None'. Furthermore, modified all learning rates to adapt to the number of learners. Signed-off-by: simonsays1980 <[email protected]>
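The idea behind these commits (per-component learning rates, a shared `lr` left as `None`, and rates adapted to the number of learners) can be sketched without RLlib. Everything below is hypothetical: the class `SacLearningRates`, its field names, the numeric values, and in particular the scaling rule are assumptions for illustration, since the commit messages do not state how the rates are adapted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SacLearningRates:
    """Hypothetical container for separate SAC learning rates (not RLlib's API)."""

    policy_lr: float
    critic_lr: float
    alpha_lr: float
    lr: Optional[float] = None  # Shared rate stays unset when separate rates are used.

    def __post_init__(self):
        if self.lr is not None:
            raise ValueError(
                "Set policy_lr, critic_lr, and alpha_lr instead of a shared lr."
            )

    def scaled(self, num_learners: int, exponent: float = 0.5) -> "SacLearningRates":
        """Scale all rates by num_learners ** exponent.

        The exponent is an assumption made for this sketch only.
        """
        factor = max(num_learners, 1) ** exponent
        return SacLearningRates(
            policy_lr=self.policy_lr * factor,
            critic_lr=self.critic_lr * factor,
            alpha_lr=self.alpha_lr * factor,
        )


# Illustrative values only.
base = SacLearningRates(policy_lr=3e-5, critic_lr=3e-4, alpha_lr=3e-4)
print(base.scaled(num_learners=4))
```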

… 'None' as needed for new stack SAC. Signed-off-by: simonsays1980 <[email protected]>

…ing rates in multi-agent SAC Pendulum tuned example to number of GPUs. Signed-off-by: simonsays1980 <[email protected]>

Signed-off-by: simonsays1980 <[email protected]>

…Cheetah example. Signed-off-by: simonsays1980 <[email protected]>

Commits on Aug 12, 2024

Commits on Aug 13, 2024

Commits on Aug 14, 2024

Commits on Aug 15, 2024

Commits on Aug 16, 2024

Commits on Aug 19, 2024

Commits on Aug 20, 2024
