
Commit

[RLlib] Cleanup examples folder vol. 25: Remove some old API stack examples. (ray-project#47970)

Signed-off-by: ujjawal-khare <[email protected]>
sven1977 authored and ujjawal-khare committed Oct 15, 2024
1 parent 7f7e893 commit 3421c18
Showing 7 changed files with 2 additions and 614 deletions.
4 changes: 0 additions & 4 deletions doc/source/rllib/rllib-examples.rst
@@ -254,12 +254,8 @@ RLModules
- |old_stack| `How to use the "Repeated" space of RLlib for variable-length observations <https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/complex_struct_space.py>`__:
How to use RLlib's `Repeated` space to handle variable-length observations.
- |old_stack| `How to write a custom Keras model <https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/custom_keras_model.py>`__:
Example of using a custom Keras model.
- |old_stack| `How to register a custom model with supervised loss <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_model_loss_and_metrics.py>`__:
Example of defining and registering a custom model with a supervised loss.
- |old_stack| `How to train with batch normalization <https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/models/batch_norm_model.py>`__:
Example of adding batch norm layers to a custom model.
- |old_stack| `How to write a custom model with its custom API <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_model_api.py>`__:
Shows how to define a custom Model API in RLlib, such that it can be used inside certain algorithms.
- |old_stack| `How to write a model utilizing the "trajectory view API" <https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/models/trajectory_view_utilizing_models.py>`__:
An example of how a model can use the trajectory view API to specify its own inputs.
56 changes: 2 additions & 54 deletions doc/source/rllib/rllib-models.rst
@@ -364,59 +364,7 @@ calculating head on top of your policy model. In order to expand a Model's API,
define and implement a new method (e.g. ``get_q_values()``) in your TF- or TorchModelV2 sub-class.

You can now wrap this new API either around RLlib's default models or around
your custom (``forward()``-overriding) model classes. Here are two examples that illustrate how to do this:

**The Q-head API: Adding a dueling layer on top of a default RLlib model**.

The following code adds a ``get_q_values()`` method to the automatically chosen
default Model (e.g. a ``FullyConnectedNetwork`` if the observation space is a 1D Box
or Discrete):

.. literalinclude:: ../../../rllib/examples/_old_api_stack/models/custom_model_api.py
:language: python
:start-after: __sphinx_doc_model_api_1_begin__
:end-before: __sphinx_doc_model_api_1_end__

Now, for an algorithm that requires this model API to work properly (e.g. DQN),
use the following code to construct the complete final Model using the
``ModelCatalog.get_model_v2`` factory function (`code here <https://github.com/ray-project/ray/blob/master/rllib/models/catalog.py>`__):

.. literalinclude:: ../../../rllib/examples/custom_model_api.py
:language: python
:start-after: __sphinx_doc_model_construct_1_begin__
:end-before: __sphinx_doc_model_construct_1_end__

With the model object constructed above, you can get the underlying intermediate output (before the dueling head)
by calling ``my_dueling_model`` directly (``out = my_dueling_model(input_dict)``), and then passing ``out`` into
your custom ``get_q_values`` method: ``q_values = my_dueling_model.get_q_values(out)``.
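
In case the two example files linked above aren't at hand, the snippet below is a rough, minimal PyTorch sketch of the same pattern: subclass an existing model, add the extra layers the new API needs, and expose a ``get_q_values()`` method. Class, layer, space, and size names are made up for illustration; this is not the exact content of ``custom_model_api.py``.

.. code-block:: python

    import gymnasium as gym
    import torch
    from torch import nn

    from ray.rllib.models import MODEL_DEFAULTS
    from ray.rllib.models.torch.fcnet import FullyConnectedNetwork


    class DuelingQHeadModel(FullyConnectedNetwork):
        """A default fully connected net extended by a `get_q_values()` API."""

        def __init__(self, obs_space, action_space, num_outputs, model_config, name):
            super().__init__(obs_space, action_space, num_outputs, model_config, name)
            # Extra layers used only by the new API: advantage and value streams.
            self.advantage_head = nn.Linear(num_outputs, action_space.n)
            self.value_head = nn.Linear(num_outputs, 1)

        def get_q_values(self, model_out):
            # Dueling combination: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
            advantages = self.advantage_head(model_out)
            value = self.value_head(model_out)
            return value + advantages - advantages.mean(dim=1, keepdim=True)


    # Hypothetical usage with made-up spaces and sizes.
    obs_space = gym.spaces.Box(-1.0, 1.0, (4,), dtype="float32")
    action_space = gym.spaces.Discrete(2)
    model = DuelingQHeadModel(
        obs_space,
        action_space,
        num_outputs=256,
        model_config=dict(MODEL_DEFAULTS, fcnet_hiddens=[256, 256]),
        name="dueling_q",
    )
    out, _ = model({"obs": torch.randn(32, 4)})
    q_values = model.get_q_values(out)  # -> shape [32, 2]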


**The single Q-value API for SAC**.

Our DQN model from above takes an observation and outputs one Q-value per (discrete) action.
Continuous SAC - on the other hand - uses Models that calculate one Q-value only
for a single (**continuous**) action, given an observation and that particular action.

Let's take a look at how we would construct this API and wrap it around a custom model:

.. literalinclude:: ../../../rllib/examples/_old_api_stack/models/custom_model_api.py
:language: python
:start-after: __sphinx_doc_model_api_2_begin__
:end-before: __sphinx_doc_model_api_2_end__

Now, for an algorithm that requires this model API to work properly (e.g. SAC),
use the following code to construct the complete final Model using the
``ModelCatalog.get_model_v2`` factory function (`code here <https://github.com/ray-project/ray/blob/master/rllib/models/catalog.py>`__):

.. literalinclude:: ../../../rllib/examples/custom_model_api.py
:language: python
:start-after: __sphinx_doc_model_construct_2_begin__
:end-before: __sphinx_doc_model_construct_2_end__

With the model object constructed above, you can get the underlying intermediate output (before the Q-head)
by calling ``my_cont_action_q_model`` directly (``out = my_cont_action_q_model(input_dict)``), and then passing ``out``
and some action into your custom ``get_single_q_value`` method:
``q_value = my_cont_action_q_model.get_single_q_value(out, action)``.
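
As above, a rough PyTorch sketch of this second API (again with made-up names, not the contents of the example file) could look as follows:

.. code-block:: python

    import gymnasium as gym
    import torch
    from torch import nn

    from ray.rllib.models import MODEL_DEFAULTS
    from ray.rllib.models.torch.fcnet import FullyConnectedNetwork


    class ContActionQModel(FullyConnectedNetwork):
        """A default fully connected net plus a `get_single_q_value()` API."""

        def __init__(self, obs_space, action_space, num_outputs, model_config, name):
            super().__init__(obs_space, action_space, num_outputs, model_config, name)
            action_dim = int(action_space.shape[0])
            # Q-head consuming the model's embedding concatenated with one action.
            self.q_head = nn.Linear(num_outputs + action_dim, 1)

        def get_single_q_value(self, model_out, actions):
            # One Q-value per (observation embedding, continuous action) pair.
            return self.q_head(torch.cat([model_out, actions], dim=-1))


    # Hypothetical usage with made-up spaces and sizes.
    obs_space = gym.spaces.Box(-1.0, 1.0, (8,), dtype="float32")
    action_space = gym.spaces.Box(-1.0, 1.0, (2,), dtype="float32")
    model = ContActionQModel(
        obs_space,
        action_space,
        num_outputs=256,
        model_config=dict(MODEL_DEFAULTS, fcnet_hiddens=[256, 256]),
        name="cont_q",
    )
    out, _ = model({"obs": torch.randn(32, 8)})
    q_value = model.get_single_q_value(out, torch.randn(32, 2))  # -> shape [32, 1]
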
your custom (``forward()``-overriding) model classes.


More examples for Building Custom Models
@@ -505,7 +453,7 @@ Supervised Model Losses

You can mix supervised losses into any RLlib algorithm through custom models. For example, you can add an imitation learning loss on expert experiences, or a self-supervised autoencoder loss within the model. These losses can be defined over either policy evaluation inputs, or data read from `offline storage <rllib-offline.html#input-pipeline-for-supervised-losses>`__.

**TensorFlow**: To add a supervised loss to a custom TF model, you need to override the ``custom_loss()`` method. This method takes in the existing policy loss for the algorithm, to which you can add your own supervised loss before returning. For debugging, you can also return a dictionary of scalar tensors in the ``metrics()`` method. Here is a `runnable example <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_model_loss_and_metrics.py>`__ of adding an imitation loss to CartPole training, defined over an `offline dataset <rllib-offline.html#input-pipeline-for-supervised-losses>`__.
**TensorFlow**: To add a supervised loss to a custom TF model, you need to override the ``custom_loss()`` method. This method takes in the existing policy loss for the algorithm, to which you can add your own supervised loss before returning. For debugging, you can also return a dictionary of scalar tensors in the ``metrics()`` method.
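
Since the runnable example file previously linked here was removed, the following is only a rough sketch of that override pattern. The auxiliary loss term is a trivial placeholder (an L2 weight penalty) and all names are made up; a real example would compute, for instance, an imitation loss over expert actions read from an offline input reader.

.. code-block:: python

    import tensorflow as tf

    from ray.rllib.models import ModelCatalog
    from ray.rllib.models.tf.fcnet import FullyConnectedNetwork


    class FCNetWithAuxLoss(FullyConnectedNetwork):
        """A standard fully connected net plus a supervised/auxiliary loss term."""

        def custom_loss(self, policy_loss, loss_inputs):
            # Placeholder auxiliary term: an L2 penalty on all model weights.
            self.aux_loss = 0.01 * tf.add_n(
                [tf.reduce_sum(tf.square(v)) for v in self.trainable_variables()]
            )
            # Add the term to the algorithm's policy loss(es) before returning.
            if isinstance(policy_loss, (list, tuple)):
                return [loss + self.aux_loss for loss in policy_loss]
            return policy_loss + self.aux_loss

        def metrics(self):
            # Extra scalars reported in the learner stats for debugging.
            return {"aux_loss": self.aux_loss}


    # Register the model; an algorithm config would then reference it via
    # model={"custom_model": "fcnet_with_aux_loss"}.
    ModelCatalog.register_custom_model("fcnet_with_aux_loss", FCNetWithAuxLoss)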

**PyTorch**: There is no explicit API for adding losses to custom torch models. However, you can modify the loss in the policy definition directly. Like for TF models, offline datasets can be incorporated by creating an input reader and calling ``reader.next()`` in the loss forward pass.

175 changes: 0 additions & 175 deletions rllib/BUILD
@@ -2836,107 +2836,6 @@ py_test(
args = ["--enable-new-api-stack", "--as-test"]
)


#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_local_cpu_torch",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=torch", "--config=local-cpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_local_cpu_tf2",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=tf2", "--config=local-cpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_local_gpu_torch",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples", "gpu"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=torch", "--config=local-gpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_local_gpu_tf2",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples", "gpu", "exclusive"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=tf2", "--config=local-gpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_remote_cpu_torch",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=torch", "--config=remote-cpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_remote_cpu_tf2",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=tf2", "--config=remote-cpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_remote_gpu_torch",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples", "gpu", "exclusive"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=torch", "--config=remote-gpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_remote_gpu_tf2",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples", "gpu", "exclusive"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=tf2", "--config=remote-gpu"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_multi_gpu_torch",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples", "multi_gpu", "exclusive"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=torch", "--config=multi-gpu-ddp"]
)

#@OldAPIStack @HybridAPIStack
py_test(
name = "examples/learners/ppo_tuner_multi_gpu_tf2",
main = "examples/learners/ppo_tuner.py",
tags = ["team:rllib", "examples", "multi_gpu", "exclusive"],
size = "medium",
srcs = ["examples/learners/ppo_tuner.py"],
args = ["--framework=tf2", "--config=multi-gpu-ddp"]
)

# subdirectory: multi_agent/
# ....................................
py_test(
@@ -3256,56 +3155,6 @@ py_test(
args = ["--as-test", "--framework=torch", "--stop-reward=-0.012", "--num-cpus=4"]
)

#@OldAPIStack
py_test(
name = "examples/cartpole_lstm_impala_tf2",
main = "examples/cartpole_lstm.py",
tags = ["team:rllib", "exclusive", "examples"],
size = "medium",
srcs = ["examples/cartpole_lstm.py"],
args = ["--run=IMPALA", "--as-test", "--framework=tf2", "--stop-reward=28", "--num-cpus=4"]
)

#@OldAPIStack
py_test(
name = "examples/cartpole_lstm_impala_torch",
main = "examples/cartpole_lstm.py",
tags = ["team:rllib", "exclusive", "examples"],
size = "medium",
srcs = ["examples/cartpole_lstm.py"],
args = ["--run=IMPALA", "--as-test", "--framework=torch", "--stop-reward=28", "--num-cpus=4"]
)

#@OldAPIStack
py_test(
name = "examples/cartpole_lstm_ppo_tf2",
main = "examples/cartpole_lstm.py",
tags = ["team:rllib", "exclusive", "examples"],
size = "large",
srcs = ["examples/cartpole_lstm.py"],
args = ["--run=PPO", "--as-test", "--framework=tf2", "--stop-reward=28", "--num-cpus=4"]
)

#@OldAPIStack
py_test(
name = "examples/cartpole_lstm_ppo_torch",
main = "examples/cartpole_lstm.py",
tags = ["team:rllib", "exclusive", "examples"],
size = "medium",
srcs = ["examples/cartpole_lstm.py"],
args = ["--run=PPO", "--as-test", "--framework=torch", "--stop-reward=28", "--num-cpus=4"]
)

#@OldAPIStack
py_test(
name = "examples/cartpole_lstm_ppo_torch_with_prev_a_and_r",
main = "examples/cartpole_lstm.py",
tags = ["team:rllib", "exclusive", "examples"],
size = "medium",
srcs = ["examples/cartpole_lstm.py"],
args = ["--run=PPO", "--as-test", "--framework=torch", "--stop-reward=28", "--num-cpus=4", "--use-prev-action", "--use-prev-reward"]
)

#@OldAPIStack
py_test(
name = "examples/centralized_critic_tf",
@@ -3356,30 +3205,6 @@ py_test(
args = ["--stop-iters=2"]
)

#@OldAPIStack
py_test(
name = "examples/custom_model_loss_and_metrics_ppo_tf",
main = "examples/custom_model_loss_and_metrics.py",
tags = ["team:rllib", "exclusive", "examples"],
size = "small",
# Include the json data file.
data = ["tests/data/cartpole/small.json"],
srcs = ["examples/custom_model_loss_and_metrics.py"],
args = ["--run=PPO", "--stop-iters=1", "--framework=tf","--input-files=tests/data/cartpole"]
)

#@OldAPIStack
py_test(
name = "examples/custom_model_loss_and_metrics_ppo_torch",
main = "examples/custom_model_loss_and_metrics.py",
tags = ["team:rllib", "exclusive", "examples"],
size = "small",
# Include the json data file.
data = ["tests/data/cartpole/small.json"],
srcs = ["examples/custom_model_loss_and_metrics.py"],
args = ["--run=PPO", "--framework=torch", "--stop-iters=1", "--input-files=tests/data/cartpole"]
)

py_test(
name = "examples/custom_recurrent_rnn_tokenizer_repeat_after_me_tf2",
main = "examples/custom_recurrent_rnn_tokenizer.py",
94 changes: 0 additions & 94 deletions rllib/examples/cartpole_lstm.py

This file was deleted.
