Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Add examples and docs for Catalog #33898

Merged

Conversation

ArturNiederfahrenhorst
Copy link
Contributor

Why are these changes needed?

This PR adds examples and docs for RLlib's Catalog class which is used to create Models and distributions for RL Modules.

Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
..
.. admonition:: This doc is related to the RLModules API.

Disregard this doc if you do not explicitly want to use RLModules. The RLModules API is experimental and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably just say something to the effect of RLModules is an experimental API. I think this gets across the point overall. Its implied what experimental means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide a complete sentence to replace this with?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RLModules api is currently in the experimental phase of its rollout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm while we're at it, is this a whole separate doc page right here? If so, why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an include. It's basically a banner that alerts users that they are looking at something experimental.
Angelina told me to make this basically the first thing people read - if it's not meant for John Doe, put it first.

CC @kouroshHakha to make you aware of this banner

Catalogs
========

Catalogs are where RL Modules primarily get their models and action distributions from.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Catalog is a tool that helps create default models and action distributions for RLModules to work with environments with any action and observation spaces. It provides an interface for constructing pre-made components like an MLP for policies that have continuous action spaces, and an action distribution (like a multivariate Gaussian) that matches that action space.

Copy link
Contributor

@kouroshHakha kouroshHakha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left several comments / suggestions. I had to overlook a lot of places that I would write / restructure things differently for the sake of time, but we can revisit them later, when reviewing our docs as a whole.

One thing that applies to this PR in general is that there are so many places that you are referencing a class, but there is not cross reference to the class API. So I would suggest creating API reference for catalogs and cross referencing them where ever it is needed. You can take a look at RLModule doc pr to see how many places I have referenced the api ref doc pages. This will immensely help with ease of navigation in your doc pages.

@@ -299,6 +299,7 @@ parts:
- file: rllib/rllib-sample-collection
- file: rllib/rllib-replay-buffers
- file: rllib/rllib-offline
- file: rllib/rllib-catalogs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put this under user-guide for now. It's not a key concept yet since it's experimental. We have to later re-structure the key-concepts page to have better navigation around topics.

doc/source/rllib/key-concepts.rst Outdated Show resolved Hide resolved
doc/source/rllib/key-concepts.rst Outdated Show resolved Hide resolved
doc/source/rllib/key-concepts.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-catalogs.rst Show resolved Hide resolved
.. image:: images/catalog/ppo_catalog_and_rlm_diagram.svg
:align: center

Catalogs and AlgorithmConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this usage pattern should be explained earlier. It's the most common pattern people use after basic API.

Catalogs
========

Catalogs are where RL Modules primarily get their models and action distributions from.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to mention first who the audience is. e.g. Catalog is for advanced users who want to extend existing RLModules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhhh. I do this 5 lines below.
I don't see it being the first sentence in the RLModules docs either: https://docs.ray.io/en/master/rllib/rllib-rlmodule.html
.. or in other docs we have.
Why would it be different here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put it second or third? Like, the first sentence should be a very general, high-level description of what user is looking at I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catalog is a secondary concept compared to RLModules and a lot of users (high-level API users) do not need to know about it unless they want to do some sort of extension to RLModules, that's why I think this is different.

You can add it in a note format, similar to how the initial note is structured in RLModules.
image

Something like

Note: This abstraction is intended for advanced users who are interested in extending the functionality of build-in ~ray.rllib.core.rl_module.rl_module.RLModule objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I agree.
Do you think we should add a similar matrix here?
Because afaics, 4 of the 5 columns here of at are actually Catalog features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think it should be enough to have this matrix with RLModules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally. I didn't mean to use this text, I meant to use the theme, i.e. the ..note:: syntax.

:start-after: __sphinx_doc_begin__
:end-before: __sphinx_doc_end__

.. image:: images/catalog/catalog_rlmspecs_diagram.svg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any utility / extra clarification from this diagram.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for illustration, not for extra clarification.
When skimming docs, the majority of information I ingest is through pictures.
Basically, users can get an idea of what they are looking at just by looking at the pictures to filter information without putting in lots of effort through reading.
@angelinalg lmk if this is stupid, I'm open to deleting this if I have the wrong idea here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to add some extra caption to get the intention behind the the image across.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, I've updated the illustration a little bit and changed the text around it to get the point across that in case of heterogeneous agents, you'll want to modify the multi-agent rl module spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally it's good to have diagrams for visual learners.

You can make Catalog build custom models by overriding the Catalog’s methods used by RL Modules to build models.
Have a look at these lines from the constructor of the PPORLModules to see how Catalogs are being used by RLModules:

.. literalinclude:: ../../../rllib/algorithms/ppo/ppo_base_rl_module.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should ideally not reference ppo_base_rl_module.py for this part. Just create a ..code-block that shows the part that is important. This will be very restrictive for code development. I don't want to worry about the docs when changing the code structure in PPORLModule. You can also notice that the indentation is awkward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaics, this collides with what Max said here: #33909 (comment)
@maxpumperla do you have a rule here? Or am I misunderstanding on of you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, I'm thinking about the risk that code outdates quickly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context is a bit different. We want self-contained example code snippets to be tested all the time. But this is more like a high-lighted view of somewhere in the code-base. I think we should just snap that view out and put it in a self-contained context for the purpose of the doc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the problem persists. We state here that the following is a snippet from the constructor of the PPORLModule.
If it is not, but rather a copy of it, the code will be outdated as soon as the constructor is changed.

- How to write a custom action distribution
- How to inject a custom action distribution into a Catalog

.. literalinclude:: ../../../rllib/examples/catalog/custom_action_distribution.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a basic usage, we can keep it in examples/catalog folder.

ArturNiederfahrenhorst and others added 10 commits April 9, 2023 23:23
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: angelinalg <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
@angelinalg
Copy link
Contributor

Sorry about closing that by mistake, @ArturNiederfahrenhorst

@angelinalg angelinalg closed this Apr 10, 2023
@angelinalg angelinalg reopened this Apr 10, 2023
Co-authored-by: angelinalg <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
@ArturNiederfahrenhorst
Copy link
Contributor Author

ArturNiederfahrenhorst commented Apr 10, 2023

There is a mix of using RL Module and RLModule in this file: doc/source/rllib/rllib-rlmodule.rst. I suggest being consistent. The majority of occurrences seem to be RLModule. The title should probably be consistent.

@kouroshHakha I'm making this change requested by Angelina within the scope of this PR.
Lmk if you object.

Co-authored-by: angelinalg <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: angelinalg <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: angelinalg <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
@angelinalg
Copy link
Contributor

@ArturNiederfahrenhorst, I did a quick pass and left some minor comments. I really like your diagrams!

@ArturNiederfahrenhorst
Copy link
Contributor Author

@ArturNiederfahrenhorst, I did a quick pass and left some minor comments. I really like your diagrams!

Awesome, thanks you so much!

Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Copy link
Contributor

@angelinalg angelinalg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sven1977 sven1977 merged commit 3c22ad6 into ray-project:master Apr 11, 2023
ArturNiederfahrenhorst added a commit to ArturNiederfahrenhorst/ray that referenced this pull request Apr 17, 2023
elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants