-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] Add examples and docs for Catalog #33898
[RLlib] Add examples and docs for Catalog #33898
Conversation
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
.. | ||
.. admonition:: This doc is related to the RLModules API. | ||
|
||
Disregard this doc if you do not explicitly want to use RLModules. The RLModules API is experimental and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could probably just say something to the effect of RLModules is an experimental API. I think this gets across the point overall. Its implied what experimental means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide a complete sentence to replace this with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RLModules api is currently in the experimental phase of its rollout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm while we're at it, is this a whole separate doc page right here? If so, why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an include. It's basically a banner that alerts users that they are looking at something experimental.
Angelina told me to make this basically the first thing people read - if it's not meant for John Doe, put it first.
CC @kouroshHakha to make you aware of this banner
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
doc/source/rllib/rllib-catalogs.rst
Outdated
Catalogs | ||
======== | ||
|
||
Catalogs are where RL Modules primarily get their models and action distributions from. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Catalog
is a tool that helps create default models and action distributions for RLModules to work with environments with any action and observation spaces. It provides an interface for constructing pre-made components like an MLP for policies that have continuous action spaces, and an action distribution (like a multivariate Gaussian) that matches that action space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left several comments / suggestions. I had to overlook a lot of places that I would write / restructure things differently for the sake of time, but we can revisit them later, when reviewing our docs as a whole.
One thing that applies to this PR in general is that there are so many places that you are referencing a class, but there is not cross reference to the class API. So I would suggest creating API reference for catalogs and cross referencing them where ever it is needed. You can take a look at RLModule doc pr to see how many places I have referenced the api ref doc pages. This will immensely help with ease of navigation in your doc pages.
@@ -299,6 +299,7 @@ parts: | |||
- file: rllib/rllib-sample-collection | |||
- file: rllib/rllib-replay-buffers | |||
- file: rllib/rllib-offline | |||
- file: rllib/rllib-catalogs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put this under user-guide for now. It's not a key concept yet since it's experimental. We have to later re-structure the key-concepts page to have better navigation around topics.
doc/source/rllib/rllib-catalogs.rst
Outdated
.. image:: images/catalog/ppo_catalog_and_rlm_diagram.svg | ||
:align: center | ||
|
||
Catalogs and AlgorithmConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this usage pattern should be explained earlier. It's the most common pattern people use after basic API.
doc/source/rllib/rllib-catalogs.rst
Outdated
Catalogs | ||
======== | ||
|
||
Catalogs are where RL Modules primarily get their models and action distributions from. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to mention first who the audience is. e.g. Catalog is for advanced users who want to extend existing RLModules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhhh. I do this 5 lines below.
I don't see it being the first sentence in the RLModules docs either: https://docs.ray.io/en/master/rllib/rllib-rlmodule.html
.. or in other docs we have.
Why would it be different here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe put it second or third? Like, the first sentence should be a very general, high-level description of what user is looking at I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catalog is a secondary concept compared to RLModules and a lot of users (high-level API users) do not need to know about it unless they want to do some sort of extension to RLModules, that's why I think this is different.
You can add it in a note format, similar to how the initial note is structured in RLModules.
Something like
Note: This abstraction is intended for advanced users who are interested in extending the functionality of build-in ~ray.rllib.core.rl_module.rl_module.RLModule
objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I agree.
Do you think we should add a similar matrix here?
Because afaics, 4 of the 5 columns here of at are actually Catalog features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think it should be enough to have this matrix with RLModules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally. I didn't mean to use this text, I meant to use the theme, i.e. the ..note::
syntax.
doc/source/rllib/rllib-catalogs.rst
Outdated
:start-after: __sphinx_doc_begin__ | ||
:end-before: __sphinx_doc_end__ | ||
|
||
.. image:: images/catalog/catalog_rlmspecs_diagram.svg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any utility / extra clarification from this diagram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for illustration, not for extra clarification.
When skimming docs, the majority of information I ingest is through pictures.
Basically, users can get an idea of what they are looking at just by looking at the pictures to filter information without putting in lots of effort through reading.
@angelinalg lmk if this is stupid, I'm open to deleting this if I have the wrong idea here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to add some extra caption to get the intention behind the the image across.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I've updated the illustration a little bit and changed the text around it to get the point across that in case of heterogeneous agents, you'll want to modify the multi-agent rl module spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally it's good to have diagrams for visual learners.
doc/source/rllib/rllib-catalogs.rst
Outdated
You can make Catalog build custom models by overriding the Catalog’s methods used by RL Modules to build models. | ||
Have a look at these lines from the constructor of the PPORLModules to see how Catalogs are being used by RLModules: | ||
|
||
.. literalinclude:: ../../../rllib/algorithms/ppo/ppo_base_rl_module.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should ideally not reference ppo_base_rl_module.py
for this part. Just create a ..code-block that shows the part that is important. This will be very restrictive for code development. I don't want to worry about the docs when changing the code structure in PPORLModule
. You can also notice that the indentation is awkward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afaics, this collides with what Max said here: #33909 (comment)
@maxpumperla do you have a rule here? Or am I misunderstanding on of you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, I'm thinking about the risk that code outdates quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context is a bit different. We want self-contained example code snippets to be tested all the time. But this is more like a high-lighted view of somewhere in the code-base. I think we should just snap that view out and put it in a self-contained context for the purpose of the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the problem persists. We state here that the following is a snippet from the constructor of the PPORLModule.
If it is not, but rather a copy of it, the code will be outdated as soon as the constructor is changed.
doc/source/rllib/rllib-catalogs.rst
Outdated
- How to write a custom action distribution | ||
- How to inject a custom action distribution into a Catalog | ||
|
||
.. literalinclude:: ../../../rllib/examples/catalog/custom_action_distribution.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a basic usage, we can keep it in examples/catalog folder.
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: kourosh hakhamaneshi <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: angelinalg <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Sorry about closing that by mistake, @ArturNiederfahrenhorst |
Co-authored-by: angelinalg <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
@kouroshHakha I'm making this change requested by Angelina within the scope of this PR. |
Co-authored-by: angelinalg <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: angelinalg <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Co-authored-by: angelinalg <[email protected]> Signed-off-by: Artur Niederfahrenhorst <[email protected]>
@ArturNiederfahrenhorst, I did a quick pass and left some minor comments. I really like your diagrams! |
Awesome, thanks you so much! |
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: elliottower <[email protected]>
Signed-off-by: Jack He <[email protected]>
Why are these changes needed?
This PR adds examples and docs for RLlib's Catalog class which is used to create Models and distributions for RL Modules.