[docs][rllib] Documentation for connectors. #27528

gjoliver · 2022-08-04T23:15:37Z

Why are these changes needed?

Add documentation for RLlib connectors.

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

kouroshHakha · 2022-08-05T16:05:54Z

doc/source/rllib/rllib-connector.rst

+This setup is useful for certain multi-agent use cases where individual observations may need to be
+modified based on data from other agents.
+This can also be useful if users need to construct meta-observation, e.g., build a graph as input
+to the policy from individual agent observations.


Probably here you should add a paragraph explaining the ActionConnectorDataType data type.

yeah, good idea, let me add a section about the common data types.
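As a concrete illustration of the multi-agent case described in the quoted passage, here is a minimal sketch in plain Python. It does not use RLlib; the function name `add_team_mean` and the observation layout are assumptions made only for illustration. It shows each agent's observation being extended with a value computed from all agents' observations.

```python
from typing import Dict, List


def add_team_mean(observations: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Append the mean of every agent's first feature to each agent's observation.

    Hypothetical helper: it only illustrates modifying one agent's observation
    based on data from the other agents.
    """
    mean_first = sum(obs[0] for obs in observations.values()) / len(observations)
    return {agent_id: obs + [mean_first] for agent_id, obs in observations.items()}


obs = {"agent_0": [1.0, 0.0], "agent_1": [3.0, 0.5]}
augmented = add_team_mean(obs)
```

A real connector would also have to carry the per-agent metadata that the documented data types describe; this sketch leaves that out.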

kouroshHakha

Made some minor changes here and there. High-level feedback: this is a good starting doc for internal developers, but not very useful for external users. After reading it, it is still not clear how I would use connectors or modify the connector pipeline with a custom one. Also, the input/output data structures are not documented, so users would have to dig into the codebase to use them.

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

gjoliver · 2022-08-06T23:59:51Z

Added TODOs, and a section about common data types used by these connectors.
PTAL.

gjoliver · 2022-08-15T14:43:59Z

@maxpumperla and @richardliaw, there is more content I want to add, but can you guys help edit the initial landing page of the connectors feature?
Thanks.

richardliaw · 2022-08-16T16:12:55Z

ah ok

Signed-off-by: Richard Liaw <[email protected]>

doc/source/rllib/rllib-connector.rst

maxpumperla

I really like this as a first doc on connectors. The only thing I'd really like to see is 1-2 very concrete usage examples. Signatures and types are important, too, but it's not entirely clear how I'd use this in practice.

E.g. where exactly do I set "enable_connectors" to "True" etc., how does my algorithm spec change if I use them. The example(s) can be almost trivial, as long as it's clear how to leverage the API. Before and after comparisons (e.g. diffs) would be a plus.

gjoliver · 2022-08-19T08:47:57Z

> I really like this as a first doc on connectors. The only thing I'd really like to see is 1-2 very concrete usage examples. Signatures and types are important, too, but it's not entirely clear how I'd use this in practice.
>
> E.g. where exactly do I set "enable_connectors" to "True" etc., how does my algorithm spec change if I use them. The example(s) can be almost trivial, as long as it's clear how to leverage the API. Before and after comparisons (e.g. diffs) would be a plus.

definitely. I plan to add a somewhat e2e example (probably in notebook format) to demonstrate the usage and important benefits (see the TODO section).
just trying to get the first version in, so I can share this with some of the early testers.
thanks for the thoughtful edits.

ArturNiederfahrenhorst · 2022-08-19T10:05:59Z

doc/source/rllib/rllib-connector.rst

+==================
+
+Connector are components that handle transformations on inputs and outputs of a given RL policy, with the goal of improving
+the durabilty and maintainability of RLlib's policy checkpoints.


ArturNiederfahrenhorst · 2022-08-19T10:06:31Z

doc/source/rllib/rllib-connector.rst

+
+By consolidating these transformations under the framework of connectors, users of RLlib will be able to:
+
+- Restore and deploy individual RLlib policies without having to restore training related logics of RLlib Algorithms.


training-related

ArturNiederfahrenhorst · 2022-08-19T10:06:57Z

doc/source/rllib/rllib-connector.rst

+- Restore and deploy individual RLlib policies without having to restore training related logics of RLlib Algorithms.
+- Ensure policies are more durable than the algorithms they get trained with.
+- Allow policies to be adapted to work with different versions of an environment.
+- Run inference with RLlib polcies without worrying about the exact trajectory view requriements or state inputs.


requirements

ArturNiederfahrenhorst · 2022-08-19T10:07:55Z

doc/source/rllib/rllib-connector.rst

+- Allow policies to be adapted to work with different versions of an environment.
+- Run inference with RLlib polcies without worrying about the exact trajectory view requriements or state inputs.
+
+Connectors can be enabled by setting ``enable_connectors`` parameter to ``True``.


... by setting the enable_connectors parameter to True.
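A before/after comparison of the kind requested earlier in the review could look like the sketch below. It assumes a plain dict-style algorithm config. Only the `enable_connectors` key comes from the doc under review; the other keys and the variable names are illustrative, and where the flag is actually set may differ between RLlib versions.

```python
# Sketch only: a dict-style algorithm config without connectors.
# The keys other than "enable_connectors" are illustrative assumptions.
base_config = {"env": "CartPole-v0", "num_workers": 0}

# The same config with connectors switched on.
config_with_connectors = {**base_config, "enable_connectors": True}

# The "diff" between the two specs is a single key.
changed_keys = sorted(set(config_with_connectors) - set(base_config))
```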

ArturNiederfahrenhorst · 2022-08-19T10:08:25Z

doc/source/rllib/rllib-connector.rst

+~~~~~~~~~~~~~~
+
+``AgentConnectors`` handle the job of transforming environment observation data into a format that is understood by
+the policy (e.g., flattening complex nested observations into a flat tensor). The high level APIs are:
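The flattening example mentioned in this hunk can be sketched in plain Python as follows. This is not the RLlib implementation; `flatten_observation` is a hypothetical helper, and a real connector would typically produce a tensor rather than a list.

```python
from typing import Any, List


def flatten_observation(obs: Any) -> List[float]:
    """Recursively flatten nested dicts/lists/tuples of numbers into one flat list."""
    if isinstance(obs, dict):
        # Sort keys so the flattened layout is deterministic.
        return [x for key in sorted(obs) for x in flatten_observation(obs[key])]
    if isinstance(obs, (list, tuple)):
        return [x for item in obs for x in flatten_observation(item)]
    return [float(obs)]


nested = {"position": [1, 2], "sensors": {"lidar": (0.5,), "battery": 3}}
flat = flatten_observation(nested)
```

Sorting the keys keeps the feature order stable between calls, which matters because the policy sees only positions in the flat vector.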


ArturNiederfahrenhorst · 2022-08-19T10:08:32Z

doc/source/rllib/rllib-connector.rst

+-------------------
+
+Lambda Connectors helps turn simple transformation functions into agent or action
+connectors without having users worry about the high level list or non-list APIs.


ArturNiederfahrenhorst · 2022-08-19T10:11:54Z

doc/source/rllib/rllib-connector.rst

+Advanced Connectors
+-------------------
+
+Lambda Connectors helps turn simple transformation functions into agent or action


Not sure about the plural s here. Shouldn't it be "Lambda Connectors help ..." or "The Lambda Connector helps ..."?

ArturNiederfahrenhorst · 2022-08-19T10:12:20Z

doc/source/rllib/rllib-connector.rst

+
+Lambda Connectors helps turn simple transformation functions into agent or action
+connectors without having users worry about the high level list or non-list APIs.
+Lambda Connectors has separate agent and action versions, for example:


Same as above, not sure about singular/plural s here.
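For readers following the thread, the idea of turning a simple function into an action connector can be sketched as below. The factory `make_lambda_action_connector` and the class it builds are hypothetical stand-ins written for this illustration, not RLlib's API.

```python
from typing import Any, Callable, Tuple

ActionFn = Callable[[Any, Any, Any], Tuple[Any, Any, Any]]


def make_lambda_action_connector(name: str, fn: ActionFn) -> type:
    """Build a connector class (hypothetical API) that applies `fn` to policy output."""

    class LambdaActionConnector:
        def __call__(self, actions: Any, states: Any, fetches: Any):
            return fn(actions, states, fetches)

    LambdaActionConnector.__name__ = name
    return LambdaActionConnector


# Note the parentheses around the returned tuple: without them, Python parses
# `states, fetches` as extra positional arguments to the surrounding call.
ScaleConnector = make_lambda_action_connector(
    "ScaleConnector",
    lambda actions, states, fetches: (2 * actions, states, fetches),
)
scaled = ScaleConnector()(3, [], {})
```

An agent-side version would follow the same pattern with a function that takes and returns observation data instead.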

ArturNiederfahrenhorst · 2022-08-19T10:12:45Z

doc/source/rllib/rllib-connector.rst

+            lambda actions, states, fetches: 2 * actions, states, fetches
+        )
+
+Mutiple connectors can be composed into a ``ConnectorPipeline``, which handles
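Composition can be illustrated with a minimal sketch, assuming that a pipeline simply applies its connectors in order. `SimplePipeline` is a hypothetical stand-in, not the `ConnectorPipeline` class itself, and the two connector functions are made up for the example.

```python
from typing import Any, Callable, List


class SimplePipeline:
    """Hypothetical stand-in for a connector pipeline: applies connectors in order."""

    def __init__(self, connectors: List[Callable[[Any], Any]]):
        self.connectors = list(connectors)

    def __call__(self, data: Any) -> Any:
        for connector in self.connectors:
            data = connector(data)
        return data


def clip(x: float) -> float:
    """Clamp a value to the range [-1, 1]."""
    return max(-1.0, min(1.0, x))


def scale(x: float) -> float:
    """Double a value."""
    return 2 * x


clip_then_scale = SimplePipeline([clip, scale])(3.0)
scale_then_clip = SimplePipeline([scale, clip])(3.0)
```

The two results differ, which is why the order of connectors in a pipeline is part of what has to be saved and restored.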


ArturNiederfahrenhorst · 2022-08-19T10:13:17Z

doc/source/rllib/rllib-connector.rst

+
+If connectors are enabled, RLlib will try to save policy checkpoints in properly serialized formats instead of
+relying on python pickling. Eventually, the goal is to save policy checkpoints in serialized JSON files
+to ensure maximum compatiiblity between RLlib and python versions.


compatibility

ArturNiederfahrenhorst · 2022-08-19T10:13:36Z

doc/source/rllib/rllib-connector.rst

+When enabled, the configurations of agent and action connectors will get serialized and saved with checkpointed
+policy states.
+These connectors, together with the specific transformations they represent,
+can be easily recovered (by RLlib provided utils) to simplify deployment and inference use cases.


RLlib-provided
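The save-and-recover idea in this passage can be sketched with the standard `json` module. Everything here is an assumption for illustration: the `ClipConnector` class, the `to_config`/`from_config` names, and the `REGISTRY` lookup are not taken from the PR.

```python
import json


class ClipConnector:
    """Hypothetical connector whose configuration is JSON-serializable."""

    def __init__(self, low: float, high: float):
        self.low, self.high = low, high

    def __call__(self, value: float) -> float:
        return max(self.low, min(self.high, value))

    def to_config(self) -> dict:
        return {"type": "ClipConnector", "params": {"low": self.low, "high": self.high}}


# Maps a serialized type name back to the class that can rebuild it.
REGISTRY = {"ClipConnector": ClipConnector}


def from_config(config: dict):
    """Rebuild a connector from its serialized configuration."""
    return REGISTRY[config["type"]](**config["params"])


saved = json.dumps(ClipConnector(-1.0, 1.0).to_config())
restored = from_config(json.loads(saved))
```

Because only a type name and plain parameters are stored, the saved text does not depend on pickling a Python object, which is the compatibility benefit the passage describes.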

ArturNiederfahrenhorst · 2022-08-19T10:14:12Z

doc/source/rllib/rllib-connector.rst

+Adapting a Policy for Different Environments
+--------------------------------------------
+
+It not uncommon for user environments to go through active development iterations.


ArturNiederfahrenhorst · 2022-08-19T10:14:44Z

doc/source/rllib/rllib-connector.rst

+It not uncommon for user environments to go through active development iterations.
+Policies trained with an older version of an environment may be rendered useless for updated environments.
+While env wrapper helps with this problem in many cases, connectors allow policies trained with
+different environments to work together at a same time.


... together at the same time. would be correct I think.

ArturNiederfahrenhorst · 2022-08-19T10:27:09Z

doc/source/rllib/rllib-connector.rst

+    :align: center
+
+We have two classes of connectors. The first is an ``AgentConnector``, which is used to transform observed data from environments to the policy.
+The second is an ``ActionConnector``, which is used to transform the action data from the policy to actions.


I'd like "... transform the outputs of the policy into actions." better.
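The division of labor between the two connector classes can be sketched as a toy data flow. All four functions below are hypothetical stand-ins written in plain Python: the environment observation passes through an agent-side transform, the policy produces an output, and an action-side transform turns that output into an action.

```python
from typing import Dict, List


def agent_connector(raw_obs: Dict[str, float]) -> List[float]:
    """Environment observation -> policy input (toy feature extraction)."""
    return [float(raw_obs["x"]), float(raw_obs["v"])]


def policy(features: List[float]) -> float:
    """Toy policy: returns a raw score rather than an action."""
    return sum(features)


def action_connector(policy_output: float) -> int:
    """Policy output -> environment action (threshold the score)."""
    return 1 if policy_output > 0 else 0


def step(raw_obs: Dict[str, float]) -> int:
    """Run one observation through the full chain."""
    return action_connector(policy(agent_connector(raw_obs)))


action_right = step({"x": 0.5, "v": -0.2})
action_left = step({"x": -1.0, "v": 0.2})
```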

ArturNiederfahrenhorst · 2022-08-19T10:41:21Z

doc/source/rllib/rllib-connector.rst

+While env wrapper helps with this problem in many cases, connectors allow policies trained with
+different environments to work together at a same time.
+
+Here is an example demonstrating adaptation of a policy trained for the standard Cartopole environment


Cartpole or CartPole-v0 or something! One o too many here :)
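The adaptation idea can be sketched as follows. The quoted hunk does not show what the updated environment looks like, so the extra observation feature and the dict-style action below are purely hypothetical, as are all function names. The point is only that the policy itself stays untouched while small transforms on both sides bridge the version gap.

```python
from typing import Dict, List


def old_policy(obs: List[float]) -> int:
    """Toy policy trained on a 4-feature observation; returns action 0 or 1."""
    if len(obs) != 4:
        raise ValueError("policy expects exactly 4 features")
    return 1 if obs[2] > 0 else 0


def adapt_observation(new_obs: List[float]) -> List[float]:
    """Drop the features the (hypothetical) newer environment added."""
    return new_obs[:4]


def adapt_action(action: int) -> Dict[str, int]:
    """Wrap the old action in the (hypothetical) newer action format."""
    return {"push": action, "brake": 0}


def act_in_new_env(new_obs: List[float]) -> Dict[str, int]:
    """Use the unchanged policy against the newer environment."""
    return adapt_action(old_policy(adapt_observation(new_obs)))


new_env_action = act_in_new_env([0.0, 0.1, 0.05, -0.2, 9.9])
```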

ArturNiederfahrenhorst

Great stuff. I think this already provides users with a good sense of what they are facing. A notebook to enable them to play with connectors and pipelines would be awesome. Observing input/output per transform would be great here.

Signed-off-by: Jun Gong <[email protected]>

Documentation for connectors.

be5f6e4

gjoliver assigned kouroshHakha Aug 4, 2022

gjoliver requested review from sven1977, avnishn, ArturNiederfahrenhorst, smorad, maxpumperla, kouroshHakha, krfricke, pcmoritz, ericl, stephanie-wang, dmatrix, sriram-anyscale and richardliaw as code owners August 4, 2022 23:15

kouroshHakha reviewed Aug 5, 2022

kouroshHakha and others added 3 commits August 5, 2022 09:30

modified some minor stuff and some wording

0a01459

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

lint

1ca88b3

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>

Add section about common data types.

d6e10dc

gjoliver requested a review from a team as a code owner August 6, 2022 23:59

lint

4b5d3db

stephanie-wang changed the title ~~Documentation for connectors.~~ [docs][rllib] Documentation for connectors. Aug 8, 2022

stephanie-wang added the copyediting-required label Aug 8, 2022

gjoliver assigned maxpumperla and richardliaw Aug 15, 2022

Merge branch 'master' into connector-doc

ca2a613

Signed-off-by: Richard Liaw <[email protected]>

maxpumperla reviewed Aug 19, 2022

maxpumperla reviewed Aug 19, 2022

ArturNiederfahrenhorst reviewed Aug 19, 2022

ArturNiederfahrenhorst reviewed Aug 19, 2022

ArturNiederfahrenhorst reviewed Aug 19, 2022

address artur's comments.

89c7cca

Signed-off-by: Jun Gong <[email protected]>

kouroshHakha approved these changes Aug 19, 2022

richardliaw merged commit 62b91cb into ray-project:master Aug 19, 2022

