[RLlib; Docs overhaul] Docstring cleanup: Trainer, trainer_template, Callbacks. #19758
Conversation
rllib/agents/callbacks.py
Outdated
env_index: Obsoleted: The ID of the environment, which the
    episode belongs to.
If this is obsolete, we can remove it?
done
overall a very nice cleanup.
some suggestions.
thanks.
rllib/agents/trainer.py
Outdated
All RLlib trainers extend this base class, e.g., the A3CTrainer implements
the A3C algorithm for single and multi-agent training.
Trainers contain a WorkerSet (self.workers), normally used to generate
this is really concise, but I am not sure users would understand remote and local workers if they start here.
should we say something like:
Trainers contain a WorkerSet (self.workers). A WorkerSet is normally composed of a single local worker (self.workers.local_worker(), used to compute and apply learning updates) and multiple remote workers (self.workers.remote_workers(), used to generate environment samples in parallel).
+1
done
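For illustration, a minimal sketch of the WorkerSet layout described in the suggested wording (assuming an already built Trainer instance named `trainer`; API names as in RLlib at the time of this PR):

```python
# Minimal sketch, assuming `trainer` is an already built Trainer instance.
workers = trainer.workers                  # the Trainer's WorkerSet
local_worker = workers.local_worker()      # computes and applies learning updates
remote_workers = workers.remote_workers()  # list of Ray actor handles that
                                           # generate env samples in parallel
                                           # (may be empty if num_workers=0)
```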
rllib/agents/trainer.py
Outdated
Trainer objects retain internal model state between calls to train(), so
you should create a new trainer instance for each training session.
Each worker (remote and local) contains a full PolicyMap (1 Policy
for single-agent training, 1 or more policies for multi-agent training).
should we add one quick sentence about how the policies are synced during training? Like:
Each worker ... for multi-agent training). For most algorithms, policies are synced automatically using Ray remote calls during training.
done
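A hedged usage sketch tying the two points above together (one Trainer per training session, state persisting across train() calls; config values are illustrative only):

```python
from ray.rllib.agents.ppo import PPOTrainer

# One fresh Trainer per training session; internal model state
# (weights, timestep counters, etc.) persists across train() calls.
trainer = PPOTrainer(env="CartPole-v0", config={"num_workers": 2})
for _ in range(10):
    result = trainer.train()
# For most algorithms, updated policy weights are pushed from the local
# worker to the remote workers via Ray remote calls inside train().
```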
0 for local only.

@PublicAPI
def train(self) -> ResultDict:
    """Overrides super.train to synchronize global vars."""
just to confirm, we only moved things around here, right? There are no actual logic changes?
Correct, I wanted to make sure that the order of the methods as they show up in the reference documentation makes more sense: they are now sorted by importance, with private methods, helper methods, and deprecated methods moved toward the end.
@@ -1284,7 +1096,7 @@ def compute_actions(
     Returns:
         any: The computed action if full_fetch=False, or
         tuple: The full output of policy.compute_actions() if
-            full_fetch=True or we have an RNN-based Policy.
+        full_fetch=True or we have an RNN-based Policy.
I think we should keep the 4-char indentation, since this line is still about the tuple type?
Yeah, but it messes up the Sphinx autoclass output.
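For context, the two renderings under discussion, shown as a sketch (the PR keeps the continuation at the same level because the extra 4-char indent makes Sphinx's Google-style docstring parser treat it as a nested block):

```python
# Sketch of the two docstring variants discussed above.
#
# 4-char continuation indent (mis-rendered by Sphinx autoclass):
#     tuple: The full output of policy.compute_actions() if
#         full_fetch=True or we have an RNN-based Policy.
#
# Same-level continuation (as chosen in this PR):
#     tuple: The full output of policy.compute_actions() if
#     full_fetch=True or we have an RNN-based Policy.
```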
@@ -1503,7 +1305,7 @@ def export_policy_model(self,
                         export_dir: str,
                         policy_id: PolicyID = DEFAULT_POLICY_ID,
                         onnx: Optional[int] = None):
-    """Export policy model with given policy_id to local directory.
+    """Exports policy model with given policy_id to a local directory.
what is the format of the exported model if the onnx param is None?
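A hedged usage sketch for the question above (per RLlib's docs of that era, onnx=None exports the model in the framework's native format, while an int value selects the ONNX opset; paths are illustrative):

```python
# Hedged sketch; assumes `trainer` is a built Trainer instance.
# onnx=None: export in the framework's native format
# (e.g., a TensorFlow SavedModel directory for tf policies).
trainer.export_policy_model("/tmp/my_policy")

# onnx=<int>: export as an ONNX model using that opset version.
trainer.export_policy_model("/tmp/my_policy_onnx", onnx=11)
```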
rllib/agents/trainer.py
Outdated
self.__setstate__(extra_data)

@override(Trainable)
def log_result(self, result: ResultDict):
can we explain what log_result does?
I have no idea :D
I'll add the docstring
+1
Well, actually, this overrides (super) Trainable's method, so we shouldn't add a docstring here. I'll add some comments.
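For readers following along, a rough sketch of what the override does (based on the RLlib code at the time; treat as illustrative, not authoritative):

```python
@override(Trainable)
def log_result(self, result: ResultDict):
    # Sketch of Trainer.log_result: forward the result to the user's
    # callbacks first, then let Tune's Trainable.log_result do the
    # actual logging.
    self.callbacks.on_train_result(trainer=self, result=result)
    super().log_result(result)
```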
"""Pre-evaluation callback.""" | ||
pass | ||
|
||
@DeveloperAPI |
should this live at the top right after init() maybe?
done
the policy class or None. If None is returned, will use
`default_policy` (which must be provided then).
validate_env: Optional callable to validate the generated environment
    (only on worker=0).
can we expand a little?
I would be wondering why we only validate on worker=0?
Great catch, this validation happens on all workers.
The local worker may not even have an env (by default it doesn't if we have >0 remote workers!).
done
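Following the clarified behavior, a hedged sketch of a validate_env callable (its signature in build_trainer takes the created env and its EnvContext and should raise on invalid envs; the function name and check here are hypothetical):

```python
from ray.rllib.env.env_context import EnvContext

# Hypothetical validator: runs on every worker that creates an env;
# raising here aborts setup with a descriptive error.
def validate_my_env(env, env_context: EnvContext) -> None:
    if not hasattr(env, "observation_space"):
        raise ValueError(
            f"Env on worker {env_context.worker_index} has no "
            "observation_space!")
```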
…_docstrings_cleanup_trainer
Hey @gjoliver, thanks for the review. I addressed all requested changes.
Thanks. One last question, but this looks much better.
the policy class or None. If None is returned, will use
`default_policy` (which must be provided then).
validate_env: Optional callable to validate the generated environment
    (only on worker=0).
I don't see update here?
Oh, no, sorry, forgot to address the
Docstring cleanup: Trainer, trainer_template, Callbacks.
Why are these changes needed?
Related issue number
Checks
I've run scripts/format.sh to lint the changes in this PR.