[RLlib] Cleanup examples/ folder 03. #44559

Merged
merged 22 commits into from
Apr 9, 2024

@sven1977 sven1977 commented Apr 8, 2024

Cleanup examples/ folder (vol. 03).

The planned, final structure for the examples folder of RLlib will be as follows:

rllib/examples/
    algorithms/ (custom algo examples)
    catalogs/ (catalog examples)
    checkpoints/ (examples on exporting models, algos, env-runners, etc.)
    connectors/ (examples for custom ConnectorV2)
    curriculum/ (curriculum learning)
    debugging/ (tips and tricks for debugging)
    envs/ (example scripts for env handling AND example classes)
    evaluation/ (how to set up evaluation workers)
    gpus/ (GPU training)
    hierarchical/ (hierarchical RL)
    inference/ (examples on how to do inference after training)
    learners/ (example scripts for Learner handling AND example Learner classes)
    multi_agent/ (multi-agent RL examples and classes)
    offline_rl/ (offline RL examples)
    ray_serve/ (RLlib with Ray Serve)
    ray_tune/ (RLlib with Ray Tune)
    rl_modules/ (example scripts for RLModule handling AND example custom RLModule classes)
  • All scripts that are currently sitting in the root examples directory will move into one of the above sub-folders.
  • Moved scripts that have been part of RLlib for a long time will remain as-is, BUT create error messages pointing the user to the new location. In a few major releases (before 3.0), we will completely remove these stubs.
  • A temporary sub-folder (_old_api_stack) has been created to host only those scripts that show things that are relevant to the old API stack only. This folder will also eventually disappear (before 3.0).
  • The current sub-folders: old_api_stack, catalog, env, export, inference_and_serving, learner, policy, rl_module, and serving will all be removed soon (replaced by the plural naming (e.g. env -> envs) or a cleaner name (e.g. export -> checkpoints) or replaced altogether (e.g. policy)).
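
The stub behavior described above (a script left at its old location that only raises an error pointing to the new one) could look roughly like the sketch below. This is an illustration, not RLlib's actual implementation: `RENAMED_FOLDERS` only lists the two renames named in this description, `new_location` and `raise_moved_error` are hypothetical helpers, and the script names used with them are made up.

```python
from pathlib import PurePosixPath

# Only the two renames spelled out in the PR description; this is not a
# complete mapping of the examples folder.
RENAMED_FOLDERS = {"env": "envs", "export": "checkpoints"}


def new_location(old_path: str) -> str:
    """Map an old examples path to its new location (hypothetical helper)."""
    parts = list(PurePosixPath(old_path).parts)
    # Expect paths of the form rllib/examples/<folder>/<script>.py
    if len(parts) >= 4 and parts[:2] == ["rllib", "examples"]:
        parts[2] = RENAMED_FOLDERS.get(parts[2], parts[2])
    return str(PurePosixPath(*parts))


def raise_moved_error(old_path: str) -> None:
    """What a stub left at the old location could do when it is run."""
    raise ImportError(
        f"`{old_path}` has moved to `{new_location(old_path)}`. "
        "Please update your imports or command line."
    )
```

Raising instead of silently redirecting matches the plan stated above: long-lived scripts keep a stub that fails loudly until the stubs are removed.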

TODOs (in several follow-up PRs):

  • Many example scripts (even those already moved into the proper sub-folders) still only work on the old API stack and will have to be "translated" into the new API stack.
  • Some example scripts - for this PR - remain in the root rllib.examples folder for now until we figure out their final location (e.g. parametric_actions_cartpole.py). Most of these are related to action spaces or action manipulation and will most likely move into the connectors folder, but we'll see.

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: sven1977 <[email protected]>
…nup_examples_folder_03

# Conflicts:
#	rllib/examples/multi_agent/multi_agent_pendulum.py

and wip

@sven1977 sven1977 requested a review from a team as a code owner April 8, 2024 10:59
Collaborator

@simonsays1980 simonsays1980 left a comment

LGTM. I think that labeling some examples as "only-old-stack" or putting them into a folder "old-stack" helps users to distinguish which examples can be executed in the new stack.

}

class RestoreWeightsCallback(DefaultCallbacks):
    def on_algorithm_init(self, *, algorithm: "Algorithm", **kwargs) -> None:
Collaborator

Is this callback actually called before the weights are synced?

Contributor Author

Great question. I'll check this and fix, if necessary. ...

Contributor Author

This callback is called after the initial weights initialization of all workers (from the randomly initialized learners).

Contributor Author

So this should be fine.

  • First we initialize the algo randomly (and sync all the workers).
  • Then we override only(!) the 1st policy with the given weights.
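
The two steps above can be pictured with a small toy model. It is not RLlib code: weights are plain floats, there are two stand-in workers, and `init_and_restore` is a hypothetical function. Applying the override to the worker copies as well is an assumption made for this sketch.

```python
import random


def init_and_restore(policy_ids, restored_weights, restore_id, seed=0):
    """Toy model of the initialization order described above.

    1. All policies are initialized randomly and synced to every worker.
    2. Only the policy `restore_id` is then overridden with given weights.
    """
    rng = random.Random(seed)
    # Step 1: random initialization on the learner side ...
    learner_weights = {pid: rng.random() for pid in policy_ids}
    # ... followed by the initial sync to all workers (two toy workers here).
    workers = [dict(learner_weights) for _ in range(2)]
    # Step 2: the callback overrides exactly one policy and leaves the rest.
    learner_weights[restore_id] = restored_weights
    for worker in workers:
        worker[restore_id] = restored_weights
    return learner_weights, workers
```

Because the override happens after the initial sync, the restored policy is not overwritten by random weights, while every other policy keeps its synced random initialization.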

)


class GuessTheNumberGame(MultiAgentEnv):
Collaborator

Do we have a multi-agent environment that uses __common__ infos?

Contributor Author

Good question. We should make this a separate (new) example script, then.
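
For context, here is a minimal sketch of what an infos dict with a `__common__` entry could look like. It assumes `__common__` is the reserved key for information that is not tied to a single agent; the agent IDs, the payload, and both helper functions are hypothetical and only illustrate the data layout, not an actual environment.

```python
COMMON_KEY = "__common__"


def build_infos(agent_infos, common_info):
    """Return an infos dict with per-agent entries plus one shared entry."""
    infos = dict(agent_infos)
    infos[COMMON_KEY] = common_info
    return infos


def split_infos(infos):
    """Separate the shared entry from the per-agent entries."""
    common = infos.get(COMMON_KEY, {})
    per_agent = {k: v for k, v in infos.items() if k != COMMON_KEY}
    return per_agent, common


# Example: two hypothetical agents and one piece of shared information.
infos = build_infos(
    {"agent_0": {"guesses": 1}, "agent_1": {"guesses": 0}},
    {"round": 3},
)
```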

@@ -0,0 +1,207 @@
# TODO (sven): Move this example script into the new API stack.
Collaborator

Afaics this example is only for old stack (using exploration). Can we put old stack examples under a specific folder old stack?

I see users trying to use exploration algorithms with our new stack and failing with this combination.

Contributor Author

Well, we'll have to translate it into the new stack (using RLModule instead of the old Exploration API).

    )
    .framework(args.framework)
    .resources(
        # How many GPUs does the local worker (driver) need? For most algos,
Collaborator

Can we also put an example for multiple learner workers here? It is still ambiguous how exactly to set up multiple learner workers and assign GPUs to them.

Contributor Author

Added a TODO. Sorry, I'm trying to keep this PR completely free of actual code changes.
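
Until that TODO is addressed, the request could be sketched as below. The keyword names `num_learner_workers` and `num_gpus_per_learner_worker` are assumed to be the `resources()` arguments in Ray releases around the time of this PR; newer releases may use different names, so check them against the installed version. The values and `total_learner_gpus` are hypothetical, and the actual config call is left as a comment because it needs Ray.

```python
# Assumed keyword names for `AlgorithmConfig.resources()`; verify against
# your installed Ray version before relying on them.
learner_resources = {
    "num_learner_workers": 2,          # two remote Learner workers
    "num_gpus_per_learner_worker": 1,  # one GPU for each of them
}


def total_learner_gpus(res):
    """GPUs requested by all remote Learner workers together.

    Only meaningful for `num_learner_workers >= 1`; the local-learner case
    is not modeled here.
    """
    return res["num_learner_workers"] * res["num_gpus_per_learner_worker"]


# Hypothetical usage (requires Ray, not executed here):
# config = PPOConfig().resources(**learner_resources)
```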

@@ -0,0 +1,188 @@
# TODO (sven): Move this example script into the new API stack.
Collaborator

Here as well: Is this possible with the new stack, and if not, can we put this example under a specific folder named old_stack?

Contributor Author

There is a new folder called _old_api_stack, but I would like to only put scripts in there that don't need to be translated to the new API stack b/c whatever is demo'd in these will no longer be supported (or relevant).

All the other scripts currently still on the old stack (that are NOT in that folder) will have to be translated.

I agree with you that we should mark all these differences more clearly.
Suggestion:

  • Old-API stack scripts that will NOT be supported in the future (those in _old_api_stack/): add deprecation_warning to top explaining that these scripts will be deprecated fully (w/o replacement)
  • Old-API stack scripts that we will translate (those NOT in _old_api_stack/): add warning to top explaining that we are about to translate these to the new stack (and will retire/remove their old stack version).
  • New API stack: Leave as-is (this should be the new normal).
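
The three cases in this suggestion could be expressed as a small helper that picks the notice to print at the top of a script. This is only a sketch of the proposal: `header_notice` is hypothetical, the message wording is made up, and whether a script still uses the old stack is passed in as a flag rather than detected.

```python
from typing import Optional


def header_notice(script_path: str, uses_old_stack: bool) -> Optional[str]:
    """Return the notice for the top of an example script, or None."""
    if "_old_api_stack/" in script_path:
        # Case 1: will not be supported in the future, no replacement planned.
        return (
            "DEPRECATED: this script only applies to the old API stack and "
            "will be removed without replacement."
        )
    if uses_old_stack:
        # Case 2: still on the old stack, but scheduled for translation.
        return (
            "WARNING: this script still uses the old API stack. It will be "
            "translated to the new API stack and the old version retired."
        )
    # Case 3: new API stack scripts stay as-is.
    return None
```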

Contributor Author

Yes, it should be possible. This is not even new stack specific. We just have to use PPO/DQNRLModule instead of PPO/DQNPolicy.

doc/source/rllib/rllib-env.rst (outdated, resolved)
doc/source/rllib/rllib-models.rst (outdated, resolved)
doc/source/rllib/rllib-models.rst (outdated, resolved)
sven1977 and others added 4 commits April 9, 2024 07:25
Signed-off-by: sven1977 <[email protected]>
Co-authored-by: angelinalg <[email protected]>
Signed-off-by: Sven Mika <[email protected]>
Signed-off-by: sven1977 <[email protected]>
@sven1977 sven1977 merged commit 1d69833 into ray-project:master Apr 9, 2024
4 of 5 checks passed
@sven1977 sven1977 deleted the cleanup_examples_folder_03 branch April 9, 2024 10:07
3 participants