Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Preparatory PR: Make EnvRunners use (enhanced) Connector API (#01: mostly cleanups and small fixes) #41074

Conversation

sven1977
Copy link
Contributor

@sven1977 sven1977 commented Nov 10, 2023

Preparatory PR: Make EnvRunners use (enhanced) Connector API (#1: mostly cleanups and small fixes)

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: sven1977 <[email protected]>
@@ -1932,7 +1932,10 @@ def compute_actions(
filtered_obs, filtered_state = [], []
for agent_id, ob in observations.items():
worker = self.workers.local_worker()
preprocessed = worker.preprocessors[policy_id].transform(ob)
if worker.preprocessors.get(policy_id) is not None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug fix.

@@ -319,26 +319,37 @@ def __init__(self, algo_class=None):
# If not specified, we will try to auto-detect this.
self._is_atari = None

# TODO (sven): Rename this method into `AlgorithmConfig.sampling()`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we are aiming for a the EnvRunner API as the default, we should rename/clarify some of these config settings and methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider loading a checkpoint here? Are these renaming backward compatible?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there even a story around this? Like can people even move from rllib 2+ to 3?

@@ -285,30 +285,31 @@ def __init__(self, config: RecurrentEncoderConfig) -> None:
bias=config.use_bias,
)

self.state_in_out_spec = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified (repetitive) code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this private attribute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -212,75 +212,75 @@ def get_observations(

return self._getattr_by_index("observations", indices, global_ts)

def get_actions(
def get_infos(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reordered:

  1. obs, infos (<- env.reset data)
  2. action, reward, terminated/truncated (<- other env.step results)
  3. extra model outs

gym.register(
"custom-env-v0",
partial(
if (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug fix.

@@ -690,6 +690,9 @@ def foreach_worker(
if local_worker and self.local_worker() is not None:
local_result = [func(self.local_worker())]

if not self.__worker_manager.actor_ids():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shortcut for local-worker only case.

@@ -30,7 +30,7 @@ multi-agent-cartpole-crashing-appo:
# Switch on resiliency for failed sub environments (within a vectorized stack).
restart_failed_sub_environments: true

# Switch on evaluation workers being managed by AsyncRequestsManager object.
# Switch on asynchronous handling of evaluation workers.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AsyncRequestsManager doesn't exist anymore.

@@ -205,6 +205,56 @@ def flatten_to_single_ndarray(input_):
return input_


@DeveloperAPI
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful new utility. Inverse of already existing unbatch utility.

@@ -319,26 +319,37 @@ def __init__(self, algo_class=None):
# If not specified, we will try to auto-detect this.
self._is_atari = None

# TODO (sven): Rename this method into `AlgorithmConfig.sampling()`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider loading a checkpoint here? Are these renaming backward compatible?

@@ -319,26 +319,37 @@ def __init__(self, algo_class=None):
# If not specified, we will try to auto-detect this.
self._is_atari = None

# TODO (sven): Rename this method into `AlgorithmConfig.sampling()`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there even a story around this? Like can people even move from rllib 2+ to 3?

@@ -285,30 +285,31 @@ def __init__(self, config: RecurrentEncoderConfig) -> None:
bias=config.use_bias,
)

self.state_in_out_spec = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this private attribute?

@@ -205,6 +205,56 @@ def flatten_to_single_ndarray(input_):
return input_


@DeveloperAPI
def batch(list_of_structs, individual_items_already_have_batch_1: bool = False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data types please (for input and output)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have unittest of this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done and done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also enhanced the docstring to make the example and explanations more clear.

flat = [[] for _ in range(len(flattened_item))]
for i, value in enumerate(flattened_item):
flat[i].append(value)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add:

    if item is None:
        raise ValueError("Input list_of_structs does not contain valid structs.")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

in this struct represents the batch for a single component
(in case struct is tuple/dict). Alternatively, a simple batch of
primitives (non tuple/dict) might be returned.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add

    if not list_of_structs:
        raise ValueError("Input list_of_structs is empty.")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: sven1977 <[email protected]>
@sven1977
Copy link
Contributor Author

Thanks for the review @kouroshHakha ! Waiting for tests to pass ...

@sven1977 sven1977 merged commit ca29fec into ray-project:master Nov 17, 2023
14 of 15 checks passed
@sven1977 sven1977 deleted the env_runner_support_connectors_01_minor_cleanups branch November 17, 2023 11:29
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Nov 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants