[RLlib; RELEASE BLOCKER] Fix Policy server/client (currently broken and not caught by tests!) #30526

sven1977 · 2022-11-21T13:14:54Z

Fix Policy server/client (currently broken and not caught by tests!)

Fixes the broken creation of the local-inference RolloutWorker on a) the client side (inference mode "local") and b) the server side (inference mode "remote").
Enhances the test cases to check for learning success. The current tests don't even notice if one or more client processes crash due to a failed connection or a failed server.

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

avnishn · 2022-11-21T15:24:22Z

rllib/env/policy_client.py

+    kwargs["config"] = kwargs["config"].copy(copy_frozen=False)
+    config = kwargs["config"]
+    config.output = None
+    config.input_ = "sampler"


LGTM. I was thinking that users might need access to other input types, but I guess nobody would use policy_client if they weren't doing environment sampling.

Yeah, that, too. But also note that this only applies to the extra RolloutWorker that we create either on the client (inference mode "local") or on the server (inside the PolicyServerInput object!) for inference mode "remote".
The whole design is completely flawed imo, but we'll have to fix that separately; it's beyond the scope of this quick-fix PR. This PR does NOT touch the original (bad) design.
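
For readers following the thread, here is a minimal sketch of the pattern in the hunk above: copy the config before overriding its input/output settings, so the caller's config stays untouched. `SketchConfig` and `inference_worker_kwargs` are hypothetical stand-ins written for this illustration, not RLlib classes or functions, and `copy.copy` stands in for the `copy(copy_frozen=False)` call shown in the diff.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for an algorithm config (not RLlib's class)."""

    input_: str = "sampler"
    output: Optional[str] = None


def inference_worker_kwargs(kwargs: dict) -> dict:
    """Build kwargs for the extra inference worker without mutating the caller's config."""
    new_kwargs = dict(kwargs)
    # Work on a copy so the overrides below do not leak back to the caller.
    config = copy.copy(kwargs["config"])
    # Mirror the two overrides from the diff: no output, input set to "sampler".
    config.output = None
    config.input_ = "sampler"
    new_kwargs["config"] = config
    return new_kwargs


# Assumed example values, chosen only to show that the original stays unchanged.
server_config = SketchConfig(input_="policy_server", output="/tmp/out")
worker_kwargs = inference_worker_kwargs({"config": server_config})
```

The point of the copy is that `server_config` keeps its own `input_` and `output` values while the worker's config gets the overrides.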

avnishn · 2022-11-21T15:27:21Z

rllib/env/policy_server_input.py

@@ -64,7 +65,7 @@ class PolicyServerInput(ThreadingMixIn, HTTPServer, InputReader):
    """

    @PublicAPI
-    def __init__(self, ioctx, address, port, idle_timeout=3.0):
+    def __init__(self, ioctx, address, port, idle_timeout=3.0, use_json=False):


You've added this as an option, but is it used anywhere in the tests/examples now?

Good catch, will remove.
Leftover from my other work, which made me discover this bug in the first place :)

avnishn · 2022-11-21T15:35:12Z

You mention that "The current tests don't even notice if one or more client processes crash due to a failed connection/failed server."

But I'm not sure how the tests, as revised, handle this issue.
Can you point to the lines in your PR where this is handled?

kouroshHakha

left a couple of questions

kouroshHakha · 2022-11-21T17:27:13Z

rllib/env/policy_server_input.py

                setup_child_rollout_worker()
                assert inference_thread.is_alive()
                response["episode_id"] = child_rollout_worker.env.start_episode(
                    args["episode_id"], args["training_enabled"]
                )
-            elif command == Commands.GET_ACTION:
+            elif (


This is very hacky :). Aren't command and Commands.GET_ACTION always strings?

Use StrEnum instead?
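
A small sketch of what the StrEnum suggestion would buy, assuming the command arrives as a plain string. The two enum classes below are hypothetical and defined only for this illustration; the truncated condition in the diff above is not reproduced here. A `str` mixin is used instead of `enum.StrEnum` so the snippet also runs on Python versions before 3.11.

```python
from enum import Enum


class PlainCommands(Enum):
    """Hypothetical plain enum: members do not compare equal to strings."""

    GET_ACTION = "GET_ACTION"


class StrCommands(str, Enum):
    """Hypothetical str-mixin enum: members compare equal to their string value."""

    GET_ACTION = "GET_ACTION"


# Assumed input: a command name that was decoded from a request as a plain str.
command = "GET_ACTION"

plain_match = command == PlainCommands.GET_ACTION  # False: str vs. plain Enum member
str_match = command == StrCommands.GET_ACTION      # True: member is itself a str
```

With a string-valued enum, a single `command == Commands.GET_ACTION` comparison would cover both the enum member and its string form, which is presumably what the reviewer has in mind.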

kouroshHakha · 2022-11-21T17:36:44Z

rllib/evaluation/rollout_worker.py

@@ -457,6 +455,8 @@ def __init__(
        global _global_worker
        _global_worker = self

+        from ray.rllib.algorithms.algorithm_config import AlgorithmConfig


we need to kill the cyclic dependency chain after release to avoid these types of imports.

kouroshHakha · 2022-11-21T17:39:59Z

rllib/examples/serving/cartpole_server.py

+
+        if args.as_test:
+            print("Checking if learning goals were achieved")
+            check_learning_achieved(results, args.stop_reward)


Is this the main fix for the unit test, i.e., checking whether learning up to min_reward_threshold was achieved?
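
To make the idea behind the question concrete, here is a rough sketch of a learning check that fails loudly when the reward target is not reached. `check_reward_reached` is a hypothetical stand-in written for this illustration, not RLlib's `check_learning_achieved`, and the result format (a list of dicts with an `episode_reward_mean` key) is an assumption.

```python
from typing import Dict, List


def check_reward_reached(results: List[Dict[str, float]], min_reward: float) -> float:
    """Raise ValueError unless at least one result reaches `min_reward`.

    Hypothetical helper; `results` is assumed to be a non-empty list of
    per-iteration dicts carrying an "episode_reward_mean" entry.
    """
    best = max(r["episode_reward_mean"] for r in results)
    if best < min_reward:
        raise ValueError(f"best reward {best} is below the target {min_reward}")
    return best


# Assumed example data: a run that learns and a run that stalls.
learning_run = [{"episode_reward_mean": 20.0}, {"episode_reward_mean": 95.0}]
stalled_run = [{"episode_reward_mean": 20.0}, {"episode_reward_mean": 22.0}]

best_reward = check_reward_reached(learning_run, min_reward=80.0)

try:
    check_reward_reached(stalled_run, min_reward=80.0)
    stalled_run_failed = False
except ValueError:
    stalled_run_failed = True
```

A check like this turns "training ran without raising" into "training reached the target", so a server that never receives samples from its clients would be expected to fail the test rather than pass silently.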

kouroshHakha

approved.

sven1977 added 5 commits November 18, 2022 15:29

wip

18c5b43

Signed-off-by: sven1977 <[email protected]>

Merge branch 'master' of https://github.com/ray-project/ray into poli…

c8a6a15

…cy_server_enhancements

Merge branch 'master' of https://github.com/ray-project/ray into poli…

624b75b

…cy_server_enhancements

wip

8a36bd7

Signed-off-by: sven1977 <[email protected]>

wip

c64d2c8

Signed-off-by: sven1977 <[email protected]>

sven1977 requested review from gjoliver, avnishn, ArturNiederfahrenhorst, smorad, maxpumperla, kouroshHakha and krfricke as code owners November 21, 2022 13:14

LINT

a6d5b17

Signed-off-by: sven1977 <[email protected]>

sven1977 assigned gjoliver Nov 21, 2022

wip

47a7b1a

Signed-off-by: sven1977 <[email protected]>

zhe-thoughts added the release-blocker (P0 Issue that blocks the release) and P0 (Issues that should be fixed in short order) labels Nov 21, 2022

Merge branch 'master' of https://github.com/ray-project/ray into poli…

8d249f6

…cy_server_enhancements

avnishn approved these changes Nov 21, 2022

avnishn reviewed Nov 21, 2022

sven1977 added 2 commits November 21, 2022 20:27

Merge branch 'master' of https://github.com/ray-project/ray into poli…

9d47649

…cy_server_enhancements

wip

008da5b

Signed-off-by: sven1977 <[email protected]>

kouroshHakha reviewed Nov 21, 2022

kouroshHakha approved these changes Nov 21, 2022

kouroshHakha added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 21, 2022

sven1977 merged commit ba18004 into ray-project:master Nov 21, 2022

WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022

[RLlib; RELEASE BLOCKER] Fix Policy server/client (currently broken a…

5b40a53

…nd not caught by tests) (ray-project#30526) Signed-off-by: Weichen Xu <[email protected]>
