[RLlib] SimpleQ PolicyV2 (sub-classing). #25871

sven1977 · 2022-06-16T21:26:56Z

SimpleQ PolicyV2 (sub-classing).

Why are these changes needed?

Related issue number

Checks

- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

gjoliver

Looks really good, thanks.
Just one question, and one unit test seems to fail. It should be easy to fix:

ImportError: cannot import name 'SimpleQTFPolicy' from 'ray.rllib.algorithms.simple_q.simple_q_tf_policy'

gjoliver · 2022-06-17T01:29:06Z

rllib/policy/tf_policy.py

-                if SampleBatch.PREV_REWARDS in input_dict:
-                    builder.add_feed_dict(
-                        {self._prev_reward_input: input_dict[SampleBatch.PREV_REWARDS]}
+        if hasattr(self, "_input_dict"):


Can you explain the changes a little bit?
It seems you removed the state_batches from feed_dict, and I am not sure why.

sven1977 · 2022-06-17

It has been soft-deprecated since the trajectory view API was introduced. You shouldn't call this method anymore with anything other than an input_dict, which gives you flexibility in the field names, e.g. you could have a "prev_10_actions" field in your input_dict (this was not supported before due to the hardcoded prev_action|reward schema).

gjoliver · 2022-06-17

Wow, OK, it should have been deprecated a long time ago then.
Cool.
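
To illustrate the point about field names in the thread above, here is a minimal sketch of an input_dict-driven feed. It is not RLlib's actual code: `build_feed_dict`, the string placeholder names, and the sample values are all hypothetical, and plain strings stand in for TF placeholders so that the snippet runs without TensorFlow or Ray installed.

```python
from typing import Any, Dict


def build_feed_dict(
    placeholders: Dict[str, str], input_dict: Dict[str, Any]
) -> Dict[str, Any]:
    """Feed every input_dict field that has a registered placeholder.

    Hypothetical helper: nothing is hardcoded to prev_actions or
    prev_rewards, so any field name is handled the same way.
    """
    feed = {}
    for key, value in input_dict.items():
        if key in placeholders:
            feed[placeholders[key]] = value
    return feed


# Hypothetical placeholder registry (strings stand in for TF placeholders).
placeholders = {
    "obs": "ph_obs",
    "prev_rewards": "ph_prev_rewards",
    "prev_10_actions": "ph_prev_10_actions",
}

# An input_dict with a custom field and no prev_rewards entry.
input_dict = {"obs": [[0.1, 0.2]], "prev_10_actions": [[0] * 10]}

feed = build_feed_dict(placeholders, input_dict)
```

With a hardcoded prev_action|reward schema, the "prev_10_actions" field would need its own special case. Here it is picked up simply because a placeholder is registered under that name, and the missing "prev_rewards" field is skipped.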

sven1977 · 2022-06-17T06:47:24Z

Sorry, I had forgotten to push the TF fixes. This is done now. Let's wait for the tests to pass, then I'll ping you again. Thanks for reviewing, @gjoliver.

gjoliver

This is good! That CQL test failure is not related. I can't figure out why test_cql times out in CI.

wip

db9ca25

sven1977 requested review from gjoliver, avnishn, ArturNiederfahrenhorst, smorad, maxpumperla, kouroshHakha and krfricke as code owners June 16, 2022 21:26

sven1977 assigned gjoliver Jun 16, 2022

LINT

0533d23

gjoliver reviewed Jun 17, 2022

wip

94d34c7

wip

a12c454

sven1977 mentioned this pull request Jun 17, 2022

[RLlib] DQN PolicyV2 sub-classing. #25887

Closed

6 tasks

Merge branch 'master' of https://github.com/ray-project/ray into simp…

f0205f1

…leq_dqn_policy_sub_classes_2nd_attempt

gjoliver approved these changes Jun 17, 2022

sven1977 merged commit d90c6cf into ray-project:master Jun 17, 2022

sven1977 deleted the simpleq_dqn_policy_sub_classes_2nd_attempt branch June 2, 2023 20:18
