[Bug][RLlib] "Samples must consistently provide or omit max_seq_len" with PolicyServerInput/PolicyClient and RNNs #23639
Labels
bug: Something that is supposed to be working; but isn't
P1: Issue that should be fixed within a few weeks
rllib: RLlib related issues
rllib-client-server: Issue related to RLlib's client/server API.
Ray Component
RLlib
Issue Severity
High: It blocks me to complete my task.
What happened + What you expected to happen
Training through PolicyClient/PolicyServerInput repeatedly fails with the error in the title for (at least) IMPALA with an RNN model and for R2D2. The error does not appear with PPO.
The check in ray\rllib\policy\sample_batch.py that raises this error is causing the issue: it fails because max_seq_len is None (s.max_seq_len is fine).
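To illustrate the kind of condition described here, the following is a minimal sketch that does not use RLlib at all. `ToyBatch` and `check_consistent_max_seq_len` are hypothetical names, and the assumption that the reference `max_seq_len` is taken from the first batch is mine; the actual code in sample_batch.py may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyBatch:
    """Hypothetical stand-in for a sample batch; not RLlib's SampleBatch."""
    count: int
    max_seq_len: Optional[int] = None


def check_consistent_max_seq_len(samples: List[ToyBatch]) -> None:
    # Assumption: the reference value is read from the first batch.
    max_seq_len = samples[0].max_seq_len
    for s in samples:
        if s.count == 0:
            continue  # Assumption: empty batches are ignored.
        # Fail when exactly one of the two values is None.
        if (s.max_seq_len is None) != (max_seq_len is None):
            raise ValueError(
                "Samples must consistently provide or omit max_seq_len"
            )


# The reference max_seq_len is None while a later batch carries a value,
# which matches the situation reported in this issue.
try:
    check_consistent_max_seq_len([ToyBatch(10), ToyBatch(10, max_seq_len=20)])
    failed = False
except ValueError:
    failed = True
```

In this sketch, batches that all set `max_seq_len`, or all omit it, pass the check; only the mixed case raises.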
Removing the check causes no further problems, and training runs.
Versions / Dependencies
Windows 10
Python 3.9.7
ray==1.11.0 (should also appear on master branch)
The issue appears with ray > 1.9.2.
Reproduction script
I modified the cartpole_server script provided in the examples to run with an LSTM and with other algorithms. I also changed the hyperparameters for fast reproduction.
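As a rough sketch of the kind of change described, and not the actual script from this report, a server-side config that enables an LSTM might look like the following. `build_server_config` and `placeholder_input` are hypothetical helpers, and the concrete values are assumptions; in a real server script the `input` entry would typically be a callable that constructs a `PolicyServerInput`.

```python
def build_server_config(input_factory, use_lstm=True, max_seq_len=20):
    """Return a trainer config dict for an external-input (server) setup."""
    return {
        # No local environment: experiences arrive from the client.
        "env": None,
        # Callable that creates the input reader for a worker.
        "input": input_factory,
        # Collect samples in the local worker only.
        "num_workers": 0,
        # Disable off-policy estimation on the incoming data.
        "input_evaluation": [],
        # Wrap the default model with an LSTM (values are illustrative).
        "model": {"use_lstm": use_lstm, "max_seq_len": max_seq_len},
    }


def placeholder_input(ioctx):
    # Hypothetical placeholder; a real script would build its server
    # input here instead of raising.
    raise NotImplementedError("replace with a real input factory")


config = build_server_config(placeholder_input)
```

The same dict could then be passed to whichever algorithm is selected with `--run`.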
cartpole_server.py
cartpole_client.py:
Run the two scripts in two separate terminals.
NOTE: the issue also appears with --run R2D2; PPO is fine.
Anything else
Related issue: #20704