[RLlib] Fix wrong env being passed into on_episode_end callback on MultiAgentEnvRunner when sampling whole episodes.
#118