Fix: Keep reward shape and dtype the same when resetting and stepping #6

RuanJohn · 2024-01-16T08:53:02Z

What

Due to default behaviour in Jumanji, the reward was set to a single float value when the environment was reset, but stepping returned num_agents integer rewards. This PR fixes the mismatch by passing num_agents as the shape argument to the restart, termination and transition functions in Jumanji.
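For illustration, here is a minimal sketch of the mismatch and the fix. It uses NumPy stand-ins rather than Jumanji itself: the `TimeStep` class and the `restart` and `transition` helpers below are simplified, hypothetical versions written for this example, assuming only that the real functions take a `shape` argument that defaults to a scalar shape.

```python
from typing import NamedTuple, Tuple

import numpy as np


class TimeStep(NamedTuple):
    # Simplified stand-in for an environment timestep (not Jumanji's class).
    reward: np.ndarray
    discount: np.ndarray


def restart(shape: Tuple[int, ...] = ()) -> TimeStep:
    # Hypothetical helper: first timestep with zero reward and unit discount.
    return TimeStep(
        reward=np.zeros(shape, dtype=np.float32),
        discount=np.ones(shape, dtype=np.float32),
    )


def transition(reward: np.ndarray, shape: Tuple[int, ...] = ()) -> TimeStep:
    # Hypothetical helper: mid-episode timestep carrying the given reward.
    return TimeStep(reward=reward, discount=np.ones(shape, dtype=np.float32))


num_agents = 3

# Before the fix: the default shape yields a scalar reward on reset.
reset_before = restart()

# After the fix: the per-agent shape is passed explicitly.
reset_after = restart(shape=(num_agents,))
stepped = transition(
    np.zeros(num_agents, dtype=np.float32), shape=(num_agents,)
)

print(reset_before.reward.shape)  # ()
print(reset_after.reward.shape)   # (3,)
print(stepped.reward.shape)       # (3,)
```

With the shape passed in, the reset and step timesteps agree on reward and discount shapes, which matters for code that stacks or scans over timesteps.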

Extra

Added a new test that checks that all leaves in the timestep pytree have the same shapes and data types when resetting and stepping the environment.
Since discounts now also have shape (num_agents,), the test that checks discount shapes was updated as well.
Rewards are also explicitly cast to floats so that their data type stays consistent between resetting and stepping the environment.
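The consistency check described above could look roughly like the sketch below. This is not the test added in this PR: it uses a hypothetical `TimeStep` named tuple with NumPy arrays in place of the real timestep pytree, and `assert_same_shapes_and_dtypes` is a name invented for this example.

```python
from typing import NamedTuple

import numpy as np


class TimeStep(NamedTuple):
    # Simplified stand-in for the environment timestep (hypothetical).
    reward: np.ndarray
    discount: np.ndarray


def assert_same_shapes_and_dtypes(a: TimeStep, b: TimeStep) -> None:
    # Compare every field ("leaf") of the two timesteps pairwise.
    for name, x, y in zip(a._fields, a, b):
        assert x.shape == y.shape, f"{name}: {x.shape} != {y.shape}"
        assert x.dtype == y.dtype, f"{name}: {x.dtype} != {y.dtype}"


num_agents = 2

# Timestep as produced on reset: float rewards with a per-agent shape.
reset_ts = TimeStep(
    reward=np.zeros((num_agents,), dtype=np.float32),
    discount=np.ones((num_agents,), dtype=np.float32),
)

# Integer rewards, as a step might compute them, cast to float explicitly.
raw_rewards = np.array([1, 0])
step_ts = TimeStep(
    reward=raw_rewards.astype(np.float32),
    discount=np.ones((num_agents,), dtype=np.float32),
)

# Passes only because the rewards were cast to the same dtype as on reset.
assert_same_shapes_and_dtypes(reset_ts, step_ts)
```

Without the cast, the integer reward array would fail the dtype comparison even though its shape matches.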


arnupretorius

Thanks @RuanJohn 👍

RuanJohn added 2 commits January 16, 2024 10:31

test: add test to ensure timestep shapes and dtypes remain consistent over reset and step

7b0acb6

feat: pass shape to reset, transition and termination functions

a372c91

RuanJohn added the "bug" (Something isn't working) label Jan 16, 2024

RuanJohn self-assigned this Jan 16, 2024

Merge branch 'main' into fix/reset-step-reward-shape

6ca1c9a

arnupretorius approved these changes Jan 16, 2024


arnupretorius merged commit 4c5d8aa into main Jan 16, 2024
3 checks passed

RuanJohn deleted the fix/reset-step-reward-shape branch January 16, 2024 13:13
