[FEATURE] Add support for efficient recurrent models #54

Open
6 tasks
smorad opened this issue Apr 4, 2024 · 1 comment
Labels
enhancement (New feature or request), Roadmap (On the roadmap and will be addressed in time)

Comments

smorad commented Apr 4, 2024

Feature

The paper "Revisiting Recurrent Reinforcement Learning with Memory Monoids" provides a method for combining recurrent models with standard, non-recurrent RL losses. This should enable support for recurrent models such as S5, LRU, FFM, and Linear Transformer in stoix. Note that models like LSTM or GRU would be intractable under this paradigm and would require significantly more effort to integrate into stoix.
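To illustrate why this class of models fits, here is a minimal JAX sketch, assuming the models listed share an associative state update. It is not taken from the paper or from stoix: `combine` and `linear_recurrence` are hypothetical names, and the scalar recurrence `h_t = a_t * h_{t-1} + b_t` is a toy stand-in for a real S5 or LRU layer. Because composing two such updates is associative, `jax.lax.associative_scan` can produce every state of the sequence in parallel and match a step-by-step `jax.lax.scan`.

```python
import jax
import jax.numpy as jnp


def combine(left, right):
    # Compose two affine maps h -> a * h + b, with `left` applied first.
    a_l, b_l = left
    a_r, b_r = right
    return a_r * a_l, a_r * b_l + b_r


def linear_recurrence(a, b):
    # Returns h_t for every t, assuming the state before the first step is 0.
    _, h = jax.lax.associative_scan(combine, (a, b))
    return h


a = jnp.full((6,), 0.9)    # per-step decay (toy values)
b = jnp.arange(1.0, 7.0)   # per-step input (toy values)
h_parallel = linear_recurrence(a, b)


def step(h, ab):
    # Sequential reference: the same recurrence, one step at a time.
    a_t, b_t = ab
    h = a_t * h + b_t
    return h, h


_, h_sequential = jax.lax.scan(step, jnp.zeros(()), (a, b))
print(jnp.allclose(h_parallel, h_sequential, atol=1e-5))
```

An LSTM or GRU update has no such simple associative composition as far as I know, which is presumably why those models need a different treatment.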

Proposal

  • Implement an abstract base class for recurrent models, with map_to_recurrent_state, map_from_recurrent_state, parallel_recurrent_update, initial_recurrent_state, and identity_element methods.
  • Implement a general-purpose episodic reset operator using initial_recurrent_state and identity_element methods
  • Implement one or more of S5, LRU, FFM based on the base class
  • Create a make_recurrent_loss function that wraps a non-recurrent loss function
    • This function will scan over one long contiguous sequence of observations from the replay buffer, produce Markov states, and feed those states into the wrapped non-recurrent loss function
  • Demonstrate sequence model + DQN on one or two POPGym tasks
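The first, second, and fourth items above could fit together roughly as in the sketch below. The method names come from the list; every signature, the `DecayMemory` toy model, `with_episode_resets`, and `toy_loss` are assumptions made for illustration, not existing stoix API. The sketch assumes scalar observations of shape `(T,)` and a boolean `start` flag marking the first step of each episode. It uses only `identity_element` for resets, since the initial state and the identity coincide in this toy model; the intended design may separate them.

```python
from abc import ABC, abstractmethod

import jax
import jax.numpy as jnp


class RecurrentModel(ABC):
    """Hypothetical interface: names are from the proposal, signatures are assumed."""

    @abstractmethod
    def map_to_recurrent_state(self, obs): ...

    @abstractmethod
    def map_from_recurrent_state(self, state, obs): ...

    @abstractmethod
    def parallel_recurrent_update(self, left, right): ...

    @abstractmethod
    def initial_recurrent_state(self): ...

    @abstractmethod
    def identity_element(self): ...


class DecayMemory(RecurrentModel):
    """Toy model (not S5/LRU/FFM): h_t = gamma * h_{t-1} + obs_t."""

    def __init__(self, gamma=0.5):
        self.gamma = gamma

    def map_to_recurrent_state(self, obs):
        # Each observation becomes the affine map h -> gamma * h + obs.
        return jnp.full_like(obs, self.gamma), obs

    def map_from_recurrent_state(self, state, obs):
        # Markov state: accumulated memory stacked with the raw observation.
        return jnp.stack([state[1], obs], axis=-1)

    def parallel_recurrent_update(self, left, right):
        # Compose two affine maps, with `left` applied first.
        return right[0] * left[0], right[0] * left[1] + right[1]

    def initial_recurrent_state(self):
        return self.identity_element()  # assumed to coincide in this toy model

    def identity_element(self):
        return jnp.ones(()), jnp.zeros(())


def with_episode_resets(model):
    """Wrap the model's update so memory does not leak across episode starts."""
    ident = model.identity_element()

    def update(left, right):
        (l_state, l_start), (r_state, r_start) = left, right
        # If `right` begins a new episode, discard what `left` accumulated.
        l_state = jax.tree_util.tree_map(
            lambda i, s: jnp.where(r_start, i, s), ident, l_state
        )
        return model.parallel_recurrent_update(l_state, r_state), l_start | r_start

    return update


def make_recurrent_loss(model, loss_fn):
    """Wrap a non-recurrent loss so it consumes Markov states built by the model."""

    def recurrent_loss(obs, start, *loss_args):
        elems = (model.map_to_recurrent_state(obs), start)
        states, _ = jax.lax.associative_scan(with_episode_resets(model), elems)
        markov = model.map_from_recurrent_state(states, obs)
        return loss_fn(markov, *loss_args)

    return recurrent_loss


def toy_loss(markov, targets):
    # Placeholder for a non-recurrent loss such as a DQN loss.
    return jnp.mean((markov[:, 0] - targets) ** 2)


obs = jnp.array([1.0, 2.0, 3.0, 4.0])
start = jnp.array([True, False, True, False])  # two episodes of two steps each
targets = jnp.array([1.0, 2.5, 3.0, 5.5])      # hand-computed memory values with resets
recurrent_loss = make_recurrent_loss(DecayMemory(gamma=0.5), toy_loss)
loss = recurrent_loss(obs, start, targets)
```

Folding the start flag into the scanned element keeps the wrapped update associative, so one scan can cover a long contiguous buffer slice spanning several episodes. Batched or vector-valued observations would need the `start` flag broadcast against the state shape, which this sketch does not handle.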

Testing

Show that the sequence model + DQN example can solve a few POPGym tasks

Benchmarking (Optional)

Definition of done

We have an example that can solve a few POPGym tasks

Mandatory checklist before making a PR

  • The success criteria laid down in “Definition of done” are met.
  • Code is documented - docstrings for methods and classes, static types for arguments.
  • Code is tested - unit, integration and/or functional tests are added.
  • Documentation is updated - README, CONTRIBUTING, or other documentation.
  • All functional tests are green.
  • Link experiment/benchmarking after implementation (optional).

Links / references / screenshots

smorad added the enhancement label Apr 4, 2024
EdanToledo added the Roadmap label Apr 19, 2024
EdanToledo (Owner) commented

On the roadmap and seems very exciting to have implemented. Any help would be greatly appreciated.
