[FEATURE] Add stochastic muzero implementation #77

Open
ipsec opened this issue May 8, 2024 · 6 comments · May be fixed by #78
Labels
enhancement New feature or request Roadmap On the roadmap and will be addressed in time

Comments

ipsec commented May 8, 2024

Add a Stochastic MuZero implementation (see the paper and the pseudocode).

With this improved version of MuZero, Stoix could train agents on stochastic environments such as the game 2048 and poker (Leduc poker).

@ipsec ipsec added the enhancement New feature or request label May 8, 2024
@ipsec ipsec changed the title from "[FEATURE]" to "[FEATURE] Add stochastic muzero implementation" May 9, 2024
@EdanToledo EdanToledo added the Roadmap On the roadmap and will be addressed in time label May 9, 2024
EdanToledo (Owner) commented

Hey, this is on the roadmap; however, I don't have any immediate plans to implement it. If you'd like to give it a shot, I'd be more than happy to review it and assist with development. Otherwise, it might be a while until this is implemented.

ipsec (Author) commented May 9, 2024

Let me try, then. I had a little difficulty with the loss function. If you could help me with that part, it would be great.

@ipsec ipsec linked a pull request May 15, 2024 that will close this issue
ipsec (Author) commented May 15, 2024

@EdanToledo PR #78 created.
As I said, I'm having difficulty with the loss function, so a careful review is necessary.

EdanToledo (Owner) commented

Hey, I haven't forgotten about this. Sorry, it's an important PR and I will hopefully get to it ASAP.

@EdanToledo EdanToledo linked a pull request Jun 15, 2024 that will close this issue
ipsec (Author) commented Sep 5, 2024

Hey Edan, is there anything else I could help with to get this implemented?

Regards.

EdanToledo (Owner) commented

Hey Fernando, I'm sorry about the delay; I just haven't had time to complete something like this. Stochastic MuZero is a non-trivial algorithm that I would need to understand well to ensure it is implemented correctly. Currently, I haven't had much time for non-priority features. I promise I will get around to this at some point, but I really don't have an ETA. Ideally, if there were more contributors and maintainers on this project, it would be easier.

2 participants