
How exactly does Alphazero's MCTS work? #64

Closed
SheldonCurtiss opened this issue Aug 17, 2021 · 1 comment

Comments

@SheldonCurtiss

Does it directly simulate actual future boards, or does it simulate predicted future boards, if that makes sense?

My understanding is that it directly simulates actual future boards. Is that correct?

@jonathan-laurent
Owner

The question is too vague to answer precisely.
AlphaZero's MCTS uses a perfect simulator of the environment to plan over possible future scenarios, so every board in the search tree is an actual board reachable under the game's rules. Which scenarios get explored, however, still depends on the neural network's heuristics (its move priors and value estimates).
In contrast, MuZero does not have access to an environment simulator during planning and explores future scenarios using a learned state-transition model.
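
To make the distinction concrete, here is a minimal Python sketch of an AlphaZero-style search on a toy game. It is not the code of this repository: the one-pile Nim game, the `network` stub (uniform priors, value 0), the `c_puct` constant, and all function names are assumptions chosen for illustration. The point to notice is that child states are produced by the exact `step` function, while the network only influences which children are visited.

```python
import math

# Hypothetical toy game: one-pile Nim. A move removes 1 or 2 stones;
# whoever takes the last stone wins.
def legal_actions(stones):
    return [a for a in (1, 2) if a <= stones]

def step(stones, action):
    """Exact simulator: returns the true next state."""
    return stones - action

def is_terminal(stones):
    return stones == 0

def network(stones):
    """Stand-in for the neural network: uniform priors and a neutral value."""
    actions = legal_actions(stones)
    return {a: 1.0 / len(actions) for a in actions}, 0.0

class Node:
    def __init__(self, state, prior):
        self.state = state
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}

    def q(self):
        # Mean value from the perspective of the player to move at this node.
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    # PUCT rule: a child's q is from the opponent's perspective, so negate it.
    def score(child):
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return -child.q() + u
    return max(node.children.values(), key=score)

def simulate(node):
    """Run one simulation; return the value for the player to move at `node`."""
    if is_terminal(node.state):
        value = -1.0  # the previous player took the last stone and won
    elif not node.children:
        # Expansion: the network supplies priors and a value estimate,
        # but the child states come from the real simulator.
        priors, value = network(node.state)
        for action, prior in priors.items():
            node.children[action] = Node(step(node.state, action), prior)
    else:
        value = -simulate(select_child(node))
    node.visits += 1
    node.value_sum += value
    return value

def mcts(root_state, n_simulations=200):
    root = Node(root_state, prior=1.0)
    for _ in range(n_simulations):
        simulate(root)
    return {action: child.visits for action, child in root.children.items()}

if __name__ == "__main__":
    print(mcts(4))  # visit counts per root action
```

In a MuZero-style planner, the call to `step` would roughly be replaced by a learned dynamics function operating on latent states, so the tree would contain predicted states rather than actual boards.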
