Vandertic edited this page Oct 31, 2019 · 1 revision

To train variable komi networks, you have to play games with different komi values.

Doing so naively would simply produce many uninteresting, clearly unbalanced and very short games, so we introduced a branching mechanism for self-play games.

When a new self-play game is stored on the server, the server tosses a coin for each position in the game and, with small probability (currently 0.004), adds a new game to the list of required self-plays. The new game starts from that position, with the komi corrected to make the game balanced from that position.
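The per-position coin toss can be sketched as follows. This is only an illustration, not the server's actual code: the function name `pick_branch_positions` and its parameters are hypothetical, and only the probability 0.004 comes from the text above.

```python
import random

# Branching probability quoted in the text; everything else is illustrative.
BRANCH_PROBABILITY = 0.004


def pick_branch_positions(num_positions, rng, p=BRANCH_PROBABILITY):
    """Return the indices of the positions chosen as branch points.

    Each position is selected independently with probability p,
    which mirrors one coin toss per position of the stored game.
    """
    return [i for i in range(num_positions) if rng.random() < p]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    # A 250-move game yields one branch on average (250 * 0.004 = 1).
    print(pick_branch_positions(250, rng))
```

With these numbers a game of 250 positions spawns one branched game on average, so the pool of required self-plays grows at roughly the rate at which games are stored.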

To do so, the server reads the comment attached to that position in the SGF of the originating game, where the first number stored is the final score estimate. The estimate was computed by the client that played the originating game, as the online median of the net's alpkt over the MCTS tree.
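Reading the first number out of a comment could look like the sketch below. The exact layout of the comment is not described here, so the sample strings and the helper name `first_number` are assumptions made purely for illustration.

```python
import re

# Matches an optionally signed integer or decimal number.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def first_number(comment):
    """Return the first number found in an SGF comment, or None.

    Assumption: the score estimate is written as a plain decimal
    number; the real comment format may differ.
    """
    match = _NUMBER.search(comment)
    return float(match.group()) if match else None


if __name__ == "__main__":
    # Hypothetical comment text, not taken from a real game record.
    print(first_number("3.2, 0.55, 120"))
```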

The final score estimate is then simply rounded to the nearest integer and added to the komi of the originating game.
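As a small worked example of this correction, here is a minimal sketch. The function name `branched_komi` is hypothetical, and how exact ties (estimates ending in .5) are rounded is not stated above, so the use of Python's built-in `round` is an assumption.

```python
def branched_komi(original_komi, score_estimate):
    """Komi for a branched game: original komi plus the rounded estimate.

    Assumption: ties are handled by Python's round(), which rounds
    halves to the nearest even integer; the server may differ.
    """
    return original_komi + round(score_estimate)


if __name__ == "__main__":
    # An estimate of 3.4 rounds to 3, so a komi of 7.5 becomes 10.5.
    print(branched_komi(7.5, 3.4))
    # An estimate of -2.6 rounds to -3, so a komi of 7.5 becomes 4.5.
    print(branched_komi(7.5, -2.6))
```

Because the correction is a whole number, a half-integer komi stays half-integer in the branched game.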
