We trained Mava's recurrent systems on eight SMAX scenarios and compared the outcomes to the final win rates reported by Rutherford et al. (2023). To ensure a fair comparison, we also trained Mava's systems for up to 10 million timesteps with 64 vectorised environments.
Scenarios: 2s3z, 3s_vs_5z, 3s5z_vs_3s6z, 3s5z, 5m_vs_6m, 6h_vs_8z, 10m_vs_11m, 27m_vs_30m.
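To make the training budget concrete, the following minimal sketch works out how many timesteps each vectorised environment contributes. It assumes the 10 million timesteps are counted across all 64 environments combined, which the text does not state explicitly. The names `SCENARIOS`, `TOTAL_TIMESTEPS`, `NUM_ENVS`, and `steps_per_env` are hypothetical and are not Mava configuration keys.

```python
# Illustrative sketch only: these names are hypothetical, not Mava's config keys.
SCENARIOS = [
    "2s3z", "3s_vs_5z", "3s5z_vs_3s6z", "3s5z",
    "5m_vs_6m", "6h_vs_8z", "10m_vs_11m", "27m_vs_30m",
]
TOTAL_TIMESTEPS = 10_000_000  # training budget stated in the text
NUM_ENVS = 64                 # number of vectorised environments


def steps_per_env(total_timesteps: int, num_envs: int) -> int:
    """Timesteps each environment contributes.

    Assumes the budget is the total summed over all environments.
    """
    return total_timesteps // num_envs


if __name__ == "__main__":
    per_env = steps_per_env(TOTAL_TIMESTEPS, NUM_ENVS)
    for scenario in SCENARIOS:
        print(f"{scenario}: {per_env} timesteps per environment")
```

If the budget were instead counted per environment, the total number of environment steps would be 64 times larger, so it may be worth confirming which convention applies before comparing results.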