ConvMixer Weights

@tmp-iclr tmp-iclr released this 09 Oct 18:13
· 10 commits to main since this release

We provide weights for:

  • ConvMixer-1536/20 (k = 9, p = 7)
  • ConvMixer-768/32 (k = 7, p = 7)
    • IMPORTANT: This model used ReLU instead of GELU.
    • To use these weights, you currently need to change nn.GELU() to nn.ReLU() in convmixer.py; we will fix this later.
  • ConvMixer-1024/20 (k = 9, p = 14)
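
As a rough illustration, the three released configurations can be written down as plain data, so that the activation mismatch for ConvMixer-768/32 is explicit rather than a manual edit. This is only a sketch: `ConvMixerConfig`, `RELEASED`, and `grid_size` are hypothetical names that do not exist in convmixer.py, the model names are read as ConvMixer-(hidden dimension)/(depth), and the 224-pixel input used in the loop is an assumption that the release notes do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvMixerConfig:
    """Hypothetical container for one released model's hyperparameters."""
    dim: int          # hidden dimension (first number in the model name)
    depth: int        # number of ConvMixer blocks (second number in the name)
    kernel_size: int  # depthwise convolution kernel size (k)
    patch_size: int   # patch embedding size, assumed equal to its stride (p)
    activation: str = "GELU"  # name of the torch.nn activation class to use

    def grid_size(self, image_size: int) -> int:
        """Patches per side for a square input, assuming stride == patch_size."""
        return image_size // self.patch_size


# Values copied from the list above; only ConvMixer-768/32 overrides the activation.
RELEASED = {
    "ConvMixer-1536/20": ConvMixerConfig(1536, 20, 9, 7),
    "ConvMixer-768/32": ConvMixerConfig(768, 32, 7, 7, activation="ReLU"),
    "ConvMixer-1024/20": ConvMixerConfig(1024, 20, 9, 14),
}

for name, cfg in RELEASED.items():
    # 224 is an assumed input resolution, not taken from the release notes.
    print(name, cfg.activation, cfg.grid_size(224))
```

With a table like this, a loader could pick the activation by name (for example via `getattr(torch.nn, cfg.activation)`) instead of requiring a source edit, though whether convmixer.py will take that approach is not stated here.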