This repo is the official implementation of our ICLR 2022 paper "AS-MLP: An Axial Shifted MLP Architecture for Vision" (arXiv).
Network | Resolution | Top-1 (%) | Params | FLOPs | Throughput (images/s) | Model |
---|---|---|---|---|---|---|
AS-MLP-T | 224x224 | 81.3 | 28M | 4.4G | 1047 | onedrive |
AS-MLP-S | 224x224 | 83.1 | 50M | 8.5G | 619 | onedrive |
AS-MLP-B | 224x224 | 83.3 | 88M | 15.2G | 455 | onedrive |
- Clone this repo:
git clone https://github.com/svip-lab/AS-MLP
cd AS-MLP
- Create a conda virtual environment and activate it:
conda create -n asmlp python=3.7 -y
conda activate asmlp
- Install CUDA==10.1 with cudnn7 following the official installation instructions.
- Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch
- Install timm==0.3.2:
pip install timm==0.3.2
- Install cupy-cuda101:
pip install cupy-cuda101
- Install Apex:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
- Install other requirements:
pip install opencv-python==4.4.0.46 termcolor==1.1.0 yacs==0.1.8
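After the steps above, it can help to confirm that the environment matches the pinned versions before training. The following is a minimal sketch, not part of this repo: the helper names `installed_version` and `check_environment` are hypothetical, and the distribution names (`torch`, `apex`, and so on) are assumed to match what pip or conda registered on your machine.

```python
# Hypothetical helper (not shipped with this repo): compare installed
# package versions against the pins listed in the install steps.

# Versions pinned in the steps above; None means "any version is accepted".
PINNED = {
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "timm": "0.3.2",
    "cupy-cuda101": None,
    "apex": None,
    "opencv-python": "4.4.0.46",
    "termcolor": "1.1.0",
    "yacs": "0.1.8",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Python 3.7 (the version used above) lacks importlib.metadata,
        # so fall back to pkg_resources, which ships with setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_environment(pinned=PINNED, lookup=installed_version):
    """Return a list of (name, expected, found) tuples for every mismatch."""
    problems = []
    for name, expected in pinned.items():
        found = lookup(name)
        if found is None:
            problems.append((name, expected, None))
        # Ignore local suffixes such as "+cu101" when comparing versions.
        elif expected is not None and found.split("+")[0] != expected:
            problems.append((name, expected, found))
    return problems


if __name__ == "__main__":
    for name, expected, found in check_environment():
        print(f"{name}: expected {expected or 'any version'}, "
              f"found {found or 'not installed'}")
```

An empty result means every listed package was found at the expected version. This check does not cover the CUDA or cudnn installation itself.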
To evaluate a pre-trained AS-MLP on ImageNet val, run:
bash train_scripts/test.sh
To train an AS-MLP on ImageNet from scratch, run:
bash train_scripts/train.sh
You can easily reproduce our results. Enjoy!
To measure the throughput, run:
bash train_scripts/get_throughput.sh
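For reference, throughput in images/s is the number of images processed divided by the elapsed wall-clock time. The sketch below illustrates that calculation in general terms. It is not the code behind `get_throughput.sh`, and the function name `measure_throughput` and its parameters are assumptions made for illustration.

```python
import time


def measure_throughput(step, batch_size, num_batches=30, warmup=5, sync=None):
    """Return images per second for `step`, a callable that processes one batch.

    `sync` is an optional callable that blocks until pending work has finished
    (needed for asynchronous devices such as GPUs).
    """
    # Warm-up iterations are excluded from the timing.
    for _ in range(warmup):
        step()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(num_batches):
        step()
    if sync is not None:
        sync()  # wait for queued work before stopping the clock
    elapsed = time.perf_counter() - start

    return batch_size * num_batches / elapsed
```

With PyTorch, `step` could be something like `lambda: model(images)` run under `torch.no_grad()`, with `sync=torch.cuda.synchronize` so that queued GPU kernels are included in the measured time.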
If this project is helpful to you, please cite our paper:
@InProceedings{Lian_2021_ASMLP,
title={AS-MLP: An Axial Shifted MLP Architecture for Vision},
author={Lian, Dongze and Yu, Zehao and Sun, Xing and Gao, Shenghua},
booktitle={International Conference on Learning Representations (ICLR)},
year={2022}
}
Object Detection and Instance Segmentation: See AS-MLP for Object Detection.
Semantic Segmentation: See AS-MLP for Semantic Segmentation.
The code is built upon Swin-Transformer; the CUDA kernel is modified from Involution.