MindSpore implementation of *Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-identification* (CVPR 2021). Please read our paper for a more detailed description of the training procedure. You can also read the PyTorch version for further reference.
(1) SYSU-MM01 Dataset [1]: The SYSU-MM01 dataset can be obtained from this website. Run `python pre_process_sysu.py` (located at `MVD/third_party/pre_process_sysu.py`) to prepare the dataset; the training data will be stored in `.npy` format.
(2) RegDB Dataset [2]: The RegDB dataset can be downloaded from this website by submitting a copyright form. (It is named "Dongguk Body-based Person Recognition Database (DBPerson-Recog-DB1)" on their website.)
```text
dataset
├── sysu                             # SYSU-MM01 dataset
│   ├── cam1
│   ├── cam2
│   ├── cam3
│   ├── cam4
│   ├── cam5
│   ├── cam6
│   ├── exp
│   │   ├── available_id.mat
│   │   ├── available_id.txt
│   │   ├── test_id.mat
│   │   ├── test_id.txt
│   │   ├── train_id.mat
│   │   ├── train_id.txt
│   │   ├── val_id.mat
│   │   └── val_id.txt
│   ├── train_ir_resized_img.npy     # the following .npy files are generated by pre_process_sysu.py
│   ├── train_ir_resized_label.npy
│   ├── train_rgb_resized_img.npy
│   └── train_rgb_resized_label.npy
└── regdb                            # RegDB dataset
    ├── idx
    ├── Thermal
    └── Visible
```
Note: For the SYSU-MM01 dataset, please first check whether it contains the above 4 `.npy` files. If not, please run `MVD/third_party/pre_process_sysu.py` to generate them.
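For a quick sanity check of the preprocessed data, something like the following can be used. This is a minimal sketch (not part of the repository); it only assumes the four file names from the layout above and the `[DATASET_PATH]` placeholder used throughout this README.

```python
# Sanity-check the four .npy files generated by pre_process_sysu.py.
import os
import numpy as np

data_path = "/.../dataset/sysu"  # the same value you will pass as [DATASET_PATH]
for name in ("train_rgb_resized_img", "train_rgb_resized_label",
             "train_ir_resized_img", "train_ir_resized_label"):
    path = os.path.join(data_path, name + ".npy")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path} is missing -- run third_party/pre_process_sysu.py first")
    arr = np.load(path)
    print(name, arr.shape, arr.dtype)  # image arrays vs. integer label arrays
```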
- Hardware
    - Supports Ascend and GPU environments.
    - For Ascend: Ascend 910.
    - For GPU: CUDA == 10.1.
- Framework
    - MindSpore == 1.5.0 (see Installation)
- Third-Party Packages
    - Python == 3.7.5
    - psutil* == 5.8.0
    - tqdm* == 4.62.0

*Note: the starred third-party packages are not strictly pinned to a specific version. For more details, please see `requirements.txt`.
Example: SYSU-MM01 dataset training and all-search inference

On GPU:

```bash
cd MVD/scripts/ # enter this directory before running any .sh script, otherwise you will get path errors :)
bash run_standalone_train_sysu_all_gpu.sh [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]
```

On Ascend:

```bash
cd MVD/scripts/ # enter this directory before running any .sh script, otherwise you will get path errors :)
bash run_standalone_train_sysu_all_ascend.sh [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]
```
Explanation: `[DATASET_PATH]` specifies your own path to the SYSU-MM01 or RegDB dataset. For example, if you organize the dataset as in the recommended structure above, the path should be `/.../dataset/sysu` or `/.../dataset/regdb` respectively. `[CHECKPOINT_PATH]` specifies the path to your ResNet-50 pretrain `.ckpt` file.
The `--pretrain` parameter lets you specify a ResNet-50 checkpoint for pretraining the backbone, while the `--resume` parameter lets you specify a previously saved whole-network checkpoint to resume training. We use the `/model_zoo/r1.1/resnet50_ascend_v111_imagenet2012_official_cv_bs32_acc76` pretrain file; see the file link. For convenience, you can rename it to `resnet50.ckpt` and save it directly under `MVD/`; then you can leave `--pretrain resnet50.ckpt` unchanged.
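For reference, loading a pretrain checkpoint in MindSpore typically looks like the following minimal sketch. `load_checkpoint` and `load_param_into_net` are standard MindSpore APIs; `build_resnet50()` is a hypothetical placeholder here (the actual backbone construction lives in `src/models/resnet.py`, and `train.py` wires this up for you via `--pretrain`).

```python
# Minimal sketch of checkpoint loading in MindSpore (illustrative only).
from mindspore import load_checkpoint, load_param_into_net

net = build_resnet50()                             # hypothetical helper; see src/models/resnet.py
param_dict = load_checkpoint("resnet50.ckpt")      # the file passed via --pretrain
not_loaded = load_param_into_net(net, param_dict)  # returns parameter names that did not match
print("parameters not loaded:", not_loaded)
```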
```text
MVD
├── scripts
│   ├── run_eval_regdb_i2v_ascend.sh               # inference: RegDB infrared-to-visible on Ascend
│   ├── run_eval_regdb_i2v_gpu.sh                  # inference: RegDB infrared-to-visible on GPU
│   ├── run_eval_regdb_v2i_ascend.sh               # inference: RegDB visible-to-infrared on Ascend
│   ├── run_eval_regdb_v2i_gpu.sh                  # inference: RegDB visible-to-infrared on GPU
│   ├── run_eval_sysu_all_ascend.sh                # inference: SYSU-MM01 all search on Ascend
│   ├── run_eval_sysu_all_gpu.sh                   # inference: SYSU-MM01 all search on GPU
│   ├── run_eval_sysu_indoor_ascend.sh             # inference: SYSU-MM01 indoor search on Ascend
│   ├── run_eval_sysu_indoor_gpu.sh                # inference: SYSU-MM01 indoor search on GPU
│   ├── run_standalone_train_regdb_i2v_ascend.sh   # training: RegDB infrared-to-visible on Ascend
│   ├── run_standalone_train_regdb_i2v_gpu.sh      # training: RegDB infrared-to-visible on GPU
│   ├── run_standalone_train_regdb_v2i_ascend.sh   # training: RegDB visible-to-infrared on Ascend
│   ├── run_standalone_train_regdb_v2i_gpu.sh      # training: RegDB visible-to-infrared on GPU
│   ├── run_standalone_train_sysu_all_ascend.sh    # training: SYSU-MM01 all search on Ascend
│   ├── run_standalone_train_sysu_all_gpu.sh       # training: SYSU-MM01 all search on GPU
│   ├── run_standalone_train_sysu_indoor_ascend.sh # training: SYSU-MM01 indoor search on Ascend
│   └── run_standalone_train_sysu_indoor_gpu.sh    # training: SYSU-MM01 indoor search on GPU
├── src
│   ├── dataset.py                                 # classes and functions for the MindSpore dataset
│   ├── evalfunc.py                                # evaluation functions
│   ├── loss.py                                    # loss functions
│   ├── models                                     # network architecture
│   │   ├── mvd.py                                 # main model
│   │   ├── resnet.py
│   │   ├── trainingcell.py                        # combines loss functions and optimizer with the network
│   │   └── vib.py                                 # variational information bottleneck
│   └── utils.py
├── third_party
│   └── pre_process_sysu.py                        # preprocesses SYSU-MM01 into .npy files
├── train.py
├── eval.py
├── requirements.txt
└── README.md
```
For reference, the standalone training script for SYSU-MM01 all search on Ascend (`run_standalone_train_sysu_all_ascend.sh`) does the following:

```bash
if [ $# != 3 ]
then
    echo "Usage: $0 [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]"
    exit 1
fi

get_real_path() {
    if [ "${1:0:1}" == "/" ]; then
        echo "$1"
    else
        echo "$(realpath -m "$PWD/$1")"
    fi
}

PATH1=$(get_real_path "$1")
PATH2=$(get_real_path "$2")

if [ ! -d "$PATH1" ]
then
    echo "error: DATASET_PATH=$PATH1 is not a directory"
    exit 1
fi

if [ ! -f "$PATH2" ]
then
    echo "error: CHECKPOINT_PATH=$PATH2 is not a file"
    exit 1
fi

ulimit -u unlimited
export DEVICE_NUM=1
export DEVICE_ID=$3
export RANK_SIZE=$DEVICE_NUM
export RANK_ID=0

if [ -d "train" ];
then
    rm -rf ./train
fi
mkdir ./train
cp ../*.py ./train
cp *.sh ./train
cp -r ../src ./train
cd ./train || exit
env > env.log

echo "start training for device $DEVICE_ID"
python train.py \
    --MSmode GRAPH_MODE \
    --dataset SYSU \
    --data_path "$PATH1" \
    --optim adam \
    --lr 0.0035 \
    --device_target Ascend \
    --device_id "$DEVICE_ID" \
    --pretrain "$PATH2" \
    --tag "sysu_all" \
    --loss-func id+tri \
    --sysu_mode all \
    --epoch 80 \
    --print-per-step 100 &> log &
cd ..
```
The following table describes the most commonly used arguments. You can change them freely as needed.
| Config Arguments | Explanation |
| --- | --- |
| `--MSmode` | MindSpore running mode, either `GRAPH_MODE` or `PYNATIVE_MODE`. |
| `--device_target` | choose "GPU", "Ascend" or "Cloud". |
| `--dataset` | which dataset to use, "SYSU" or "RegDB". |
| `--gpu` | which GPU to run on (default: 0); only effective when `--device_target GPU`. |
| `--device_id` | which Ascend AI core to run on (default: 0); only effective when `--device_target Ascend`. |
| `--data_path` | manually define the data path (for SYSU, the folder must contain the `.npy` files; see `pre_process_sysu.py`). |
| `--pretrain` | specify the ResNet-50 pretrain file path (default `""` for no ckpt file).* |
| `--resume` | specify the checkpoint file path for the whole model (default `""` for no ckpt file; `--resume` loads weights after `--pretrain` and thus overwrites the `--pretrain` weights).* |
| `--sysu_mode` | choose from `["all", "indoor"]`; only effective when `args.dataset=SYSU`. |
| `--regdb_mode` | choose from `["i2v", "v2i"]`; only effective when `args.dataset=RegDB`. |
| `--save_period` | specify how often (every XXX epochs) to save network weights into checkpoint files. |
*Note: MindSpore strictly requires checkpoint files to have the `.ckpt` file suffix; otherwise errors may be triggered during loading.
We recommend keeping the following hyper-parameters in the `.sh` files at their default values. If you want to run an ablation study or fine-tune, feel free to change them :)
| Config Arguments | Explanation |
| --- | --- |
| `--optim` | choose "adam" or "sgd" (default: adam). |
| `--lr` | initial learning rate (0.0035 for adam, 0.1 for sgd). |
| `--epoch` | the total number of training epochs, 80 by default (may differ from the original paper). |
| `--warmup_steps` | warm-up strategy, 5 by default. |
| `--start_decay` | the starting epoch of lr decay, 15 by default. |
| `--end_decay` | the ending epoch of lr decay, 27 by default. |
| `--loss_func` | for ablation study; "id+tri" by default, i.e. cross-entropy loss plus triplet loss. You can choose from `["id", "id+tri"]`. |
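To make the decay arguments concrete, here is an illustrative (not authoritative) reading of the schedule; the exact rule is implemented in `train.py`. The sketch assumes a linear warm-up over the first `warmup_steps` epochs and a 10x decay at `start_decay` and again at `end_decay`:

```python
# Illustrative learning-rate schedule built from the arguments above
# (assumed shape; see train.py for the authoritative implementation).
def lr_at_epoch(epoch, base_lr=0.0035, warmup_steps=5, start_decay=15, end_decay=27):
    if epoch < warmup_steps:                 # linear warm-up
        return base_lr * (epoch + 1) / warmup_steps
    if epoch < start_decay:                  # full learning rate
        return base_lr
    if epoch < end_decay:                    # first decay stage
        return base_lr * 0.1
    return base_lr * 0.01                    # after end_decay

print([round(lr_at_epoch(e), 6) for e in (0, 4, 10, 20, 40)])
# [0.0007, 0.0035, 0.0035, 0.00035, 3.5e-05]
```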
For a more detailed and comprehensive description of the arguments, please refer to `train.py`.
For GPU:

```bash
cd MVD/scripts/ # enter this directory before running any .sh script, otherwise you will get path errors :)
bash run_standalone_train_sysu_all_gpu.sh [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]
```

For Ascend:

```bash
cd MVD/scripts/ # enter this directory before running any .sh script, otherwise you will get path errors :)
bash run_standalone_train_sysu_all_ascend.sh [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]
```

You can replace `run_standalone_train_sysu_all_gpu.sh` or `run_standalone_train_sysu_all_ascend.sh` with the other training scripts.
Training logs and the corresponding checkpoint files will be stored in:

- SYSU-MM01 dataset + all search: `/scripts/train_sysu_all/SYSU_train_performance.txt`
- SYSU-MM01 dataset + indoor search: `/scripts/train_sysu_indoor/SYSU_train_performance.txt`
- RegDB dataset + visible to infrared (v2i): `/scripts/train_regdb_v2i/RegDB_train_performance.txt`
- RegDB dataset + infrared to visible (i2v): `/scripts/train_regdb_i2v/RegDB_train_performance.txt`

The `.txt` files record training performance and the `.ckpt` files are the saved checkpoints.
At the end of every training epoch, `train.py` evaluates the model on a randomly sampled testing set (disjoint from the training set), so you will see rank-1 and mAP performance during training. This evaluation follows the same pattern as the one in `eval.py`.
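For readers unfamiliar with re-ID metrics, this is roughly how rank-1 and mAP are computed from a query-gallery distance matrix. A simplified sketch only; the repository's real implementation, including the per-camera filtering of the SYSU-MM01 protocol, is in `src/evalfunc.py`:

```python
# Simplified rank-1 / mAP computation (omits the same-camera filtering
# that the real SYSU-MM01 protocol applies; see src/evalfunc.py).
import numpy as np

def rank1_and_map(distmat, q_ids, g_ids):
    """distmat: (num_query, num_gallery) pairwise distances."""
    order = np.argsort(distmat, axis=1)           # gallery ranked per query
    matches = g_ids[order] == q_ids[:, None]      # True where identities agree
    rank1 = matches[:, 0].mean()                  # fraction of correct top-1 hits
    aps = []
    for row in matches:
        hits = np.where(row)[0]                   # ranked positions of correct matches
        if hits.size == 0:
            continue
        precision_at_hits = np.arange(1, hits.size + 1) / (hits + 1)
        aps.append(precision_at_hits.mean())      # average precision for this query
    return rank1, float(np.mean(aps))

# toy usage
d = np.random.rand(4, 6)
print(rank1_and_map(d, np.arange(4), np.array([0, 0, 1, 2, 3, 3])))
```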
On GPU:

```bash
cd MVD/scripts/ # enter this directory before running any .sh script, otherwise you will get path errors :)
bash run_eval_sysu_all_gpu.sh [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]
```

On Ascend:

```bash
cd MVD/scripts/ # enter this directory before running any .sh script, otherwise you will get path errors :)
bash run_eval_sysu_all_ascend.sh [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]
```
Explanation: `[DATASET_PATH]` specifies your own path to the SYSU-MM01 or RegDB dataset, the same as in the Quick Start section. `[CHECKPOINT_PATH]` specifies a checkpoint file saved during training, not the ResNet-50 pretrain file. After running `bash run_eval_XXX.sh [DATASET_PATH] [CHECKPOINT_PATH] [DEVICE_ID]`, you will get the inference results directly; they are stored in:
- SYSU-MM01 dataset + all search: `/scripts/eval_sysu_all/SYSU_train_performance.txt`
- SYSU-MM01 dataset + indoor search: `/scripts/eval_sysu_indoor/SYSU_train_performance.txt`
- RegDB dataset + visible to infrared (v2i): `/scripts/eval_regdb_v2i/RegDB_train_performance.txt`
- RegDB dataset + infrared to visible (i2v): `/scripts/eval_regdb_i2v/RegDB_train_performance.txt`
| Parameters | Ascend 910 | GPU (RTX Titan) |
| --- | --- | --- |
| Model Version | MVD: baseline + modal-specific & modal-shared backbone + VIB | MVD: baseline + modal-specific & modal-shared backbone + VIB |
| Resource | Ascend 910; CPU 2.60GHz, 192 cores; Memory 755G; OS Euler2.8 | NVIDIA RTX Titan 24G |
| Uploaded Date | 12/19/2021 (month/day/year) | 12/19/2021 (month/day/year) |
| MindSpore Version | 1.3.0, 1.5.0 | 1.3.0, 1.5.0 |
| Dataset | SYSU-MM01, RegDB | SYSU-MM01, RegDB |
| Training Parameters (SYSU-MM01) | epochs=80, steps per epoch=695, batch_size=64 | epochs=80, steps per epoch=64, batch_size=64 |
| Training Parameters (RegDB) | epochs=80, steps per epoch=695, batch_size=64 | epochs=80, steps per epoch=64, batch_size=64 |
| Optimizer | Adam | Adam |
| Loss Function | Softmax Cross-Entropy + Triplet Loss | Softmax Cross-Entropy + Triplet Loss |
| Outputs | feature vector + probability | feature vector + probability |
| Loss | 1.7161 | 2.0663 |
| Speed | 830 ms/step (1 pc, PyNative Mode) | 940 ms/step (1 pc, PyNative Mode) |
| Total time | SYSU: about 13h; RegDB: about 3h30min | SYSU: about 14h; RegDB: about 4h |
| Parameters (M) | 161.9M | 161.9M |
| Checkpoint for Fine-tuning | 329.2M (.ckpt file) | 329.2M (.ckpt file) |
| Scripts | link | |
| Parameters | Ascend | GPU (RTX Titan) |
| --- | --- | --- |
| Model Version | MVD: baseline + modal-specific & modal-shared backbone + VIB | MVD: baseline + modal-specific & modal-shared backbone + VIB |
| Resource | Ascend 910; OS Euler2.8 | NVIDIA RTX Titan 24G |
| Uploaded Date | 12/19/2021 (month/day/year) | 12/19/2021 (month/day/year) |
| MindSpore Version | 1.5.0, 1.3.0 | 1.5.0, 1.3.0 |
| Dataset | SYSU-MM01, RegDB | SYSU-MM01, RegDB |
| batch_size | 64 | 64 |
| Outputs | feature | feature |
| Accuracy | see the following 4 tables ↓ | see the following 4 tables ↓ |
| Metric | Value (PyTorch) | Value (MindSpore, GPU) | Value (MindSpore, Ascend 910) |
| --- | --- | --- | --- |
| Rank-1 | 60.02% | 60.08% | 58.64% |
| mAP | 58.80% | 57.55% | 57.57% |
| Metric | Value (PyTorch) | Value (MindSpore, GPU) | Value (MindSpore, Ascend 910) |
| --- | --- | --- | --- |
| Rank-1 | 66.05% | 69.57% | |
| mAP | 72.98% | 73.13% | |
| Metric | Value (PyTorch) | Value (MindSpore, GPU, --trial 1) | Value (MindSpore, Ascend 910, --trial 1) |
| --- | --- | --- | --- |
| Rank-1 | 73.20% | 77.91% | 77.28% |
| mAP | 71.60% | 72.35% | 72.44% |
| Metric | Value (PyTorch) | Value (MindSpore, GPU, --trial 1) | Value (MindSpore, Ascend 910, --trial 1) |
| --- | --- | --- | --- |
| Rank-1 | 71.80% | 76.50% | 76.07% |
| mAP | 70.10% | 71.37% | 70.37% |
*Note: the PyTorch results above can be found in the original PyTorch repo.
- In `utils.py`, `IdentitySampler` is used to sample identities and images from both the visible and infrared (thermal) modalities, and we set a random seed in `IdentitySampler`. This randomness affects the inference part of both `train.py` and `eval.py`, so small rank-1 and mAP fluctuations (about 1%) between inference in `train.py` and `eval.py` may be seen even for the same training results (see the sketch after this list).
When testing on RegDB dataset, there is a
--trial
argument specifying which id to be selected; different--trial
choosing may cause slight rank-1 mAP fluctuation.
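To illustrate the point about seeded sampling, here is a toy sketch of the idea behind `IdentitySampler` (the real class lives in `src/utils.py` and differs in detail): each batch draws the same identities from both modalities, and fixing the seed fixes the sampled order, which is exactly the randomness discussed above.

```python
# Toy identity-balanced cross-modal sampling (illustrative; not the repo's IdentitySampler).
import numpy as np

def sample_batch(rgb_by_id, ir_by_id, num_ids=8, k=4, seed=0):
    rng = np.random.default_rng(seed)  # fixed seed => reproducible batches
    ids = rng.choice(list(rgb_by_id), size=num_ids, replace=False)
    rgb_idx, ir_idx = [], []
    for pid in ids:                    # k visible and k infrared images per identity
        rgb_idx += list(rng.choice(rgb_by_id[pid], size=k, replace=True))
        ir_idx += list(rng.choice(ir_by_id[pid], size=k, replace=True))
    return rgb_idx, ir_idx             # indices into the visible / infrared image arrays

# toy usage: identity id -> list of image indices per modality
rgb = {0: [0, 1, 2], 1: [3, 4], 2: [5]}
ir = {0: [0, 1], 1: [2], 2: [3, 4]}
print(sample_batch(rgb, ir, num_ids=2, k=2, seed=42))
```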
Please kindly cite the original paper in your publications if it helps your research:

```bibtex
@inproceedings{VariationalDistillation,
  title={Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-identification},
  author={Xudong Tian and Zhizhong Zhang and Shaohui Lin and Yanyun Qu and Yuan Xie and Lizhuang Ma},
  booktitle={Computer Vision and Pattern Recognition},
  year={2021}
}
```
Please kindly reference the URL of this MindSpore repository in your code if it helps your research.