This repository contains the TensorFlow 2.0 source code and the models trained on the ImageNet 2012 dataset for the following paper:
@InProceedings{Li_2018_CVPR,
author = {Li, Peihua and Xie, Jiangtao and Wang, Qilong and Gao, Zilin},
title = {Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization},
booktitle = { IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
This paper presents an iterative matrix square root normalization network (called fast MPN-COV). It is efficient and well suited to large-scale datasets, in contrast to its predecessor, MPN-COV (published at ICCV 2017), which performs matrix power normalization by eigendecomposition. If you use the code, please cite both this fast MPN-COV work and its predecessor, MPN-COV.
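To make the idea of iterative matrix square root normalization concrete, here is a minimal NumPy sketch. It assumes the coupled Newton-Schulz scheme with trace pre-normalization and post-compensation; the function name `isqrt_normalize` and the parameter `num_iter` are illustrative and are not taken from this repository's code.

```python
import numpy as np


def isqrt_normalize(cov, num_iter=5):
    """Approximate the square root of a symmetric positive definite matrix
    with a coupled Newton-Schulz iteration (no eigendecomposition)."""
    identity = np.eye(cov.shape[0])

    # Pre-normalization: divide by the trace so that the iteration converges.
    trace = np.trace(cov)
    a = cov / trace

    # Coupled iteration: y approaches a^(1/2), z approaches a^(-1/2).
    y, z = a, identity
    for _ in range(num_iter):
        t = 0.5 * (3.0 * identity - z @ y)
        y, z = y @ t, t @ z

    # Post-compensation: undo the scaling introduced by the trace division.
    return np.sqrt(trace) * y


# Small, well-conditioned example matrix (an assumption for illustration).
cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])
root = isqrt_normalize(cov, num_iter=15)
print(np.abs(root @ root - cov).max())  # residual of the approximation
```

The loop uses only matrix products, which is why this style of normalization maps well to GPUs. More iterations give a closer approximation, at the cost of extra multiplications.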

| Network | Dim | Top-1/Top-5 error (%): paper | Top-1/Top-5 error (%): reproduced (TensorFlow) | Top-1/Top-5 error (%): reproduced (PyTorch) | Pre-trained models (TensorFlow) |
|---|---|---|---|---|---|
| mpncov_resnet50 | 32K | 22.14/6.22 | 21.57/6.14 | 21.71/6.13 | GoogleDrive / BaiduDrive |
| mpncov_resnet101 | 32K | 21.21/5.68 | 20.50/5.45 | 20.99/5.56 | GoogleDrive / BaiduDrive |

| Backbone model | Dim | CUB: paper | CUB: reproduced (TensorFlow) | Aircraft: paper | Aircraft: reproduced (TensorFlow) | Cars: paper | Cars: reproduced (TensorFlow) |
|---|---|---|---|---|---|---|---|
| resnet50 | 32K | 88.1 | TODO | 90.0 | TODO | 92.8 | TODO |
| resnet101 | 32K | 88.7 | 88.1 | 91.4 | 91.8 | 93.3 | 93.9 |

- Our method uses neither bounding boxes nor part annotations.
- The reproduced results are obtained by simply fine-tuning our pre-trained fast MPN-COV-ResNet model with a small learning rate, without the SVM step described in our paper.
We implement our Fast MPN-COV (i.e., iSQRT-COV) meta-layer in TensorFlow 2.0. We release two versions of the code:
- backpropagation of our meta-layer implemented without the autograd package;
- backpropagation of our meta-layer implemented with the autograd package (TODO).
So that our Fast MPN-COV meta-layer can be added to a network conveniently, we divide any network into three parts:
- feature extractor;
- global image representation;
- classifier.
As such, we can arbitrarily combine a network with our Fast MPN-COV or with other global image representation methods (e.g., global average pooling, bilinear pooling (TODO), compact bilinear pooling (TODO), etc.).
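The three-part split can be sketched in a framework-agnostic way. The NumPy example below is only an illustration of the design: every function name is hypothetical, the feature extractor and classifier are stand-ins, and the covariance pooling shown omits the matrix square root normalization that Fast MPN-COV applies.

```python
import numpy as np


def extract_features(image):
    """Stand-in feature extractor: flatten (h, w, c) into (h*w, c)."""
    h, w, c = image.shape
    return image.reshape(h * w, c)


def global_average_pooling(features):
    """First-order representation: mean over spatial positions -> (c,)."""
    return features.mean(axis=0)


def covariance_pooling(features):
    """Second-order representation: upper triangle of the c x c covariance."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / features.shape[0]
    return cov[np.triu_indices(cov.shape[0])]


def make_linear_classifier(in_dim, num_classes, seed=0):
    """Stand-in classifier: a random linear layer returning class scores."""
    weights = np.random.default_rng(seed).normal(size=(in_dim, num_classes))
    return lambda representation: representation @ weights


def build_network(extractor, representation, classifier):
    """Chain the three parts into a single forward function."""
    return lambda image: classifier(representation(extractor(image)))


image = np.random.default_rng(1).normal(size=(7, 7, 8))
channels = image.shape[-1]

# Same extractor, two interchangeable global representations.
gap_net = build_network(
    extract_features, global_average_pooling,
    make_linear_classifier(channels, num_classes=10))
cov_net = build_network(
    extract_features, covariance_pooling,
    make_linear_classifier(channels * (channels + 1) // 2, num_classes=10))

print(gap_net(image).shape, cov_net(image).shape)
```

Only the classifier's input dimension changes when the representation is swapped, which is what makes the parts freely combinable.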
- Install TensorFlow (2.0.0b0).
- Clone the repository:
git clone https://github.com/jiangtaoxie/fast-MPN-COV
- Prepare the dataset as follows:
.
├── train
│   ├── class1
│   │   ├── class1_001.jpg
│   │   ├── class1_002.jpg
│   │   └── ...
│   ├── class2
│   ├── class3
│   ├── ...
│   ├── ...
│   └── classN
└── val
    ├── class1
    │   ├── class1_001.jpg
    │   ├── class1_002.jpg
    │   └── ...
    ├── class2
    ├── class3
    ├── ...
    ├── ...
    └── classN
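Before converting the data, it can help to check that a split follows the layout above. The sketch below uses only the Python standard library; `list_labeled_images` is a hypothetical helper, and assigning labels by sorted class-folder name is an assumption that may differ from what the repository's conversion script does.

```python
import os
import tempfile


def list_labeled_images(split_dir):
    """Return (image_path, label_index) pairs and the sorted class names
    for a directory laid out as <split_dir>/<class_name>/<image file>."""
    class_names = sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name)))
    samples = []
    for label, name in enumerate(class_names):
        class_dir = os.path.join(split_dir, name)
        for filename in sorted(os.listdir(class_dir)):
            if filename.lower().endswith((".jpg", ".jpeg", ".png")):
                samples.append((os.path.join(class_dir, filename), label))
    return samples, class_names


# Build a tiny throwaway dataset to exercise the function.
with tempfile.TemporaryDirectory() as root:
    for name in ("class1", "class2"):
        os.makedirs(os.path.join(root, "train", name))
        for i in (1, 2):
            path = os.path.join(root, "train", name, f"{name}_{i:03d}.jpg")
            open(path, "wb").close()
    samples, class_names = list_labeled_images(os.path.join(root, "train"))

print(class_names, len(samples))
```

Running the same function on the `train` and `val` folders and comparing the returned class names is a quick way to catch a missing or misnamed class directory.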
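- To train from scratch, copy the TFRecord conversion script to the current directory:
cp ./trainingFromScratch/imagenet/imagenet_tfrecords.py ./
- Modify the dataset path in the script, then run it to create the TFRecord files:
python imagenet_tfrecords.py
- Modify the parameters in train.sh, then run
sh train.sh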
- To fine-tune a pre-trained model, modify the parameters in finetune.sh, then run
sh finetune.sh