UPDATE: the PyTorch implementation has been released. I am not sure whether it works with multi-process DistributedDataParallel; I have only tested it with a single GPU and with multi-GPU DataParallel. The TensorFlow version still works, but I would not suggest reading it.
This repository contains the code for the following ICML 2019 paper:
Approximated Oracle Filter Pruning for Destructive CNN Width Optimization.
Citation:
@inproceedings{ding2019approximated,
title={Approximated Oracle Filter Pruning for Destructive CNN Width Optimization},
author={Ding, Xiaohan and Ding, Guiguang and Guo, Yuchen and Han, Jungong and Yan, Chenggang},
booktitle={International Conference on Machine Learning},
pages={1607--1616},
year={2019}
}
This demo will show you how to
- Reproduce the 65% pruning ratio of VGG on CIFAR-10.
- Reproduce the 50% pruning ratio of ResNet-56 on CIFAR-10.
About the environment:
- We used torch==1.3.0, torchvision==0.4.1, CUDA==10.2, NVIDIA driver version==440.82, tensorboard==1.11.0 on a machine with 2080Ti GPUs.
- Our method does not rely on any new or deprecated features of any library, so there is no need to replicate our environment exactly.
- If you get any errors regarding tensorboard or tensorflow, you may simply delete the code related to tensorboard or SummaryWriter (one way to make the import optional is sketched below).
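For reference, here is a minimal sketch (not code from this repo) of how the TensorBoard dependency could be made optional; the helper name maybe_log_scalar and the log directory are hypothetical:

```python
# Minimal sketch (not from this repo): make TensorBoard logging optional so the
# training scripts still run when tensorboard/tensorflow is not installed.
try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    SummaryWriter = None

def maybe_log_scalar(writer, tag, value, step):
    # Silently skip logging when no writer is available.
    if writer is not None:
        writer.add_scalar(tag, value, step)

# Hypothetical usage: only create a writer if the import succeeded.
writer = SummaryWriter("runs/aofp") if SummaryWriter is not None else None
```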
Designing and running Convolutional Neural Networks (CNNs) is not easy because: 1) given an architecture, finding the optimal number of filters (i.e., the width) of each layer is tricky; and 2) the computational intensity of CNNs impedes deployment on computationally limited devices. Oracle Pruning is designed to remove the unimportant filters from a well-trained CNN; it estimates the filters' importance by ablating them in turn and evaluating the model, which delivers high accuracy but suffers from intolerable time complexity, and it requires a given resulting width rather than finding one automatically. To address these problems, we propose Approximated Oracle Filter Pruning (AOFP), which keeps searching for the least important filters in a binary-search manner, makes pruning attempts by randomly masking out filters, accumulates the resulting errors, and finetunes the model via a multi-path framework. As AOFP enables simultaneous pruning on multiple layers, we can prune an existing very deep CNN with acceptable time cost, negligible accuracy drop, and no heuristic knowledge, or re-design a model that achieves higher accuracy and faster inference.
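To make the idea concrete, below is a minimal sketch of the per-filter "oracle" criterion that AOFP approximates: zero out one conv filter at a time and record how much the loss increases. This is only an illustration of the principle, not the repo's multi-path implementation (which masks out random subsets of filters and narrows the candidate set in a binary-search manner, as described above); the function and argument names are hypothetical.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def filter_ablation_scores(model, conv_name, images, labels):
    """Score each filter of one conv layer by zeroing it out and measuring the
    resulting increase in loss on a held-out batch. A larger increase means the
    filter is more important. Single-path illustration only."""
    model.eval()
    conv = dict(model.named_modules())[conv_name]
    base_loss = F.cross_entropy(model(images), labels).item()

    scores = torch.zeros(conv.out_channels)
    saved_weight = conv.weight.data.clone()
    saved_bias = conv.bias.data.clone() if conv.bias is not None else None

    for f in range(conv.out_channels):
        conv.weight.data[f].zero_()                      # mask out filter f
        if saved_bias is not None:
            conv.bias.data[f] = 0.0
        scores[f] = F.cross_entropy(model(images), labels).item() - base_loss
        conv.weight.data[f] = saved_weight[f]            # restore filter f
        if saved_bias is not None:
            conv.bias.data[f] = saved_bias[f]
    return scores
```

Filters with the smallest score increase are natural pruning candidates; AOFP avoids paying this per-filter cost by masking out random subsets of candidate filters, accumulating the observed errors, and halving the candidate set in a binary-search manner.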
To reproduce the 65% pruning ratio of VGG on CIFAR-10:
- Enter this directory.
- Make a soft link to your CIFAR-10 directory. If the dataset is not found there, it will be downloaded automatically.
ln -s YOUR_PATH_TO_CIFAR cifar10_data
- Set the environment variables.
export PYTHONPATH=.
export CUDA_VISIBLE_DEVICES=0
- Train the base model.
python train_base_model.py -a vc
- Run AOFP. The pruned weights will be saved to "aofp_models/vc_train/finish_pruned.hdf5" and automatically tested.
python aofp/do_aofp.py -a vc
- Show the name and shape of weights in the pruned model.
python display_hdf5.py aofp_models/vc_train/finish_pruned.hdf5
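As a rough illustration, the following sketch shows how such an HDF5 weight file can be inspected with h5py; it assumes the file simply stores parameters as named datasets and may differ from what display_hdf5.py actually does.

```python
import sys
import h5py

def show_hdf5(path):
    # Recursively print the name and shape of every dataset in the file.
    with h5py.File(path, "r") as f:
        def visit(name, obj):
            if isinstance(obj, h5py.Dataset):
                print(name, obj.shape)
        f.visititems(visit)

if __name__ == "__main__":
    show_hdf5(sys.argv[1])   # e.g. aofp_models/vc_train/finish_pruned.hdf5
```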
To reproduce the 50% pruning ratio of ResNet-56 on CIFAR-10:
- Enter this directory.
- Make a soft link to your CIFAR-10 directory. If the dataset is not found there, it will be downloaded automatically.
ln -s YOUR_PATH_TO_CIFAR cifar10_data
- Set the environment variables.
export PYTHONPATH=.
export CUDA_VISIBLE_DEVICES=0
- Train the base model.
python train_base_model.py -a src56
- Run AOFP. The pruned weights will be saved to "aofp_models/src56_train/finish_pruned.hdf5" and automatically tested.
python aofp/do_aofp.py -a src56
- Show the name and shape of weights in the pruned model.
python display_hdf5.py aofp_models/src56_train/finish_pruned.hdf5
[email protected] (The original Tsinghua mailbox [email protected] will expire in several months)
Google Scholar Profile: https://scholar.google.com/citations?user=CIjw0KoAAAAJ&hl=en
Homepage: https://dingxiaohan.xyz/
My open-sourced papers and repos:
The Structural Re-parameterization Universe:
- RepLKNet (CVPR 2022): a powerful and efficient architecture with very large kernels (31x31) and guidelines for using large kernels in CNNs.
  Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs. code.
- RepOptimizer: uses Gradient Re-parameterization to train powerful models efficiently. The training-time model is as simple as the inference-time model. It also addresses the problem of quantization.
  Re-parameterizing Your Optimizers rather than Architectures. code.
- RepVGG (CVPR 2021): a super simple and powerful VGG-style ConvNet architecture. Up to 84.16% ImageNet top-1 accuracy!
  RepVGG: Making VGG-style ConvNets Great Again. code.
- RepMLP (CVPR 2022): an MLP-style building block and architecture.
  RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality. code.
- ResRep (ICCV 2021): state-of-the-art channel pruning (ResNet-50, 55% FLOPs reduction, 76.15% accuracy).
  ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting. code.
- ACB (ICCV 2019): a CNN component without any inference-time cost. The first work of our Structural Re-parameterization Universe.
  ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks. code.
- DBB (CVPR 2021): a CNN component with higher performance than ACB and still no inference-time cost. Sometimes I call it ACNet v2 because "DBB" is 2 bits larger than "ACB" in ASCII (lol).
  Diverse Branch Block: Building a Convolution as an Inception-like Unit. code.
Model compression and acceleration:
- (CVPR 2019) Channel pruning: Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure. code
- (ICML 2019) Channel pruning: Approximated Oracle Filter Pruning for Destructive CNN Width Optimization. code
- (NeurIPS 2019) Unstructured pruning: Global Sparse Momentum SGD for Pruning Very Deep Neural Networks. code