PPDetection-YOLOv3 Model Compression Scheme

This scheme compresses YOLOv3 by combining two methods: distillation and pruning.
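To make the two ideas concrete, here is a minimal, framework-free sketch. This document does not state which pruning criterion or distillation loss the scheme uses, so the sketch assumes two common choices purely for illustration: ranking convolution filters by L1 norm, and a KL-divergence term on temperature-softened class scores. The function names `l1_prune_mask` and `distill_loss`, and the flat-list filter layout, are hypothetical and do not come from PPDetection.

```python
import math


def l1_prune_mask(filters, prune_ratio):
    """Return one keep/drop flag per filter, dropping the smallest L1 norms.

    filters: list of filters, each a flat list of weights (hypothetical layout).
    prune_ratio: fraction of filters to drop, e.g. 0.5.
    """
    norms = [sum(abs(w) for w in f) for f in filters]
    n_drop = int(len(filters) * prune_ratio)
    # Indices of the n_drop filters with the smallest L1 norm.
    order = sorted(range(len(filters)), key=lambda i: norms[i])
    drop = set(order[:n_drop])
    return [i not in drop for i in range(len(filters))]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class scores."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy usage: drop half of four filters, then compare student and teacher scores.
mask = l1_prune_mask([[1.0, -1.0], [0.1, 0.1], [3.0, 0.0], [0.2, -0.1]], 0.5)
loss = distill_loss([1.0, 0.5, 0.0], [2.0, 0.5, -1.0])
```

In a combined setup of this kind, the pruned network would typically act as the student and be fine-tuned against the unpruned teacher; whether this scheme does exactly that is not stated here.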

Example results

Backbone	Dataset	Pruning strategy	GFLOPs	Model size (MB)	Input size	Tesla P4	Kirin 970	Snapdragon 835	Snapdragon 855
MobileNetV1	VOC	baseline	20.20	93.37	608	16.556	748.404	734.970	289.878
MobileNetV1	VOC	baseline	9.46	93.37	416	9.031	371.214	349.065	140.877
MobileNetV1	VOC	baseline	5.60	93.37	320	6.235	221.705	200.498	80.515
MobileNetV1	VOC	r578	6.15(-69.57%)	30.81(-67.00%)	608	10.064(-39.21%)	314.531(-57.97%)	323.537(-55.98%)	123.414(-57.43%)
MobileNetV1	VOC	r578	2.88(-69.57%)	30.81(-67.00%)	416	5.478(-39.34%)	151.562(-59.17%)	146.014(-58.17%)	56.420(-59.95%)
MobileNetV1	VOC	r578	1.70(-69.57%)	30.81(-67.00%)	320	3.880(-37.77%)	91.132(-58.90%)	87.440(-56.39%)	31.470(-60.91%)

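With the r578 pruning strategy, the FLOPs of the YOLOv3-MobileNetV1 model drop by 69.57%. At an input image size of 608, inference time on a single Tesla P4 (TensorRT) drops by 39.21%, and inference latency on the Kirin 970, Snapdragon 835, and Snapdragon 855 drops by 57.97%, 55.98%, and 57.43%, respectively.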
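The percentages in parentheses appear to be the relative change of each pruned value against its baseline. As a quick check, the sketch below recomputes them for the 608-input rows using the values copied from the table above. The helper name `relative_change` is illustrative only.

```python
def relative_change(baseline, pruned):
    """Percentage change of pruned relative to baseline (negative = reduction)."""
    return (pruned - baseline) / baseline * 100.0


# Values copied from the 608-input rows of the table (baseline vs. r578).
baseline_608 = {
    "Model size (MB)": 93.37,
    "Tesla P4": 16.556,
    "Kirin 970": 748.404,
    "Snapdragon 835": 734.970,
    "Snapdragon 855": 289.878,
}
pruned_608 = {
    "Model size (MB)": 30.81,
    "Tesla P4": 10.064,
    "Kirin 970": 314.531,
    "Snapdragon 835": 323.537,
    "Snapdragon 855": 123.414,
}

changes = {
    name: relative_change(baseline_608[name], pruned_608[name])
    for name in baseline_608
}

for name, change in changes.items():
    print(f"{name}: {change:.2f}%")
```

Rounded to two decimals, these reproduce the model size, Tesla P4, and mobile-chip percentages listed in the 608-input r578 row.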