Merge pull request #58 from JDAI-CV/update_readme
Update README
daquexian committed Oct 15, 2019
2 parents f06f1bc + 965ae7a commit cf66a2c
Showing 5 changed files with 77 additions and 34 deletions.
32 changes: 3 additions & 29 deletions README.md
@@ -10,8 +10,6 @@

[[English](README.md)] [[Chinese/中文](README_CN.md)]

Join chat at [Gitter (English)](https://gitter.im/dabnn/dabnn) or QQ Group (Chinese, 1021964010, answer: nndab)

Our ACM MM paper: https://arxiv.org/abs/1908.05858

## Introduction
@@ -20,43 +18,19 @@ Binary neural networks (BNNs) have great potential on edge devices since they re

To the best of our knowledge, dabnn is the first highly optimized binary neural network inference framework for mobile platforms. We implemented binary convolutions in ARM assembly. On Google Pixel 1, dabnn is **800%~2400% faster** than [BMXNet](https://github.com/hpi-xnor/BMXNet) (to the best of our knowledge, the only other open-source BNN inference framework) on a single binary convolution, and about **700% faster** on binarized ResNet-18.

## Benchmark and Comparison

Benchmark result on Google Pixel 1 (single thread):

```
2019-05-06 10:36:48
Running data/local/tmp/dabnn_benchmark
Run on (4 X 1593.6 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
--------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------
dabnn_5x5_256 3661928 ns 3638192 ns 191 <--- input: 14*14*256, kernel: 256*5*5*256, output: 14*14*256, padding: 2
dabnn_3x3_64 1306391 ns 1281553 ns 546 <--- input: 56*56*64, kernel: 64*3*3*64, output: 56*56*64, padding: 1
dabnn_3x3_128 958388 ns 954754 ns 735 <--- input: 28*28*128, kernel: 128*3*3*128, output: 28*28*128, padding: 1
dabnn_3x3_256 975123 ns 969810 ns 691 <--- input: 14*14*256, kernel: 256*3*3*256, output: 14*14*256, padding: 1
dabnn_3x3_256_s2 268310 ns 267712 ns 2618 <--- input: 14*14*256, kernel: 256*3*3*256, output: 7*7*256, padding: 1, stride: 2
dabnn_3x3_512 1281832 ns 1253921 ns 588 <--- input: 7* 7*512, kernel: 512*3*3*512, output: 7* 7*512, padding: 1
dabnn_bireal18_imagenet 61920154 ns 61339185 ns 10 <--- Bi-Real Net 18, 56.4% top-1 on ImageNet
dabnn_bireal18_imagenet_stem 43294019 ns 41401923 ns 14 <--- Bi-Real Net 18 with stem module (The network structure is described in detail in [our paper](https://arxiv.org/abs/1908.05858)), 56.4% top-1 on ImageNet
```

The following is a comparison between dabnn and [Caffe](http://caffe.berkeleyvision.org) (full precision), [TensorFlow Lite](https://www.tensorflow.org/lite) (full precision) and [BMXNet](https://github.com/hpi-xnor/BMXNet) (binary). We surprisingly observe that BMXNet is even slower than the full-precision TensorFlow Lite, which suggests that the potential of binary neural networks was far from being exploited before dabnn was published.

![Comparison](images/comparison_en.png)
![Comparison](/images/comparison_en.png)

## Build

We provide pre-built onnx2bnn binaries and a dabnn Android package. However, you need to build dabnn yourself if you want to deploy BNNs on non-Android ARM devices.

We use the CMake build system like most C++ projects. Check out [docs/build.md](docs/build.md) for the detail instructions.
We use the CMake build system like most C++ projects. Check out [docs/build.md](docs/build.md) for the detailed instructions.

## Convert ONNX Model

We provide a conversion tool, named onnx2bnn, to convert an ONNX model to a dabnn model. Pre-built onnx2bnn binaries for all platforms are available in [GitHub Releases](https://github.com/JDAI-CV/dabnn/releases). For Linux users, the pre-built onnx2bnn binary is in [AppImage](https://appimage.org) format; see https://appimage.org for details.

Note: Binary convolution is a custom operator, so whether the ONNX model is dabnn-compatible heavily depends on the implementation of the binary convolution in the training code. Please check out [our wiki](https://github.com/JDAI-CV/dabnn/wiki/Train,-export-and-convert-a-dabnn-model) for further information.
Note: Binary convolution is a custom operator, so whether the ONNX model is dabnn-compatible heavily depends on the implementation of the binary convolution in the training code. **Please read the [documentation about model conversion](/docs/model_conversion.md) carefully.**

After conversion, the generated dabnn model can be deployed on ARM devices (e.g., mobile phones and embedded devices). For Android developers, we provide an Android AAR package published on [jcenter](https://bintray.com/daquexian566/maven/dabnn/_latestVersion); for usage, please check out the [example project](https://github.com/JDAI-CV/dabnn-example).

2 changes: 1 addition & 1 deletion README_CN.md
@@ -58,7 +58,7 @@ dabnn_bireal18_imagenet_stem 43294019 ns 41401923 ns 14 <--- with

We provide a model conversion tool, onnx2bnn, which converts ONNX models into dabnn-format models. Pre-built onnx2bnn binaries for each platform are available in [GitHub Releases](https://github.com/JDAI-CV/dabnn/releases) and can be downloaded and run directly. For Linux users, the binary we provide is in AppImage format; for how to use AppImage and other related information, please refer to https://appimage.org/.

Note: Because binary convolution is a custom operator, whether an ONNX model is compatible with dabnn depends heavily on how the binary convolution is implemented in the training code; the [wiki](https://github.com/JDAI-CV/dabnn/wiki/Train,-export-and-convert-a-dabnn-model) describes this in further detail.
Note: Because binary convolution is a custom operator, whether an ONNX model is compatible with dabnn depends heavily on how the binary convolution is implemented in the training code; please read the [relevant documentation](/docs/model_conversion.md) carefully.

The dabnn model obtained after conversion can then be used on ARM devices (such as mobile phones and embedded devices). For Android developers, we have uploaded an Android AAR package to [jcenter](https://bintray.com/daquexian566/maven/dabnn/_latestVersion); for usage, please see the [example project](https://github.com/JDAI-CV/dabnn-example).

27 changes: 27 additions & 0 deletions docs/benchmark_and_comparison.md
@@ -0,0 +1,27 @@
## Benchmark and Comparison

Benchmark result on Google Pixel 1 (single thread):

```
2019-05-06 10:36:48
Running data/local/tmp/dabnn_benchmark
Run on (4 X 1593.6 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
--------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------
dabnn_5x5_256 3661928 ns 3638192 ns 191 <--- input: 14*14*256, kernel: 256*5*5*256, output: 14*14*256, padding: 2
dabnn_3x3_64 1306391 ns 1281553 ns 546 <--- input: 56*56*64, kernel: 64*3*3*64, output: 56*56*64, padding: 1
dabnn_3x3_128 958388 ns 954754 ns 735 <--- input: 28*28*128, kernel: 128*3*3*128, output: 28*28*128, padding: 1
dabnn_3x3_256 975123 ns 969810 ns 691 <--- input: 14*14*256, kernel: 256*3*3*256, output: 14*14*256, padding: 1
dabnn_3x3_256_s2 268310 ns 267712 ns 2618 <--- input: 14*14*256, kernel: 256*3*3*256, output: 7*7*256, padding: 1, stride: 2
dabnn_3x3_512 1281832 ns 1253921 ns 588 <--- input: 7* 7*512, kernel: 512*3*3*512, output: 7* 7*512, padding: 1
dabnn_bireal18_imagenet 61920154 ns 61339185 ns 10 <--- Bi-Real Net 18, 56.4% top-1 on ImageNet
dabnn_bireal18_imagenet_stem 43294019 ns 41401923 ns 14 <--- Bi-Real Net 18 with stem module (The network structure is described in detail in [our paper](https://arxiv.org/abs/1908.05858)), 56.4% top-1 on ImageNet
```

The following is a comparison between dabnn and [Caffe](http://caffe.berkeleyvision.org) (full precision), [TensorFlow Lite](https://www.tensorflow.org/lite) (full precision) and [BMXNet](https://github.com/hpi-xnor/BMXNet) (binary). We surprisingly observe that BMXNet is even slower than the full-precision TensorFlow Lite, which suggests that the potential of binary neural networks was far from being exploited before dabnn was published.

![Comparison](/images/comparison_en.png)


41 changes: 41 additions & 0 deletions docs/model_conversion.md
@@ -0,0 +1,41 @@
## If you want to benchmark an existing full-precision network structure

If you just want to benchmark the latency of a BNN instead of deploying it, you can specify which convolutions in the input ONNX model should be treated as binary by passing the "--binary-list filename" command-line argument. Each line of the file is the **output name** of one convolution.

For example, suppose you have a full-precision model named "model.onnx", you want three convolutions in it, whose outputs are "34", "36" and "55" respectively, to be treated as binary convolutions, and you want to test how fast the resulting model will be. In this case, you should first create a text file with

> 34
>
> 36
>
> 55

After creating the text file (let's assume it is named "my_binary_convs"), you can convert the model with

```bash
./onnx2bnn model.onnx model.dab --binary-list my_binary_convs
```

Once the command finishes, you will get a BNN model named model.dab.
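
If you are not sure which output names to put into the list, a short script along the following lines can print the output name of every convolution in the model (a sketch that assumes the `onnx` Python package is installed):

```python
import onnx

# Load the full-precision model and list the output name of every Conv node.
model = onnx.load("model.onnx")
for node in model.graph.node:
    if node.op_type == "Conv":
        # The first output name is what goes into the --binary-list file.
        print(node.output[0])
```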

## If you want to train and export a dabnn-compatible ONNX model

If you want to train and deploy a BNN on a real device, the following instructions are what you need.

Binary convolutions are not natively supported by training frameworks (e.g., TensorFlow, PyTorch, MXNet). To implement correct and dabnn-compatible binary convolutions yourself, there are a few things that need attention:

1. The input of a binary convolution should contain only +1/-1, but the default padding value of a convolution is 0.

2. PyTorch does not support exporting the ONNX Sign operator until PyTorch 1.2.

Therefore, we provide a ["standard" PyTorch implementation](https://gist.github.com/daquexian/7db1e7f1e0a92ab13ac1ad028233a9eb) which is compatible with dabnn and produces correct results. The implementations for TensorFlow, MXNet and other training frameworks should be similar.
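
For reference, here is a minimal sketch of such a binary convolution. It is not the linked gist itself; `SignSTE` and `BinaryConv2d` are illustrative names, and details such as the straight-through estimator may differ from the official implementation. It assumes PyTorch >= 1.2 so that the ONNX Sign operator can be exported.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SignSTE(torch.autograd.Function):
    """Binarize to +1/-1 in the forward pass; straight-through estimator in backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Note: torch.sign maps 0 to 0; real training code usually avoids exactly-zero inputs.
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        # Pass gradients only where |x| <= 1 (a common STE clipping choice).
        return grad_output * (x.abs() <= 1).float()

    @staticmethod
    def symbolic(g, x):
        # Emit an ONNX Sign node when exporting.
        return g.op("Sign", x)


class BinaryConv2d(nn.Module):
    """Binary convolution: binarize input, pad with -1 (not 0), binarize weights."""

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=1):
        super().__init__()
        self.stride = stride
        self.padding = padding
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, kernel_size, kernel_size))

    def forward(self, x):
        x = SignSTE.apply(x)                          # inputs become +1/-1
        x = F.pad(x, [self.padding] * 4, value=-1)    # pad with -1 rather than 0
        w = SignSTE.apply(self.weight)                # weights become +1/-1
        return F.conv2d(x, w, stride=self.stride, padding=0)


if __name__ == "__main__":
    conv = BinaryConv2d(64, 64, kernel_size=3).eval()
    dummy = torch.randn(1, 64, 56, 56)
    print(conv(dummy).shape)
    # Export to ONNX so that onnx2bnn can pick up the Sign/Pad/Conv pattern.
    torch.onnx.export(conv, dummy, "binary_conv.onnx", opset_version=9)
```

The exported graph contains Sign, Pad (with value -1) and Conv nodes, which should correspond to the patterns onnx2bnn looks for (see the modes below).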

#### How dabnn recognizes binary convolutions in an ONNX model

The converter `onnx2bnn` has three modes in terms of how it recognizes binary convolutions:

* Aggressive (default). In this mode, onnx2bnn will mark all convolutions whose weights consist of only +1 or -1 as binary convolutions. The aggressive mode is for existing BNN models which do not have the correct padding value (-1 rather than 0). Note: the output of the generated dabnn model is different from that of the original ONNX model, since the padding value is 0 rather than -1.
* Moderate. This mode is for our "standard" implementation -- a Conv operator with binary weights that follows a Pad operator padding with -1.
* Strict. In this mode, onnx2bnn only recognizes the following natural and correct "pattern" of binary convolutions: a Conv operator whose input comes from a Sign op and a Pad op (the order doesn't matter), and whose weight comes from a Sign op.

For now "Aggressive" is the default mode. To enable moderate or strict mode, pass "--moderate" or "--strict" command-line argument to onnx2bnn.
9 changes: 5 additions & 4 deletions tools/onnx2bnn/onnx2bnn.cpp
@@ -17,10 +17,11 @@ using std::vector;

void usage(const std::string &filename) {
    std::cout << "Usage:" << std::endl;
    std::cout << " " << filename
              << " onnx_model output_filename [ --strict | --moderate | "
                 "--aggressive ] [--binary-list] [--verbose]"
              << std::endl;
    std::cout
        << " " << filename
        << " onnx_model output_filename [ --strict | --moderate | "
           "--aggressive ] [--binary-list binary_list_filename] [--verbose]"
        << std::endl;
    std::cout << std::endl;
    std::cout << "Options:" << std::endl;
    std::cout
