# Paddle Benchmark

Inference benchmarks of deep learning models implemented with PaddlePaddle.

## Environment

- MI 5, Android 7.0, Snapdragon 820 1.8 GHz
- android-ndk-r13b
  - gcc version 4.9.x 20150123 (prerelease) (GCC)
  - Android clang version 3.8.256229 (based on LLVM 3.8.256229)

## Mobilenet

Benchmark for Mobilenet inference (input image 3x224x224).

Currently, on the MI 5, single-threaded inference takes 122.607 ms and uses 48 MB of system memory.

| version | time (ms) | mem (MB) | size (KB)    | optimization (speedup vs. previous) |
| ------- | --------- | -------- | ------------ | ----------------------------------- |
| d2258a4 | 321.682   | -        | -            | base                                |
| d2258a4 | 225.044   | -        | -            | merge bn (30%)                      |
| b45d020 | 148.201   | -        | -            | depthwise convolution (34.1%)       |
| 0146e8b | 127.032   | -        | -            | clang compile (14.3%)               |
| d59295f | 122.607   | 48       | 4306 -> 1431 | neon::relu (3.5%)                   |
- The convolution layers in the base version are implemented with im2col + GEMM (see the first sketch after this list).
- The merge bn optimization folds the batch normalization layer's parameters into the parameters of the preceding convolution layer (see the second sketch after this list).
- The depthwise convolution optimization implements depthwise convolution with ARM NEON intrinsics (see the third sketch after this list).
- Binaries compiled with clang run faster than those compiled with gcc.
- mem (MB) is measured by running the Paddle inference program and using the `free` command to observe the change in system memory usage.
- In the size (KB) column, the first value is the size of the Paddle inference .so, and the second is its size after zip compression.
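
The im2col + GEMM approach mentioned above is the standard trick of unrolling every k x k input patch into a column, so the convolution becomes one matrix multiplication. Below is a minimal illustrative sketch, not the actual Paddle kernel; the function name and data layout are assumptions.

```cpp
#include <vector>

// Sketch: unroll k x k patches of a CHW input into columns of `col`,
// so conv output = W[out_c, channels*k*k] * col (a plain GEMM).
// col has shape [channels * k * k, out_h * out_w], row-major.
void im2col(const float* in, int channels, int h, int w,
            int k, int stride, float* col) {
  const int out_h = (h - k) / stride + 1;
  const int out_w = (w - k) / stride + 1;
  int row = 0;
  for (int c = 0; c < channels; ++c)
    for (int ky = 0; ky < k; ++ky)
      for (int kx = 0; kx < k; ++kx, ++row) {
        int idx = 0;
        for (int oy = 0; oy < out_h; ++oy)
          for (int ox = 0; ox < out_w; ++ox, ++idx)
            col[row * out_h * out_w + idx] =
                in[(c * h + oy * stride + ky) * w + ox * stride + kx];
      }
}
```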
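Folding batch normalization into the convolution only rescales the existing weights and biases, so the BN layer costs nothing at inference time. A minimal sketch, assuming per-channel gamma/beta/mean/var and a row-major weight layout; the names are illustrative, not Paddle's API.

```cpp
#include <cmath>
#include <vector>

// Sketch: for each output channel c,
//   scale   = gamma[c] / sqrt(var[c] + eps)
//   W'[c,:] = W[c,:] * scale
//   b'[c]   = (b[c] - mean[c]) * scale + beta[c]
void fold_batch_norm(std::vector<float>& conv_w,   // [out_c * in_c * k * k]
                     std::vector<float>& conv_b,   // [out_c]
                     const std::vector<float>& gamma,
                     const std::vector<float>& beta,
                     const std::vector<float>& mean,
                     const std::vector<float>& var,
                     float eps = 1e-5f) {
  const size_t out_c = conv_b.size();
  const size_t per_channel = conv_w.size() / out_c;
  for (size_t c = 0; c < out_c; ++c) {
    const float scale = gamma[c] / std::sqrt(var[c] + eps);
    for (size_t i = 0; i < per_channel; ++i)
      conv_w[c * per_channel + i] *= scale;
    conv_b[c] = (conv_b[c] - mean[c]) * scale + beta[c];
  }
}
```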
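As a rough illustration of a NEON-based depthwise convolution (a hand-written sketch, not the kernel used in Paddle), a 3x3, stride-1 filter over one channel can compute four output columns per iteration with `vld1q_f32` / `vmlaq_n_f32`:

```cpp
#include <arm_neon.h>

// Sketch: 3x3 depthwise convolution, stride 1, no padding, one channel.
// in  : h x w input plane      kern: 9 filter weights
// out : (h-2) x (w-2) output plane
void depthwise_conv3x3(const float* in, const float* kern,
                       float* out, int h, int w) {
  const int out_h = h - 2;
  const int out_w = w - 2;
  for (int oy = 0; oy < out_h; ++oy) {
    int ox = 0;
    // Vectorized part: 4 output columns per iteration.
    for (; ox + 4 <= out_w; ox += 4) {
      float32x4_t acc = vdupq_n_f32(0.f);
      for (int ky = 0; ky < 3; ++ky) {
        const float* row = in + (oy + ky) * w + ox;
        for (int kx = 0; kx < 3; ++kx)
          // Multiply 4 neighbouring input pixels by one scalar weight.
          acc = vmlaq_n_f32(acc, vld1q_f32(row + kx), kern[ky * 3 + kx]);
      }
      vst1q_f32(out + oy * out_w + ox, acc);
    }
    // Scalar tail for the remaining columns.
    for (; ox < out_w; ++ox) {
      float s = 0.f;
      for (int ky = 0; ky < 3; ++ky)
        for (int kx = 0; kx < 3; ++kx)
          s += in[(oy + ky) * w + ox + kx] * kern[ky * 3 + kx];
      out[oy * out_w + ox] = s;
    }
  }
}
```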