MKLDNN conv2d kernel added #8451

pzelazko-intel · 2018-02-15T16:59:42Z

MKLDNN conv2d and pool2d OP kernels can be enabled with use_mkldnn OP flag - just like currenly present use_cudnn flag. It's set to True by default. use_cudnn flag has figher priority.

Beside unit tests, we validated these kernels by running training and interference on MNIST dataset and comparing results with caffe library.

CLAassistant · 2018-02-15T16:59:49Z

All committers have signed the CLA.

luotao1 · 2018-02-26T07:12:39Z

Can you divide this PR into three small PRs?

typo fix - TransFromNeeded -> TransformNeeded and
MKLDNNDeviceContext changes: e0531da and 7a358fa
MKLDNN conv2d OP kernels and unit test added
MKLDNN pool2d OP kernels and unit test added

jacquesqiao · 2018-02-26T07:17:25Z

paddle/fluid/framework/op_registry.h

@@ -236,11 +236,11 @@ class OpKernelRegistrar : public Registrar {

 #define USE_CUDA_ONLY_OP(op_type) \
  USE_OP_ITSELF(op_type);         \
-  USE_OP_DEVICE_KERNEL(op_type, CUDA)
+  USE_OP_DEVICE_KERNEL(op_type, CUDA);


should not add ; at the end, we want user use a macro like

SOME_MACRO();

luotao1 · 2018-02-26T12:49:41Z

paddle/fluid/operators/CMakeLists.txt

-file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(conv2d, CUDNN);\n")
-file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(pool2d, CUDNN);\n")
-file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(conv2d_transpose, CUDNN);\n")
+    op_library(edit_distance_op SRCS edit_distance_op.cc edit_distance_op.cu DEPS math_function)


We will pybind USE_OP_DEVICE_KERNEL(XXX, CUDNN) automatically in #8590, in order to make operators/CMakelists.txt much cleaner.

Then, only one sentence op_library(pool_op DEPS pooling) will pybind CPU/CUDA/CUDNN/MKLDNN all device kernel.

Thus:

how about use conv_mkldnn_op.cc likes conv_cudnn_op.cu.cc instead of conv_op.mn.cc?

refine these codes after pybind USE_OP_DEVICE_KERNEL(XXX, CUDNN) automatically #8590 finished?

@luotao1 Do you want my changes to be merged after #8590 is finished?

If #8590 doesn't be finished before your small PRs, you can merge your changes at first.

#8590 is finished and merged now.

tensor-tang

First of all, as synced with @pzelazko-intel, we will break this PR into some smaller ones.

As for current code, we also had a discussion.

The most important information is that the current implementation may not be the best efficient one, since the format is fixed as nchw and the transform functions is still under developing.

If anything is missing, @pzelazko-intel please point out.

tensor-tang · 2018-03-01T06:11:13Z

paddle/fluid/operators/conv_op.cc

-  } else {
-    library_ = framework::LibraryType::kPlain;
+  } else if (CanMKLDNNBeUsed(ctx)) {
+    library_ = framework::LibraryType::kMKLDNN;
  }

  std::string data_format = ctx.Attr<std::string>("data_format");


Here we only make MKLDNN library enabled.

As synced with @pzelazko-intel, we would enable MKLDNN layout next time.
Then we would considerate transform function as well.

Could you add "TODO" in codes for reminding? Including:

enable MKLDNN layout

enable groups

something more.

Besides, could you not cover the previous commit next time, since we could not find the difference after your updating?

OK, I'm going to add TODOs.
I've covered previous commits, because I wanted commit history to be clear.
Would I have opportunity to squash commits when merging?

Yes, we can squash commits when merging your PR.

tensor-tang · 2018-03-01T06:17:51Z