
Softmax MKLDNN FLUID operator #9214

Merged

Conversation

jczaja (Contributor) commented Mar 19, 2018

This PR provides MKLDNN based Softmax op implementation.

Performance and testing:

On the tested models, the MKLDNN softmax op is roughly 10x faster than the plain CPU version.
RNN Search (https://github.com/dzhwinter/benchmark/blob/master/fluid/machine_translation.py)
trains in ~90% of the plain CPU time, and training still converges with the MKLDNN softmax op enabled.

Notes

  • The cross_entropy grad op needed to be updated with code preventing -INF results, in the same way as was done earlier for the cross_entropy op.
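To illustrate the issue being fixed (a hedged NumPy sketch, not the actual Fluid code): cross-entropy's backward pass divides by the predicted probability, so a probability that underflows to exactly 0 produces an infinite gradient; clamping the denominator with a small epsilon (a hypothetical `eps` parameter here) prevents that.

```python
import numpy as np

def cross_entropy_grad(prob, label, eps=1e-12):
    """Gradient of -log(prob[label]) w.r.t. prob.

    The predicted probability is clamped to at least `eps` so the
    division cannot produce inf/-inf when prob underflows to 0.
    """
    grad = np.zeros_like(prob)
    grad[label] = -1.0 / max(prob[label], eps)
    return grad

probs = np.array([0.0, 1.0])            # degenerate softmax output
g = cross_entropy_grad(probs, label=0)
assert np.isfinite(g).all()             # clamped: no -inf in the gradient
```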

@mrysztow mrysztow added the Intel label Mar 19, 2018
@jczaja jczaja force-pushed the prv-softmax-mkldnn-operator-PR branch from 083b8bd to 29b68ef Compare March 19, 2018 15:55
@kbinias kbinias requested a review from luotao1 March 20, 2018 08:45
luotao1 (Contributor) commented Mar 20, 2018

LGTM. @tensor-tang Could you help review the softmax_mkldnn_op.cc?

using mkldnn::stream;

template <typename T>
class SoftmaxMkldnnKernel : public paddle::framework::OpKernel<T> {
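For reference, the computation this kernel delegates to the MKLDNN primitive can be sketched in NumPy (a sketch of the math only, not the MKLDNN implementation), including the usual max-subtraction for numerical stability:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the per-row max first so exp() cannot overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

y = softmax(np.array([[1.0, 2.0, 3.0]]))
assert np.allclose(y.sum(axis=-1), 1.0)   # rows sum to 1
```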
Reviewer (Contributor):

SoftmaxMkldnnKernel ==> SoftmaxMKLDNNKernel

jczaja (Author) replied:

Ok

@@ -81,6 +81,7 @@ def fc(input,
num_flatten_dims=1,
param_attr=None,
bias_attr=None,
use_mkldnn=False,
Reviewer (Contributor):

Why would the softmax PR change fc?

attrs={
"x_num_col_dims": num_flatten_dims,
"y_num_col_dims": 1,
'use_mkldnn': use_mkldnn
tensor-tang (Contributor):

Same as above, this should be added in the FC PR, right?

jczaja (Author) replied:

@tensor-tang You are correct, but without those changes to FC, the Softmax MKLDNN op would remain unused code. Until the MKLDNN FC is ready (it is under development), there is no other way for users to take advantage of the softmax MKLDNN operator, as it is usually used as the activation of an FC layer. Do you suggest removing the use_mkldnn attribute/param from FC?
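The plumbing under discussion can be illustrated with a toy dispatch (hypothetical names, not Fluid's actual kernel registry): an op-level `use_mkldnn` attribute selects which registered kernel implementation runs.

```python
import math

def softmax_plain(xs):
    # Reference CPU implementation with max-subtraction for stability.
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

# Stand-in for an MKLDNN-backed kernel: same math, different backend.
softmax_mkldnn = softmax_plain

KERNELS = {False: softmax_plain, True: softmax_mkldnn}

def softmax_op(xs, use_mkldnn=False):
    # The op attribute picks the kernel, mirroring how a `use_mkldnn`
    # attr routes execution to the MKLDNN kernel when one is registered.
    return KERNELS[use_mkldnn](xs)
```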

jczaja (Author) commented Mar 21, 2018:

@tensor-tang, @luotao1 Should I remove the use_mkldnn attribute from FC in this PR?

Reviewer (Contributor) replied:

I think there should be some other way to use MKLDNN softmax without changing fc. But since we will implement the MKLDNN FC soon, I think it's fine; you do not need to revert it now.

@jczaja jczaja force-pushed the prv-softmax-mkldnn-operator-PR branch 2 times, most recently from 9df2a09 to 228e0dc Compare March 20, 2018 16:14
removed diagnostic

- Added Unit tests for Softmax MKLDNN Forward

Added fix for div by 0 in cross_entropy backward

Conflicts:
	paddle/fluid/operators/CMakeLists.txt

- Cosmetic fixes to SoftMax MKLDNN fluid operator

Added missing softmax fluid operator file

Disabled MKLDNN softmax operator by default

Fix to softmax op unittest merge

clang_formatter fixes

- Name change of softmax MKLDNN operator to maintain consistency
  across codebase

- updated comment

fix to comment
@jczaja jczaja force-pushed the prv-softmax-mkldnn-operator-PR branch from 228e0dc to 3b95b55 Compare March 21, 2018 10:13
@tensor-tang tensor-tang merged commit 7260e3a into PaddlePaddle:develop Mar 21, 2018