use elementwise to optimize gelu forward implementation on GPU #38188

Merged
merged 3 commits into PaddlePaddle:develop from the gelu_opt branch on Dec 21, 2021

Conversation

Zjq9409
Contributor

@Zjq9409 Zjq9409 commented Dec 16, 2021

PR types

Performance optimization

PR changes

OPs

Describe

Use elementwise to optimize the GPU forward computation of the gelu operator. The forward operator performance data is as follows:

[image: forward performance comparison data]
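
For context on the approach: the forward pass is expressed as a single elementwise functor and handed to the elementwise CUDA launcher, rather than going through the previous per-operator implementation. A minimal sketch of the erf-based (approximate = false) functor is shown below; the functor name and the use of details::MPTypeTrait to promote fp16 to float mirror the snippets quoted later in this review, and the exact merged code may differ.

// Sketch only: exact gelu, gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))).
// MPTypeTrait promotes fp16 inputs to float so the math runs in full precision.
template <typename T>
struct GeluWithoutApproximateFunctor {
  using MPType = typename details::MPTypeTrait<T>::Type;
  inline HOSTDEVICE T operator()(T arg_x) {
    MPType x = static_cast<MPType>(arg_x);
    MPType erf_out = erf(x * static_cast<MPType>(M_SQRT1_2));  // erf(x / sqrt(2))
    return static_cast<T>(x * static_cast<MPType>(0.5) *
                          (static_cast<MPType>(1) + erf_out));
  }
};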

@paddle-bot-old

Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@Zjq9409 Zjq9409 changed the title from "relu forward opt" to "Use elementwise to optimize gelu implementation on GPU" on Dec 16, 2021
std::vector<const framework::Tensor*> ins;
std::vector<framework::Tensor*> outs;
ins = {in};
outs = {out};
Contributor

The vectors can be initialized when they are created.

Contributor Author

Done.
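
Applied, the declaration and initialization collapse into single statements (a sketch of what the suggestion amounts to, not the exact merged diff):

std::vector<const framework::Tensor*> ins = {in};   // initialized at creation
std::vector<framework::Tensor*> outs = {out};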

@Zjq9409 Zjq9409 changed the title from "Use elementwise to optimize gelu implementation on GPU" to "Use elementwise to optimize gelu forward implementation on GPU" on Dec 20, 2021
};

template <typename T>
struct GeluNoApproximateFunctor {
Contributor

Change the name so it corresponds to the one above; use "without".

Contributor Author

Done.

template <typename DeviceContext, typename T>
typename std::enable_if<
std::is_same<DeviceContext, platform::CUDADeviceContext>::value>::type
default_gelu_fw(const framework::ExecutionContext& ctx,
Contributor

There is no need to write a new function; just specializing a CUDA version of GeluKernel is enough.

Contributor Author

Done

using MT = typename details::MPTypeTrait<T>::Type;
inline HOSTDEVICE T operator()(T x) {
// this function is tanh approximation of gelu
MT mx = static_cast<MT>(x);
Contributor

The naming here can follow the naming style used in activation_op.cu.

Contributor Author

Done.
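
For reference, a sketch of the tanh-approximation functor once the names follow the activation_op.cu convention (MPType for the promoted compute type, the input cast once at the top). The constants come from the standard approximation gelu(x) ≈ 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))); the exact merged code may differ.

template <typename T>
struct GeluWithApproximateFunctor {
  using MPType = typename details::MPTypeTrait<T>::Type;
  inline HOSTDEVICE T operator()(T arg_x) {
    // tanh approximation of gelu
    MPType x = static_cast<MPType>(arg_x);
    MPType one = static_cast<MPType>(1);
    MPType half = static_cast<MPType>(0.5);
    MPType kAlpha = static_cast<MPType>(M_2_SQRTPI * M_SQRT1_2);  // sqrt(2 / pi)
    MPType tanh_out =
        tanh(kAlpha * x * (one + static_cast<MPType>(0.044715) * x * x));
    return static_cast<T>(x * half * (one + tanh_out));
  }
};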

@Zjq9409 Zjq9409 changed the title from "Use elementwise to optimize gelu forward implementation on GPU" to "use elementwise to optimize gelu forward implementation on GPU" on Dec 20, 2021
@Zjq9409 Zjq9409 force-pushed the gelu_opt branch 2 times, most recently from 107f8ed to dac385f, on December 20, 2021 at 11:55
};

template <typename DeviceContext, typename T>
class GeluCUDAKernel : public framework::OpKernel<T> {
Contributor

For the specialization there is no need to change the name; keep using GeluKernel, and use CUDADeviceContext for the DeviceContext template parameter.

Contributor Author

Done.
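
Putting the review comments together, a sketch of the resulting kernel: the class keeps the name GeluKernel and is partially specialized with CUDADeviceContext as the DeviceContext template argument, then dispatches to one of the two functors through the elementwise launcher. The launcher name LaunchSameDimsElementwiseCudaKernel and its argument order are assumptions based on Paddle's elementwise utilities of that period, so the merged code may differ in detail.

template <typename T>
class GeluKernel<platform::CUDADeviceContext, T>
    : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& context) const override {
    auto* in = context.Input<framework::Tensor>("X");
    auto* out = context.Output<framework::Tensor>("Out");
    auto approximate = context.Attr<bool>("approximate");
    out->mutable_data<T>(in->place());

    std::vector<const framework::Tensor*> ins = {in};
    std::vector<framework::Tensor*> outs = {out};
    const auto& dev_ctx =
        context.template device_context<platform::CUDADeviceContext>();
    if (approximate) {
      // Assumed launcher name and signature; see the note above.
      LaunchSameDimsElementwiseCudaKernel<ElementwiseType::kUnary, T, T>(
          dev_ctx, ins, &outs, GeluWithApproximateFunctor<T>());
    } else {
      LaunchSameDimsElementwiseCudaKernel<ElementwiseType::kUnary, T, T>(
          dev_ctx, ins, &outs, GeluWithoutApproximateFunctor<T>());
    }
  }
};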

Contributor

@ZzSean ZzSean left a comment

LGTM

@ZzSean ZzSean merged commit aff4368 into PaddlePaddle:develop Dec 21, 2021