[CodeGen][CUDA] Vectorization for intrinsics #5101
Conversation
Fixing missing features exposed by #4968
This seems like a great change! Have you done any tests on how it affects performance? I'd love to know how much this speeds things up.
I measured the vectorization benefit on a vector-add micro-benchmark in PR #4968. The speedup can be as high as 20%+. This PR adds a feature that #4968 needs.
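To make the benefit concrete, here is a host-side C++ sketch (hypothetical, not TVM's actual generated code) of what vectorized codegen does for the vector-add case: one `float4` load/store moves four elements per memory transaction instead of four scalar accesses, which is where the measured speedup comes from.

```cpp
#include <cstddef>

// Stand-in for CUDA's built-in float4 vector type.
struct float4 { float x, y, z, w; };

// Vectorized element-wise add: process four floats per iteration via
// float4 loads/stores (CUDA-style reinterpret casts; pointers are assumed
// 16-byte aligned and n divisible by 4).
void vec_add(const float* a, const float* b, float* c, int n) {
    for (int i = 0; i < n; i += 4) {
        float4 va = *reinterpret_cast<const float4*>(a + i);
        float4 vb = *reinterpret_cast<const float4*>(b + i);
        float4 vc{va.x + vb.x, va.y + vb.y, va.z + vb.z, va.w + vb.w};
        *reinterpret_cast<float4*>(c + i) = vc;
    }
}
```

On a GPU the same pattern compiles to 128-bit `ld.global.v4`/`st.global.v4` instructions, which is what the coalesced-access speedup in the micro-benchmark reflects.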
//
// Emit an unsupported vector call
//
// v = intrin_f((float4*)A[0], (float4*)B[0])
do you mean intrin_f(((float4*)A)[0], ((float4*)B)[0])?
That is the CallNode representation, which is not supported in CUDA. We are going to emit a few scalar calls here instead.
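A sketch of that fallback (an illustration, not the actual TVM emitter): when a math intrinsic has no CUDA vector overload, the codegen keeps the vectorized load/store but calls the scalar intrinsic once per lane. Here `erf_f4` is a hypothetical name for such an emitted helper.

```cpp
#include <cmath>

// Stand-in for CUDA's built-in float4 vector type.
struct float4 { float x, y, z, w; };

// There is no erf(float4) in CUDA, so the vector CallNode is lowered to
// four scalar intrinsic calls, one per lane.
float4 erf_f4(float4 v) {
    float4 r;
    r.x = std::erf(v.x);
    r.y = std::erf(v.y);
    r.z = std::erf(v.z);
    r.w = std::erf(v.w);
    return r;
}
```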
- This allows emitting vectorized loads/stores for CUDA math intrinsics.
- A few intrinsics should be lowered as CUDAMath, not CUDAFastMath, ones.
- Fixed the code block indentation.
Thanks @wpan11nv, this is merged.
This allows emitting vectorized loads/stores for CUDA math intrinsics. Fixed a few intrinsics that should be lowered as CUDAMath, not CUDAFastMath, ones.
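The CUDAMath/CUDAFastMath distinction can be illustrated with a small sketch (hypothetical helper, not TVM's actual lowering table): CUDA's fast-math approximations use a `__` prefix (e.g. `__expf`), but they exist only for a subset of operations and trade accuracy for speed, so intrinsics without a fast variant must lower to the plain CUDA math call.

```cpp
#include <string>

// Hypothetical illustration of the lowering choice: ops with a CUDA
// fast-math variant may emit the "__"-prefixed approximation; ops without
// one (e.g. erff) must emit the accurate CUDA math function.
std::string lower_intrinsic(const std::string& op, bool has_fast_variant) {
    return has_fast_variant ? "__" + op : op;
}
```

For example, `expf` has the fast variant `__expf`, while `erff` does not and must stay `erff`.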