[Topi] Tensorcore support for Conv3D #5284
Conversation
@Shawn-Inspur, @Laurawly could you take a look at this PR?
also cc @icemelon9 since it is related to strategy
Anyone know what the deal with the CI failure is? Seems to work fine on my local branch and has nothing to do with this PR.
The sphinx error is a known flaky case that we should look into; please push to retrigger.
@yangjunpro @minminsun Please also have a look, since you have worked on tensorcore support before.
LGTM. just some minor comments
Nice work.
Just my two cents: the schedule looks a little complicated.
Have you considered enriching the Auto TensorCore codegen pass to automate the TensorCore schedule generation process?
The previous work on Auto TensorCore codegen only covers GEMM, and it would require additional work to add conv support. However, adding conv support to the Auto TensorCore codegen pass might make the convolution tensorcore optimization more generic.
@yangjunpro, funnily enough I asked nearly the same question to the authors of the conv2d tensorcore schedule. You can read their answer here. The quick take is that it does not support conv (like you mentioned) but, more importantly, that it causes significant performance regression compared to a bespoke approach like this one.
LGTM
Thanks @jwfromm @Laurawly @Shawn-Inspur @yangjunpro @icemelon9 !
* one weird trick. * Added schedule knob for different workloads. * Initial conv3d tensorcore working. * Added conv3d tensorcore strategy. * Added layout conversion to tensorcore friendly format for conv2d and conv3d. * Add target name check. * Fixed bad names and depthwise check. * Removed duplicated attribute assignment.
This PR is a pretty direct port of the conv2d tensorcore schedules introduced in PR #5099 to conv3d. In my early testing I've found this new schedule to be up to 10X faster than the default conv3d schedule. I also snuck one little adjustment into the conv3d winograd schedule that helps for smaller workloads. Given that tensorcore support is currently only for `NHWC` and `NDHWC` layouts, I've also added support for converting to these layouts using the `ConvertLayout` pass.
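For readers unfamiliar with the layout requirement, here is a minimal sketch of what the conversion amounts to at the data level. This is an illustrative numpy helper, not TVM's actual `ConvertLayout` implementation (which rewrites the Relay graph rather than transposing tensors at runtime); the function name and shapes are assumptions for the example.

```python
import numpy as np

def ncdhw_to_ndhwc(data):
    """Illustrative permutation from NCDHW to the tensorcore-friendly
    NDHWC layout: (N, C, D, H, W) -> (N, D, H, W, C)."""
    return np.transpose(data, (0, 2, 3, 4, 1))

# Example: a batch of 1, 16 channels, 8x32x32 spatial volume, in fp16
# (tensorcores operate on half-precision inputs).
x = np.zeros((1, 16, 8, 32, 32), dtype="float16")
y = ncdhw_to_ndhwc(x)
print(y.shape)  # -> (1, 8, 32, 32, 16)
```

In the PR itself the equivalent graph-level rewrite is handled by the `ConvertLayout` pass, so models authored in `NCDHW` can still dispatch to the tensorcore schedules.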