-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AUTOTVM][TOPI] Port x86 NCHWc to AutoTVM for Task Extraction #2664
Conversation
topi/python/topi/nn/conv2d.py
Outdated
raise ValueError("missing register for topi.nn.conv2d_NCHWc") | ||
# layout and out_layout are not used here, | ||
# we keep them for debug convenience when dumping autotvm workload | ||
HPAD, WPAD = padding if isinstance(padding, (tuple, list)) else (padding, padding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use pad_top, pad_left, pad_down, pad_right = get_pad_tuple( padding, (dilated_kernel_h, dilated_kernel_w))
then HPAD = pad_top + pad_down
and WPAD = pad_left + pad_right
? This would make this RFC: #2682 easier.
This looks great, thank you. I've had to do something similar in previous branches, this is really helpful to have this functionality upstreamed :) |
@eqy can you on review comment by @FrozenGene . @ajtulloch would be nice if you can https://docs.tvm.ai/contribute/code_review.html#approve-and-request-changes-explicitly |
@@ -40,6 +40,18 @@ def _create_tuning_space(cfg, data, kernel, strides, padding, dilation, layout): | |||
if layout == 'NCHW': | |||
n, ic, h, w = dshape | |||
oc, _, kh, kw = kshape | |||
elif 'NCHW' in layout and 'c' in layout: | |||
n, ic_chunk, h, w, ic_bn = dshape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this do a regex match against "NCHW(\d+)c"
? This seems unnecessarily fragile (though inline with existing style), but will fail for e.g. NCHW4h4w8c which is something I've been experimenting with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very familiar with regex, but this doesn't seem to match "NCHW4h4w8c" unless I change the expression to something like NCHW.+(\d+)c
?
topi/python/topi/x86/conv2d.py
Outdated
@@ -48,6 +60,7 @@ def _create_tuning_space(cfg, data, kernel, strides, padding, dilation, layout): | |||
sh, sw = strides if isinstance(strides, (tuple, list)) else (strides, strides) | |||
oh = (h - kh + 2 * ph) // sh + 1 | |||
ow = (w - kw + 2 * pw) // sw + 1 | |||
cfg.add_flop(2*oh*ow*kh*kw*oc*ic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be necessary after #2776, maybe add a TODO to remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, two minor nits.
Sorry---been tied up with a few odds and ends, I'll make the changes tomorrow. |
tune_relay_x86 tutorial needs to be updated as well? |
It is not broken by this PR, right--iirc it uses |
@eqy can you followup on this ? perhaps do a rebase push to retrigger CI. @merrymercy please take a look and merge it in if it is OK |
looks like it breaks one of the GPU tests, will look into this |
84b582b
to
f662e59
Compare
Thanks, @eqy @ajtulloch , this is merged |
…pache#2664) [AUTOTVM][TOPI] Port x86 NCHWc to AutoTVM for Task Extraction
…pache#2664) [AUTOTVM][TOPI] Port x86 NCHWc to AutoTVM for Task Extraction
…pache#2664) [AUTOTVM][TOPI] Port x86 NCHWc to AutoTVM for Task Extraction
…pache#2664) [AUTOTVM][TOPI] Port x86 NCHWc to AutoTVM for Task Extraction
…pache#2664) [AUTOTVM][TOPI] Port x86 NCHWc to AutoTVM for Task Extraction
The intent for this PR is to add AutoTVM task extraction to graphs with NCHWc ops directly without needing to convert from NCHW to NCHWc manually. However, the current style of search space definition in x86 NCHWc is only suitable when we do exhaustive search. This is because the search space definition seems to override any information from graph-level optimization about data layout options, and just tries every option. Currently this works as every option is tried, but we may want more fine-grained options in the future. In other words, currently we can solve the problem of "try every variant of NCHW[x]c for a given workload," but we do not really support "try every variant of NCHW8c a given workload" efficiently.
Minor issue: TOPI does not have a default declaration for the conv2d_NCHWc op (which seems to be necessary for the tracing portion of task extraction), so there is a copy-pasted from from x86 in this PR.