
[QNN] Conv2D operator #3580

Merged: 1 commit, Sep 4, 2019
Conversation

@anijain2305 (Contributor) commented Jul 18, 2019

Lowering of the QNN Conv2D operation. We break the convolution into 4 terms as described in Option 1 here. Other relevant discussion is at #2351.
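For readers without the linked thread: the decomposition being referred to (summarized here in standard quantization notation, not quoted from the PR) expands each output element of the quantized convolution, with $Q_d, Q_w$ the quantized data and kernel tensors and $z_d, z_w$ their zero points, as

$$
\sum_{r,s,c} (Q_d - z_d)(Q_w - z_w)
= \sum_{r,s,c} Q_d Q_w
\;-\; z_w \sum_{r,s,c} Q_d
\;-\; z_d \sum_{r,s,c} Q_w
\;+\; z_d\, z_w\, K_h K_w C_{in},
$$

where the sums run over the kernel height, kernel width, and input channels. Term 1 is an ordinary int32 conv2d, terms 2 and 3 are reductions over the data and kernel, and term 4 is a compile-time constant; terms 2 through 4 drop out when the corresponding zero points are zero.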

@anijain2305 (Contributor Author) commented Jul 18, 2019

cc @FrozenGene @tqchen @yzhliu

@anijain2305 changed the title from "[QNN] Convolution 2D Implementation." to "[QNN] Conv2D operator" on Jul 18, 2019
@anijain2305 changed the title from "[QNN] Conv2D operator" to "[QNN] WIP - Conv2D operator" on Jul 18, 2019
@anijain2305 force-pushed the qnn_conv2d branch 3 times, most recently from 4f6e6bf to 52a3c2b on July 23, 2019 23:29
Contributor Author

@FrozenGene @jackwish @u99127

While the Requantize and Legalize passes are going through final details, it will be useful to prefetch and look at the QNN conv2d lowering. Please review and let me know your comments.

@anijain2305 changed the title from "[QNN] WIP - Conv2D operator" to "[QNN] Conv2D operator" on Aug 2, 2019
@anijain2305 force-pushed the qnn_conv2d branch 7 times, most recently from c35d314 to 020c123 on August 7, 2019 22:26
Contributor

@zhenhuaw-me left a comment

Thank you for the ping. I suggest an if {NHWC} / elif {NCHW} / else {assert} structure when handling the different memory layouts. Also, I am not sure dividing the compute into 4 terms is optimization-friendly, but that is out of this PR's scope, I think.

P.S. I have only looked into part of this PR; I will recheck once the requantize op is merged. Feel free to ping me.
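A minimal sketch of the dispatch pattern suggested above (illustrative C++ with hypothetical variable names, not the code in this PR):

```cpp
// Dispatch explicitly on the data layout and fail loudly on anything else,
// instead of assuming that "not NCHW" implies NHWC.
if (param->data_layout == "NCHW") {
  // batch size = in_shape[0], input channels = in_shape[1]
} else if (param->data_layout == "NHWC") {
  // batch size = in_shape[0], input channels = in_shape[3]
} else {
  LOG(FATAL) << "Unsupported data layout: " << param->data_layout;
}
```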

docs/langref/relay_op.rst (resolved review comments)
input_scale of the input quantized tensors. The zero point of the output
quantized tensor is 0. By default, the dtype of output is int32. Please also
refer to Requantize operator to understand how to scale back the int32
ouptut to (u)int8.
Contributor

output?

Contributor Author

Thanks :)
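For context, the relationship described in the docstring excerpt above is standard quantization arithmetic (this summary is editorial, not text from the PR): the int32 accumulator carries scale $s_{out} = s_{data} \cdot s_{kernel}$ with zero point 0, and requantize maps it back to (u)int8 roughly as $q_8 = \mathrm{round}(q_{32} \cdot s_{out} / s_{req}) + z_{req}$, where $s_{req}$ and $z_{req}$ are the requested output scale and zero point.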

out_dtype="int32"):
r"""Quantized 2D convolution.

This operator convolves quantized weight with quantized data. The scale of
Contributor

Can it be "convolves quantized data with quantized weight"? I know they are basically the same though...

Contributor Author

Yup, that sounds better.

r"""Quantized 2D convolution.

This operator convolves quantized weight with quantized data. The scale of
the output quantized tensor is the product of the weight_scale and
Contributor

Do we need to settle on one of the terms weight and kernel rather than mixing them?

Contributor Author

Done; standardized on kernel.

@@ -33,36 +34,6 @@

namespace tvm {
namespace relay {

Contributor

If we are moving code like this to headers, would it be better to have a dedicated PR which involves no extra functionality?

src/relay/qnn/op/convolution.cc Show resolved Hide resolved

const auto in_shape = get_shape(0);
int batch_size, in_channels;
// NCHW layout
Contributor

Do we need to check it? Maybe simply if NCHW else if NHWC else assert?

Contributor Author

L354-L358

Contributor

Personally, I'd like to discuss the layout handling a bit (it won't block the PR). I always prefer code like if condition 1: path 1; elif condition 2: path 2; else (condition 3): path 3 or assert, even if the input guarantees that condition 3 won't happen. I think path 1 inside an if section is easier to read, and the final assert catches unexpected typos in the code. Anyway, that is outside the scope of this PR :).


const auto kernel_shape = get_shape(1);
int out_channels, kernel_h, kernel_w;
// OIHW layout
Contributor

Similar to the input layout handling.

Contributor Author

L354-L358

Array<IndexExpr> pad_w({param->padding[1], param->padding[1]});

Array<Array<IndexExpr>> pad_width;
pad_width = {pad_n, pad_c, pad_h, pad_w};
Contributor

Layout check and handling?

Contributor Author

L354-L358

@anijain2305 force-pushed the qnn_conv2d branch 6 times, most recently from bc5b780 to 699e34c on August 9, 2019 00:46
@anijain2305 (Contributor Author)

@u99127 @FrozenGene @jackwish @tqchen This is ready for review.

@anijain2305 (Contributor Author)

@tqchen qnn.conv2d shares the infer-type functionality with nn.conv2d. Therefore, I have moved that piece of code from the .cc file to a header file and converted it into a template function. It can now take Conv2DAttrs or QnnConv2DAttrs. Let me know if that looks ok.
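A rough sketch of what such a shared, templated relation looks like (simplified and illustrative; the real function also handles layouts, groups, dilation, and error reporting):

```cpp
// AttrType is Conv2DAttrs for nn.conv2d and QnnConv2DAttrs for qnn.conv2d.
template <typename AttrType>
bool Conv2DRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
               const TypeReporter& reporter) {
  const auto* param = attrs.as<AttrType>();
  CHECK(param != nullptr);
  // ... infer the output shape and dtype from the data and kernel types ...
  return true;
}
```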

@anijain2305 (Contributor Author)

Pinging again in case this was missed @u99127 @FrozenGene @jackwish @tqchen

Contributor

@zhenhuaw-me left a comment

CI reports that there are merge conflicts, so I am still requesting changes...

I am very glad that we are reaching this; the conditional optimization is really beneficial. Thank you for the impressive work @anijain2305!

src/relay/qnn/op/convolution.cc (resolved review comments)
}
auto reduced_t3 = Sum(Cast(weight, Int(32)), axes_t3, false, false);

// Find the newshape depenging on NCHW/NHWC layout.
Contributor

is it depending?



@anijain2305 force-pushed the qnn_conv2d branch 2 times, most recently from f5cedbd to 434f40d on September 3, 2019 05:25
@anijain2305 (Contributor Author)

@jackwish Thanks for the good words :) I incorporated your comments. Can you please review again?

Contributor

@zhenhuaw-me left a comment

LGTM. Glad to participate in this, thank you for the great work!

src/relay/qnn/util.h (resolved review comments)
@anijain2305 (Contributor Author)

@zhiics @vinx13 As jackwish has approved, can you please review?

Member

@zhiics left a comment

Overall LGTM; I only left a nit comment.

@@ -415,6 +415,71 @@ static inline Expr Full(Expr fill_value,
return CallNode::make(op, {fill_value}, Attrs(attrs), {});
}

static inline Expr Conv2D(Expr data, Expr weight, Array<IndexExpr> strides,
Member

It looks like this is the same as MakeConv2d, right?
If so, should we just keep one signature instead of having the duplication? I am not strongly against it, because this pattern is obviously used in other cases as well.

Contributor Author

Yeah, I kept it to follow the other use cases. I guess this repetition exists because TVM typically wants to avoid header/implementation linking problems. I will keep it as Conv2D for now.

Member

I don't have a strong feeling about this; actually, I'm not sure why we prefer copying this (and the others) here instead of adding declarations.

Member

I think linking should be fine. We can put TVM_DLL if needed. But anyway, we can keep it this way for now.
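The alternative mentioned above would look roughly like this (a sketch only; the parameter list is modeled loosely on the inline helper and may not match the actual signature):

```cpp
// Export a single declaration from a header and keep one definition in a .cc
// file, instead of duplicating a static inline Conv2D helper.
TVM_DLL Expr MakeConv2d(Expr data, Expr weight, Array<IndexExpr> strides,
                        Array<IndexExpr> padding, Array<IndexExpr> dilation,
                        int groups, IndexExpr channels, Array<IndexExpr> kernel_size,
                        std::string data_layout, std::string kernel_layout,
                        std::string out_layout, DataType out_dtype);
```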


/*!
* Copyright (c) 2019 by Contributors
* \file nn.cc
Member

wrong file name.

*/
WorkloadType GetWorkload(const Array<tvm::relay::Type>& arg_types, const QnnConv2DAttrs* param) {
// Get conv parameters.
auto get_shape = [&](const Type& type) {
Member

we don't actually need to capture anything here, right?

// Since, this is integer division (floor), we can first multiply the data by the pool_size and
// then perform avg_pool2d. Reversing this causes inaccuracy due to floor division.
auto scaled_hw_t2 = Multiply(casted_t2, MakeConstantScalar(Int(32), kernel_h * kernel_w));
Array<IndexExpr> padding;
Member

Can we just use Array<IndexExpr> padding({0, 0})?
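As a concrete illustration of the floor-division comment quoted in the excerpt above (an editorial example, not part of the PR): for a 2x2 window whose elements sum to 7, averaging first gives floor(7/4) = 1 and multiplying back yields 4, whereas multiplying each element by 4 first gives a window sum of 28, and the average pool then returns 28/4 = 7, the exact sum.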

from tvm.relay.testing import create_workload
from tvm.contrib import graph_runtime

def run_infer_type(expr):
Member

run_infer_type could be obtained from tvm.relay.testing as well.


tests/python/relay/test_qnn_conv2d.py (outdated comments, resolved)
tests/python/relay/test_qnn_conv2d.py (resolved review comments)
Rebasing. Empty commit.

Clang-format styling.
Member

@zhiics left a comment

LGTM

@zhiics zhiics merged commit 0d4870c into apache:master Sep 4, 2019
MarisaKirisame pushed a commit to MarisaKirisame/tvm that referenced this pull request Sep 7, 2019
wweic pushed a commit to wweic/tvm that referenced this pull request Sep 16, 2019
wweic pushed a commit to wweic/tvm that referenced this pull request Sep 16, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Sep 16, 2019