
[GlobalOptimization] 1x1 filter convolutions not converted to matmul #18710

Open · Max191 opened this issue Oct 7, 2024 · 3 comments · May be fixed by #18736

Max191 (Contributor) commented Oct 7, 2024

The Convert1x1FilterConvToMatmul pass currently fails to apply when there is a non-unit batch (N) dimension. In such cases the transformation is still possible: the N dimension should be folded into the M dimension of the matmul. For example:

```mlir
// RUN: iree-opt --split-input-file -iree-global-opt-convert-1x1-filter-conv2d-to-matmul %s
util.func public @main(%arg0: tensor<2x128x128x960xf16>, %arg1: tensor<1x1x960x320xf16>) -> tensor<2x128x128x320xf32> {
  %cst = arith.constant 0.0 : f32
  %9 = tensor.empty() : tensor<2x128x128x320xf32>
  %10 = linalg.fill ins(%cst : f32) outs(%9 : tensor<2x128x128x320xf32>) -> tensor<2x128x128x320xf32>
  %11 = linalg.conv_2d_nhwc_hwcf {dilations = dense<1> : vector<2xi64>, strides = dense<1> : vector<2xi64>} ins(%arg0, %arg1 : tensor<2x128x128x960xf16>, tensor<1x1x960x320xf16>) outs(%10 : tensor<2x128x128x320xf32>) -> tensor<2x128x128x320xf32>
  util.return %11 : tensor<2x128x128x320xf32>
}
```

The N dimension of 2 can be collapsed into the M dimension along with the H and W dimensions, yielding:

```mlir
%7 = linalg.matmul ins(%3, %4 : tensor<32768x960xf16>, tensor<960x320xf16>) outs(%6 : tensor<32768x320xf32>) -> tensor<32768x320xf32>
```
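For reference, a minimal sketch of the full rewrite being proposed (the SSA names and collapse/expand steps are illustrative, not the pass's actual output):

```mlir
// Collapse N, H, and W of the input into a single M dimension (2*128*128 = 32768).
%in = tensor.collapse_shape %arg0 [[0, 1, 2], [3]]
    : tensor<2x128x128x960xf16> into tensor<32768x960xf16>
// Drop the unit filter dimensions.
%filter = tensor.collapse_shape %arg1 [[0, 1, 2], [3]]
    : tensor<1x1x960x320xf16> into tensor<960x320xf16>
// Collapse the fill the same way as the input.
%init = tensor.collapse_shape %10 [[0, 1, 2], [3]]
    : tensor<2x128x128x320xf32> into tensor<32768x320xf32>
%mm = linalg.matmul ins(%in, %filter : tensor<32768x960xf16>, tensor<960x320xf16>)
    outs(%init : tensor<32768x320xf32>) -> tensor<32768x320xf32>
// Expand the result back to the original NHWC shape.
%result = tensor.expand_shape %mm [[0, 1, 2], [3]] output_shape [2, 128, 128, 320]
    : tensor<32768x320xf32> into tensor<2x128x128x320xf32>
```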

Similarly, for NCHW convolutions, the N dimension of the convolution can be folded into the N dimension of the matmul.

This addresses the TODO here:

```cpp
// TODO(ataei): Support conversion to linalg.batch_matmul.
```

### Dynamic N

The current implementation also bails on dynamic H or W dimensions:

```cpp
// We cannot merge the width and height if they are both dynamic as we
// cannot expand them back to their dynamic values.
if (isInputHWDynamic)
  return failure();
```

This should be possible now that tensor.expand_shape carries its result shape: the dynamic output sizes can be recovered from the expand_shape's output_shape values. It would also be good to support this dynamic case while adding support for non-unit N dimensions.
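For example, a sketch of the dynamic case (assuming the tensor.expand_shape syntax that carries output_shape operands; %mm stands for the collapsed matmul result):

```mlir
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
// Query the dynamic N, H, and W extents of the conv input before collapsing.
%n = tensor.dim %arg0, %c0 : tensor<?x?x?x960xf16>
%h = tensor.dim %arg0, %c1 : tensor<?x?x?x960xf16>
%w = tensor.dim %arg0, %c2 : tensor<?x?x?x960xf16>
// output_shape carries the dynamic sizes, so the collapsed M dimension can
// always be expanded back to N x H x W.
%result = tensor.expand_shape %mm [[0, 1, 2], [3]] output_shape [%n, %h, %w, 320]
    : tensor<?x320xf32> into tensor<?x?x?x320xf32>
```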

IanWood1 linked a pull request Oct 10, 2024 that will close this issue

IanWood1 (Contributor) commented:
For nchw ops, I'm not sure how this would get converted to a matmul because the N dimension of the convolution is not contiguous with the N dimension of the matmul. Does this need a reshape to transpose dims 0 <-> 1? Or (as the TODO comment suggests) does %filter need to be broadcast to match the N dimension of %input and then perform a batch_matmul?

```mlir
%1 = linalg.conv_2d_nchw_fchw {
    dilations = dense<1> : tensor<2xi64>,
    strides = dense<1> : tensor<2xi64>}
    ins(%input, %filter : tensor<5x2x4x5xf32>, tensor<7x2x1x1xf32>) outs(%0 : tensor<5x7x4x5xf32>) -> tensor<5x7x4x5xf32>
```

Max191 (Contributor, Author) commented Oct 11, 2024

> For nchw ops, I'm not sure how this would get converted to a matmul because the N dimension of the convolution is not contiguous with the N dimension of the matmul. Does this need a reshape to transpose dims 0 <-> 1? Or (as the TODO comment suggests) does %filter need to be broadcast to match the N dimension of %input and then perform a batch_matmul?

In this case, I think it is best to create a linalg.generic with multiple N contraction dimensions. Here is an example of something similar:
https://github.com/Max191/iree/blob/a7fd3a33c1b80a599d75e2bdac413f1883f8608d/compiler/src/iree/compiler/Dialect/LinalgExt/Transforms/ConvertConv2DToIm2ColOp.cpp#L282-L308
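One possible reading, as a minimal sketch for the NCHW example above (%in, %flt, and %out are hypothetical collapsed operands, not the pass's actual output): after collapsing H and W and folding away the unit filter dims, n and hw both index only the input and the output, i.e. they are two N dimensions of the contraction, so no transpose or filter broadcast is needed.

```mlir
// Input collapsed from tensor<5x2x4x5xf32>  (N, C, H, W) to (N, C, H*W).
// Filter collapsed from tensor<7x2x1x1xf32> (F, C, 1, 1) to (F, C).
// Output collapsed from tensor<5x7x4x5xf32> (N, F, H, W) to (N, F, H*W).
#in  = affine_map<(n, f, hw, c) -> (n, c, hw)>
#flt = affine_map<(n, f, hw, c) -> (f, c)>
#out = affine_map<(n, f, hw, c) -> (n, f, hw)>
%res = linalg.generic {
    indexing_maps = [#in, #flt, #out],
    iterator_types = ["parallel", "parallel", "parallel", "reduction"]}
    ins(%in, %flt : tensor<5x2x20xf32>, tensor<7x2xf32>)
    outs(%out : tensor<5x7x20xf32>) {
^bb0(%a: f32, %b: f32, %acc: f32):
  %m = arith.mulf %a, %b : f32
  %s = arith.addf %acc, %m : f32
  linalg.yield %s : f32
} -> tensor<5x7x20xf32>
```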

Max191 linked a pull request Oct 11, 2024 that will close this issue

qedawkins (Contributor) commented:

What the conv_* to matmul pass should do is not collapse dimensions; it should simply rewrite the indexing maps. In particular, we have an indexing map of the form

```mlir
affine_map<(d0, d1) -> (<dilation> * d0 + <stride> * d1)>
```

All we are doing is looking at the unit kernel loops (in this case d1) and folding d1 to 0 in the indexing map. Collapsing the op to a matmul should be a separate, best-effort pattern (basically just CollapseDimensions).
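Concretely, a sketch of the map rewrite for conv_2d_nhwc_hwcf (simplified, with strides and dilations of 1; kh and kw are the unit kernel loops):

```mlir
// Before: the input access of the convolution.
affine_map<(n, oh, ow, f, kh, kw, c) -> (n, oh + kh, ow + kw, c)>
// After folding kh = kw = 0, the unit loops become trivial and can be
// dropped, leaving a plain contraction over c on the original shapes.
affine_map<(n, oh, ow, f, c) -> (n, oh, ow, c)>
```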
