[Relay] Improve reduction op layout propagation for packed input #9253

masahi · 2021-10-11T21:51:46Z

Addresses the issue I raised in #9048 (comment).

Previously, layout_transform was always inserted before reduction ops when the input was in a packed layout:

fn (%x: Tensor[(1, 56, 56, 64), float32], %weight1: Tensor[(32, 64, 3, 3), float32]) -> Tensor[(1, 56, 56, 1), float32] {
  %0 = layout_transform(%x, src_layout="NHWC", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;
  %1 = nn.conv2d(%0, %weight1, padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NCHW16c") /* ty=Tensor[(1, 2, 56, 56, 16), float32] */;
  %2 = layout_transform(%1, src_layout="NCHW16c", dst_layout="NCHW")
  relay.sum(%2, axis=[1], keepdims=True)
}

With this PR, layout_transform is moved to after the reduction op:

fn (%x: Tensor[(1, 56, 56, 64), float32], %weight1: Tensor[(32, 64, 3, 3), float32]) -> Tensor[(1, 56, 56, 1), float32] {
  %0 = layout_transform(%x, src_layout="NHWC", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;
  %1 = nn.conv2d(%0, %weight1, padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NCHW16c") /* ty=Tensor[(1, 2, 56, 56, 16), float32] */;
  %2 = sum(%1, axis=[1, 4], keepdims=True) /* ty=Tensor[(1, 1, 56, 56, 1), float32] */;
  layout_transform(%2, src_layout="NCHW1c", dst_layout="NHWC") /* ty=Tensor[(1, 56, 56, 1), float32] */
}
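As a sanity check on the rewrite above, here is a small pure-Python sketch (not code from this PR) showing that summing over axes [1, 4] of a packed NCHWc tensor and then dropping the size-1 packed axis gives the same values as unpacking first and summing over axis 1. The sizes are tiny stand-ins chosen for illustration, and tensors are modelled as dicts keyed by index tuples rather than real TVM arrays.

```python
from itertools import product

# Tiny illustrative sizes: C channels split into C // c blocks of c lanes.
N, C, H, W, c = 1, 4, 2, 2, 2

# Unpacked NCHW tensor stored as {index tuple: value}.
nchw = {
    idx: float(i)
    for i, idx in enumerate(product(range(N), range(C), range(H), range(W)))
}

# Pack to NCHW{c}c: channel ch lives at (block ch // c, lane ch % c).
packed = {(n, ch // c, h, w, ch % c): v for (n, ch, h, w), v in nchw.items()}

# Old order: work on the unpacked tensor, sum over axis 1 with keepdims=True.
sum_unpacked = {}
for (n, ch, h, w), v in nchw.items():
    key = (n, 0, h, w)
    sum_unpacked[key] = sum_unpacked.get(key, 0.0) + v

# New order: sum over axes [1, 4] of the packed tensor with keepdims=True ...
sum_packed = {}
for (n, blk, h, w, lane), v in packed.items():
    key = (n, 0, h, w, 0)
    sum_packed[key] = sum_packed.get(key, 0.0) + v

# ... then drop the size-1 packed axis (NCHW1c -> NCHW).
after_transform = {(n, ch, h, w): v for (n, ch, h, w, _), v in sum_packed.items()}

assert after_transform == sum_unpacked
```

The final transform in the real example goes to NHWC rather than NCHW; that is only a permutation of the already-reduced tensor and does not change the values.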

I believe the latter is better because layout_transform can potentially be fused with the ops that follow the reduction, while in the former case layout_transform always happens on the larger, unreduced input. For example, in the efficientnet_v2 model, every layout_transform after a mean op is now fused with other injective ops, as in the dump below. This is much better than the previous situation, where more than 80 standalone layout_transform ops were inserted, one before each mean op.

  %696 = fn (%p0397: Tensor[(1, 80, 29, 29, 4), float32], %p1264: Tensor[(80, 1, 3, 3, 1, 4), float32], %p2220: Tensor[(1, 80, 1, 1, 4), float32], hash="fca663e6ef5a6a5d", data_layout="NCHW4c", kernel_layout="OIHW1i4o", Primitive=1, out_layout="NCHW4c") -> Tensor[(1, 80, 14, 14, 4), float32] {
    %523 = nn.contrib_depthwise_conv2d_NCHWc(%p0397, %p1264, strides=[2, 2], padding=[0, 0, 0, 0], groups=320, channels=320, kernel_size=[3, 3], data_layout="NCHW4c", kernel_layout="OIHW1i4o", out_layout="NCHW4c") /* ty=Tensor[(1, 80, 14, 14, 4), float32] */;
    %524 = add(%523, %p2220) /* ty=Tensor[(1, 80, 14, 14, 4), float32] */;
    %525 = sigmoid(%524) /* ty=Tensor[(1, 80, 14, 14, 4), float32] */;
    multiply(%524, %525) /* ty=Tensor[(1, 80, 14, 14, 4), float32] */
  };
  %697 = %696(%695, meta[relay.Constant][50] /* ty=Tensor[(80, 1, 3, 3, 1, 4), float32] */, meta[relay.Constant][51] /* ty=Tensor[(1, 80, 1, 1, 4), float32] */) /* ty=Tensor[(1, 80, 14, 14, 4), float32] */;
  %698 = fn (%p0396: Tensor[(1, 80, 14, 14, 4), float32], Primitive=1, hash="c4b187e088bcc68d") -> Tensor[(1, 80, 1, 1, 4), float32] {
    mean(%p0396, axis=[2, 3], keepdims=True) /* ty=Tensor[(1, 80, 1, 1, 4), float32] */
  };
  %699 = %698(%697) /* ty=Tensor[(1, 80, 1, 1, 4), float32] */;
  ...
  %707 = fn (%p0449: Tensor[(1, 5, 1, 1, 4), float32], %p1290: Tensor[(320, 20, 1, 1), float32], Primitive=1, hash="829185933c2f8114", src_layout="OIHW", dst_layout="OIHW4i4o") -> Tensor[(80, 5, 1, 1, 4, 4), float32] {
    %701 = sigmoid(%p0449) /* ty=Tensor[(1, 5, 1, 1, 4), float32] */;
    %702 = squeeze(%701, axis=[0, 2, 3]) /* ty=Tensor[(5, 4), float32] */;
    %703 = layout_transform(%702, src_layout="C4c", dst_layout="C") /* ty=Tensor[(20), float32] */;
    %704 = expand_dims(%703, axis=1, num_newaxis=2) /* ty=Tensor[(20, 1, 1), float32] */;
    %705 = multiply(%p1290, %704) /* ty=Tensor[(320, 20, 1, 1), float32] */;
    layout_transform(%705, src_layout="OIHW", dst_layout="OIHW4i4o") /* ty=Tensor[(80, 5, 1, 1, 4, 4), float32] */
  };
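To put a rough number on the "bigger input" remark, the element counts can be compared using the shapes from the conv2d example near the top (not the efficientnet_v2 dump). This is plain arithmetic on the type annotations shown there, not a measurement of runtime.

```python
from math import prod

# Shapes copied from the type annotations in the conv2d example.
conv_out = (1, 2, 56, 56, 16)  # input to layout_transform before this PR
sum_out = (1, 1, 56, 56, 1)    # input to layout_transform after this PR

before = prod(conv_out)
after = prod(sum_out)

print(before, after, before // after)  # 100352 3136 32
```

In that example the transform touches 32 times fewer elements once it runs after the sum, independent of whether it also gets fused.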

The logic to determine the correct layout is slightly complicated; hopefully the comments I added help.
cc @comaniac @yzhliu
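
For readers who want a feel for what "determining the correct layout" involves, the following is a simplified sketch of the idea, not the implementation in this PR. The helper names `split_layout`, `packed_reduce_axes` and `layout_after_reduce` are hypothetical, and the sketch assumes layout strings where an uppercase letter is a primal axis and a lowercase letter with a numeric factor is its packed sub-axis. It also assumes keepdims=True and ignores the other cases the real pass has to handle.

```python
import re

def split_layout(layout):
    """Split e.g. "NCHW16c" into [("", "N"), ("", "C"), ..., ("16", "c")]."""
    return re.findall(r"(\d*)([A-Za-z])", layout)

def packed_reduce_axes(packed_layout, reduced_names):
    """Indices in the packed layout that cover the given primal axes."""
    reduced = {name.upper() for name in reduced_names}
    return [
        i
        for i, (_, name) in enumerate(split_layout(packed_layout))
        if name.upper() in reduced
    ]

def layout_after_reduce(packed_layout, reduced_names):
    """Layout of the keepdims=True result: reduced sub-axes get factor 1."""
    reduced = {name.upper() for name in reduced_names}
    parts = []
    for factor, name in split_layout(packed_layout):
        if name.islower() and name.upper() in reduced:
            factor = "1"  # the packed sub-axis was summed down to size 1
        parts.append(factor + name)
    return "".join(parts)

# Reducing over C in NCHW16c, as in the sum example.
print(packed_reduce_axes("NCHW16c", "C"))   # [1, 4]
print(layout_after_reduce("NCHW16c", "C"))  # NCHW1c

# Reducing over H and W in NCHW4c, as in the efficientnet_v2 mean op.
print(packed_reduce_axes("NCHW4c", "HW"))   # [2, 3]
print(layout_after_reduce("NCHW4c", "HW"))  # NCHW4c
```

The first case reproduces the axis=[1, 4] and src_layout="NCHW1c" seen in the rewritten IR; the second shows that a reduction over unpacked axes leaves the packed layout unchanged.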

comaniac

Makes sense. LGTM. cc @yzhliu, you might want to take a look as well.

masahi · 2021-10-12T09:18:08Z

Thanks @comaniac

masahi added 4 commits October 12, 2021 06:47

wip

70685dd

fixed packed dim size logic

30e0fdf

fixed test

90d1c98

formatting

6b740ab

masahi requested review from anijain2305, areusch, comaniac, jroesch, junrushao, jwfromm, MarisaKirisame, mbrookhart, merrymercy, slyubomirsky, tqchen, vinx13, wweic, yzhliu, zhiics and ZihengJiang as code owners October 11, 2021 21:51

fix compile warning

6c9094d

comaniac approved these changes Oct 11, 2021

masahi merged commit d1967f2 into apache:main Oct 12, 2021

masahi added a commit to Laurawly/tvm-1 that referenced this pull request Oct 14, 2021

[Relay] Improve reduction op layout propagation for packed input (apa…

fa36389

…che#9253) * wip * fixed packed dim size logic * fixed test * formatting * fix compile warning

junrushao mentioned this pull request Nov 2, 2021

Apache TVM v0.8 Release Note Candidate #9416

Closed

ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022

[Relay] Improve reduction op layout propagation for packed input (apa…

a405abb

…che#9253) * wip * fixed packed dim size logic * fixed test * formatting * fix compile warning

ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022

[Relay] Improve reduction op layout propagation for packed input (apa…

345c9b9

…che#9253) * wip * fixed packed dim size logic * fixed test * formatting * fix compile warning
