
[TFLite] Convert TFLite NCHW to NHWC #3141

Merged: 2 commits into apache:master on May 22, 2019

Conversation

FrozenGene
Member

Background RFC: #2519

As discussed in the RFC above, we agreed to change the TFLite frontend input layout from NCHW to NHWC. This PR does that.

Affected:
We cannot use AutoTVM to tune on ARM CPU, because the ARM CPU schedule currently implements only NCHW. We should add SpatialPack + NHWC for conv2d / depthwise convolution on ARM CPU.
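For clarity, the layout change itself is just an axis permutation. A minimal NumPy sketch (shapes are illustrative only) of converting NCHW to NHWC and back:

```python
import numpy as np

# NCHW tensor: batch, channels, height, width
nchw = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)

# NCHW -> NHWC moves the channel axis to the end
nhwc = nchw.transpose(0, 2, 3, 1)
assert nhwc.shape == (2, 4, 5, 3)

# NHWC -> NCHW is the inverse permutation; the round trip is lossless
restored = nhwc.transpose(0, 3, 1, 2)
assert np.array_equal(restored, nchw)
```

The frontend change avoids paying this permutation at runtime by keeping the graph in the layout TFLite already uses.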

@srkreddy1238

cc: @songqun @gomida @ariwaranosai @tqchen @yzhliu

@FrozenGene
Member Author

ping @srkreddy1238

@srkreddy1238
Contributor

@FrozenGene overall looks good to me.

Trigger the CI and I will have another look.

@FrozenGene
Member Author

@srkreddy1238 CI is green now.

@FrozenGene
Member Author

@srkreddy1238 Could you have a look now? Thanks.

@srkreddy1238
Contributor

Thanks @FrozenGene. LGTM.

srkreddy1238 merged commit b63267b into apache:master on May 22, 2019
@apivovarov
Contributor

apivovarov commented Jun 7, 2019

I tried to compile a model for ARM Mali GPU. Compilation failed in topi/arm_cpu/conv2d.py: _decl_spatial_pack asserts "Only support NCHW".

target = tvm.target.mali('rk3399')
target_host = 'llvm -target=aarch64-linux-gnu'
Traceback (most recent call last):
  File "./compile.py", line 83, in <module>
    graph, lib, params = relay.build(func, target, target_host=target_host, params=params)
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/relay/build_module.py", line 284, in build
    params)
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/relay/build_module.py", line 112, in build
    self._build(func, target, target_host)
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 209, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x5a08e1) [0x7f89de2fc8e1]
  [bt] (7) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x59f8df) [0x7f89de2fb8df]
  [bt] (6) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x5a492e) [0x7f89de30092e]
  [bt] (5) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x5a46d3) [0x7f89de3006d3]
  [bt] (4) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x59a024) [0x7f89de2f6024]
  [bt] (3) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x5a6897) [0x7f89de302897]
  [bt] (2) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x59a024) [0x7f89de2f6024]
  [bt] (1) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x5a64f6) [0x7f89de3024f6]
  [bt] (0) /usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/libtvm.so(+0x95d17b) [0x7f89de6b917b]
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 71, in cfun
    rv = local_pyfunc(*pyargs)
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/relay/op/nn/_nn.py", line 122, in compute_conv2d
    dilation, layout, out_dtype=out_dtype)
  File "</usr/local/lib/python3.5/dist-packages/decorator-4.3.2-py3.5.egg/decorator.py:decorator-gen-19>", line 2, in conv2d
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/target.py", line 372, in dispatch_func
    return dispatch_dict[k](*args, **kwargs)
  File "</usr/local/lib/python3.5/dist-packages/decorator-4.3.2-py3.5.egg/decorator.py:decorator-gen-162>", line 2, in config_dispatcher
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/task/dispatcher.py", line 215, in dispatch_func
    return dispatch_dict['direct'](cfg, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tvm-0.6.dev0-py3.5-linux-x86_64.egg/tvm/autotvm/task/topi_integration.py", line 344, in template_call
    node = f(cfg, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/topi-0.6.dev0-py3.5.egg/topi/mali/conv2d.py", line 72, in conv2d_mali
    num_tile=3)
  File "/usr/local/lib/python3.5/dist-packages/topi-0.6.dev0-py3.5.egg/topi/arm_cpu/conv2d.py", line 134, in _decl_spatial_pack
    assert layout == "NCHW", "Only support NCHW"
AssertionError: Only support NCHW

@apivovarov
Contributor

apivovarov commented Jun 7, 2019

The same issue exists on ARM CPU, e.g. tvm.target.arm_cpu('rk3399')

  File "/usr/local/lib/python3.5/dist-packages/topi-0.6.dev0-py3.5.egg/topi/arm_cpu/conv2d.py", line 76, in conv2d_arm_cpu
    num_tile=2)
  File "/usr/local/lib/python3.5/dist-packages/topi-0.6.dev0-py3.5.egg/topi/arm_cpu/conv2d.py", line 134, in _decl_spatial_pack
    assert layout == "NCHW", "Only support NCHW"
AssertionError: Only support NCHW

@FrozenGene
Member Author

@apivovarov As described in this thread, this PR affects auto-tuning. We need to add spatial_pack for NHWC.

@apivovarov
Contributor

apivovarov commented Jun 7, 2019

Why was this PR merged before adding spatial_pack for NHWC?
This PR affects not only auto-tuning; it affects core TVM functionality: compilation.
I cannot even compile a simple MobileNet TFLite model for arm_mali or arm_cpu now.
Is there any workaround that would let me compile a TFLite model for ARM?
Should we add a layout=NHWC/NCHW parameter similar to what from_tensorflow does?

@FrozenGene
Member Author

This PR doesn't break functionality in general; we can still compile for the generic / x86 CPU targets. If you compile a TensorFlow model (not a TFLite model) for ARM CPU, you will get the same result. Please see the background discussion for this PR (#2519), where I listed the advantages / disadvantages. I understand your concern, which is why I originally supported TFLite in NCHW layout; however, making the TFLite frontend NHWC was also discussed before, and we knew what it would affect. Supporting spatial pack NHWC is not difficult; you could modify it yourself for now.

@FrozenGene
Member Author

FrozenGene commented Jun 7, 2019

Why was this PR merged before adding spatial_pack for NHWC?
This PR affects not only auto-tuning; it affects core TVM functionality: compilation.
I cannot even compile a simple MobileNet TFLite model for arm_mali or arm_cpu now.
Is there any workaround that would let me compile a TFLite model for ARM?
Should we add a layout=NHWC/NCHW parameter similar to what from_tensorflow does?

@apivovarov
I think you could revert this PR if you want to auto-tune on ARM CPU, because I originally supported TFLite in NCHW and handled the layout logic internally.

I don't think we should add layout=NHWC/NCHW, because it would introduce transpose ops and hurt performance. We should support spatial pack NHWC instead.

@kevinthesun
Contributor

kevinthesun commented Jun 7, 2019

@FrozenGene Will the data-related transpose ops only appear at the beginning and the end of the network? If we keep only the necessary layout transformations, the performance shouldn't be affected. This is probably a more maintainable approach than adding another layout optimization in topi and AutoTVM.

@apivovarov
Contributor

I noticed that the tensorflow/test_forward.py unit tests run against two targets, cuda and llvm:
https://github.com/dmlc/tvm/blob/master/tests/python/frontend/tensorflow/test_forward.py#L1273

But the tflite/test_forward.py tests run only against llvm:
https://github.com/dmlc/tvm/blob/master/tests/python/frontend/tflite/test_forward.py#L137

@FrozenGene
Member Author

@FrozenGene Will the data-related transpose ops only appear at the beginning and the end of the network? If we keep only the necessary layout transformations, the performance shouldn't be affected. This is probably a more maintainable approach than adding another layout optimization in topi and AutoTVM.

@kevinthesun No. They will appear before and after every convolution op, because, like our TensorFlow implementation, we simply insert a transpose before and after each conv. I think we should add spatial pack NHWC for ARM CPU.
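The pattern being described, where an NHWC graph wraps every conv in a transpose pair so an NCHW-only kernel can run, can be sketched like this (conv2d_nchw here is a hypothetical identity placeholder, not TVM's operator):

```python
import numpy as np

def conv2d_nchw(x):
    """Hypothetical stand-in for an NCHW-only convolution kernel."""
    return x  # identity, so only the layout handling is visible

def conv_in_nhwc_graph(x_nhwc):
    # transpose inserted before the conv: NHWC -> NCHW
    x_nchw = x_nhwc.transpose(0, 3, 1, 2)
    y_nchw = conv2d_nchw(x_nchw)
    # transpose inserted after the conv: NCHW -> NHWC
    return y_nchw.transpose(0, 2, 3, 1)

x = np.random.rand(1, 8, 8, 3).astype(np.float32)
y = conv_in_nhwc_graph(x)
assert y.shape == (1, 8, 8, 3)
assert np.array_equal(y, x)  # identity conv: data unchanged, two transposes paid
```

With one such pair around every convolution, the transpose cost scales with network depth, which is the performance concern behind preferring a native NHWC schedule.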

@apivovarov The reason is that NCHW is better for the CUDA target.

@kevinthesun
Contributor

@FrozenGene Got it. In the long term we can add a graph-level layout transformation pass. For now we can replace conv2d NHWC with conv2d NCHW in conv2d_alter_layout. This should achieve better performance than the original implementation.

@FrozenGene
Member Author

FrozenGene commented Jun 8, 2019

@FrozenGene Got it. In the long term we can add a graph-level layout transformation pass. For now we can replace conv2d NHWC with conv2d NCHW in conv2d_alter_layout. This should achieve better performance than the original implementation.

However, this would affect auto-tuning. In fact, we already have spatial_pack NHWC and spatial_pack NCHW for bitserial_conv2d; please see: https://github.com/dmlc/tvm/blob/master/topi/python/topi/nn/bitserial_conv2d.py#L212-L318 Their logic is very similar, and we could support ARM CPU conv2d / depthwise_conv2d in NHWC / NCHW the same way.
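As a rough illustration of what "spatial pack" means (tile sizes and axis order here are illustrative, not what the TVM templates actually choose), a NumPy sketch of packing an NCHW tensor into small tiles for vectorized inner loops, then unpacking it:

```python
import numpy as np

n, c, h, w = 1, 8, 4, 4   # illustrative NCHW shape
vh, vw, vc = 2, 2, 4      # spatial and channel tile sizes (must divide h, w, c)

data = np.arange(n * c * h * w, dtype=np.float32).reshape(n, c, h, w)

# Pack: NCHW -> (n, c//vc, h//vh, w//vw, vh, vw, vc), grouping each
# vh x vw x vc block into the innermost axes for the compute kernel
packed = (data.reshape(n, c // vc, vc, h // vh, vh, w // vw, vw)
              .transpose(0, 1, 3, 5, 4, 6, 2))
assert packed.shape == (1, 2, 2, 2, 2, 2, 4)

# Unpack: the inverse permutation restores the original layout exactly
unpacked = packed.transpose(0, 1, 6, 2, 4, 3, 5).reshape(n, c, h, w)
assert np.array_equal(unpacked, data)
```

The NHWC variant would start from an (n, h, w, c) tensor and permute into an analogous tiled layout, which is why the NCHW and NHWC spatial pack templates can share most of their logic.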

wweic pushed a commit to wweic/tvm that referenced this pull request Jun 26, 2019
* Convert TFLite NCHW to NHWC

* Minor comment fix
wweic pushed a commit to neo-ai/tvm that referenced this pull request Jun 27, 2019
* Convert TFLite NCHW to NHWC

* Minor comment fix
@chenbiaolong

@FrozenGene @apivovarov any update on NHWC support for ARM CPU? I also think we need more TFLite examples for ARM CPU, like the networks in arm_cpu_imagenet_bench.py.

@FrozenGene
Member Author

FrozenGene commented Jul 25, 2019

@chenbiaolong I don't have spare time for this currently. I hope someone can help with it; I could help review. I think it is not difficult. You can refer to https://github.com/dmlc/tvm/blob/master/topi/python/topi/nn/bitserial_conv2d.py#L212-L318 for how to do it.

@FrozenGene
Member Author

CC @ajtulloch if you have any interest and time.

@tmoreau89
Contributor

@FrozenGene has anyone followed up on NHWC AutoTVM templates for x86/ARM? If not, I can give that a stab.

@FrozenGene
Member Author

Hi @tmoreau89, my colleague @jackwish is working on ARM and will open a PR soon. You are welcome to take x86. :-)

@zhenhuaw-me
Contributor

Hi @tmoreau89 and @FrozenGene, I've drafted the ARM NHWC schedule in PR #3859.

@tmoreau89
Contributor

Nice, thanks for sharing your draft @jackwish. I have given x86 support a try in this WIP PR: #3858

@zhenhuaw-me
Contributor

zhenhuaw-me commented Aug 30, 2019 via email
