[TOPI][x86] Injective schedule improvement #4786

Merged 3 commits on Feb 4, 2020

@anijain2305 (Contributor) commented Jan 29, 2020

While working on quantized MobileNet V2, I saw that the pad operator was taking around 25% of the total time on a Cascade Lake machine. This PR optimizes the injective schedule by vectorizing it.

Results for the following test:

Before PR - 80 us
After PR - 5 us

import numpy as np
import tvm
from tvm import relay
from tvm.relay.op import register_pattern, OpPattern
from tvm.contrib import graph_runtime
from tvm.contrib.debugger import debug_runtime

dtype = 'uint8'
dshape = (1, 6, 114, 114, 16)

# Pad axes 2 and 3 by one element on each side; the other axes are unchanged.
x1 = relay.var("x1", shape=dshape, dtype=dtype)
x2 = relay.nn.pad(x1, pad_width=((0, 0), (0, 0), (1, 1), (1, 1), (0, 0)))

func = relay.Function([x1], x2)
mod = relay.Module.from_expr(func)

# Build for a Cascade Lake CPU.
with relay.build_config(opt_level=3):
    graph, lib, params = relay.build(mod, target="llvm -mcpu=cascadelake")

ctx = tvm.cpu()
# The debug runtime reports per-operator timings; use graph_runtime for a plain run.
# module = graph_runtime.create(graph, lib, ctx)
module = debug_runtime.create(graph, lib, ctx)
module.run()
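
For reference, the shape arithmetic behind this test can be checked without TVM. The sketch below is plain Python; `padded_shape` is a hypothetical helper, not part of the TVM API, and it assumes only the usual padding convention that each axis grows by its before and after widths.

```python
def padded_shape(shape, pad_width):
    """Return the output shape after padding each axis by (before, after)."""
    if len(shape) != len(pad_width):
        raise ValueError("pad_width needs one (before, after) pair per axis")
    return tuple(dim + before + after
                 for dim, (before, after) in zip(shape, pad_width))


# Same input shape and pad widths as the test above.
dshape = (1, 6, 114, 114, 16)
pad_width = ((0, 0), (0, 0), (1, 1), (1, 1), (0, 0))
out_shape = padded_shape(dshape, pad_width)
print(out_shape)
```

Under that assumption the padded tensor has shape (1, 6, 116, 116, 16): only axes 2 and 3 grow, and the innermost axis keeps its fixed extent of 16.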

@yzhliu @vinx13 @shoubhik @yidawang please review

@anijain2305 changed the title from "[TOPI][x86] Pad schedule improvment." to "[WIP] [TOPI][x86] Pad schedule improvement" on Jan 29, 2020
@anijain2305 (Contributor Author) commented:

Please do not merge yet; I am running a few more performance tests.

@anijain2305 (Contributor Author) commented:

Update: an interesting observation. Although the standalone pad operator sees a large speedup with this PR, the operators that follow pad see a consistent slowdown in the original graph. I think the reason is that h and w are spread across cores, causing data transfer issues for the second operator.

I will try a few more options. If nothing works, I will close the PR.

@anijain2305 changed the title from "[WIP] [TOPI][x86] Pad schedule improvement" to "[WIP] [TOPI][x86] Injective schedule improvement" on Feb 4, 2020
@anijain2305 (Contributor Author) commented:

@yzhliu @tqchen

One test fails with this change:

E           tvm._ffi.base.TVMError: Traceback (most recent call last):
E             [bt] (8) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::NodeFunctor<tvm::tir::Stmt (tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)>::operator()(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*) const+0xf3) [0x7f9849db3153]
E             [bt] (7) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)#9}::__invoke(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)+0x13) [0x7f9849db5283]
E             [bt] (6) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtMutator::VisitStmt_(tvm::tir::ProducerConsumerNode const*)+0x25) [0x7f984a041eb5]
E             [bt] (5) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtMutator::VisitStmt(tvm::tir::Stmt const&)+0x2b) [0x7f9849db2c6b]
E             [bt] (4) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::VisitStmt(tvm::tir::Stmt const&)+0x30) [0x7f9849db2eb0]
E             [bt] (3) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::NodeFunctor<tvm::tir::Stmt (tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)>::operator()(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*) const+0xf3) [0x7f9849db3153]
E             [bt] (2) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)#4}::__invoke(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)+0x13) [0x7f9849db4db3]
E             [bt] (1) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::LoopVectorizer::VisitStmt_(tvm::tir::ForNode const*)+0x1c3) [0x7f9849fd1553]
E             [bt] (0) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x32) [0x7f9849d4dd52]
E             File "/home/ubuntu/workplace/tvm/t1/tvm/src/tir/pass/vectorize_loop.cc", line 528
E           TVMError: Failed to vectorize loop with extent {n0|n0>=0}

Is this expected?

@anijain2305 (Contributor Author) commented:

Yizhi helped. Vectorize works only on loops with constant extents, so I added a split to give the vectorized loop a constant extent.
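
The constant-extent constraint can be illustrated without TVM. The sketch below is a plain-Python model, not the code from this PR: `split_extent` and `copy_with_split` are hypothetical names, and the factor of 16 is an arbitrary choice for illustration. It shows how splitting a loop whose extent `n` is only known at run time yields an inner loop with a fixed extent, which is the part a vectorizer can handle, plus a guard for the tail.

```python
def split_extent(n, factor):
    """Split a loop of extent n into (outer_extent, inner_extent).

    inner_extent is the constant `factor`; outer_extent is ceil(n / factor)
    and may depend on the runtime value of n.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (n + factor - 1) // factor, factor


def copy_with_split(src, factor=16):
    """Elementwise copy written as a split loop nest."""
    n = len(src)
    dst = [None] * n
    outer_extent, inner_extent = split_extent(n, factor)
    for outer in range(outer_extent):
        # The inner loop always runs `factor` iterations, so its extent is
        # constant even though n is not.
        for inner in range(inner_extent):
            i = outer * inner_extent + inner
            if i < n:  # tail guard for when n is not a multiple of factor
                dst[i] = src[i]
    return dst


data = list(range(37))
result = copy_with_split(data, factor=16)
print(split_extent(len(data), 16))
```

With 37 elements and a factor of 16, the outer loop runs 3 times and the guard skips the 11 out-of-range positions in the last tile. In TVM's schedule language the analogous step is to split the axis by a constant factor and vectorize the inner part; the actual change is in the PR diff and is not reproduced here.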

@anijain2305 changed the title from "[WIP] [TOPI][x86] Injective schedule improvement" to "[TOPI][x86] Injective schedule improvement" on Feb 4, 2020
@yzhliu (Member) commented Feb 4, 2020

Looks good to me. Thanks @anijain2305

@yzhliu merged commit 4a39e52 into apache:master on Feb 4, 2020
alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 26, 2020
* [TOPI][x86] Injective Schedule Improvement.

* Add tiling.

* Vectorize when there is an axis.
alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 28, 2020
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2020