cblmemo/tvm-async-rule-benchmark

See apache/tvm#14009 for more details.

Current Results

Tested under commit afbfb7aa7e43732cb716f8e443df696110be6afc.

Note: given the stochastic nature of evolutionary search, performance might become worse when this PR is enabled.

Workload: Conv2d NHWC

| Shape | Mainline TVM | Mainline TVM with Async | Performance Boost |
| --- | --- | --- | --- |
| N=1_H=224_W=224_C=3_K=64_R=7_S=7_STR=2_PAD=3_DIL=1 | 13838.05219 | 14687.89452 | 6.141343581679319% |
| N=1_H=56_W=56_C=64_K=64_R=1_S=1_STR=1_PAD=0_DIL=1 | 5398.305085 | 5613.892553 | 3.9936140067192905% |
| N=1_H=56_W=56_C=64_K=64_R=3_S=3_STR=1_PAD=1_DIL=1 | 11652.96825 | 13157.88249 | 12.91442839038028% |
| N=1_H=56_W=56_C=64_K=256_R=1_S=1_STR=1_PAD=0_DIL=1 | 10638.8309 | 11674.68499 | 9.736540600527816% |
| N=1_H=56_W=56_C=256_K=64_R=1_S=1_STR=1_PAD=0_DIL=1 | 8692.32829 | 9469.264089 | 8.938178277203573% |
| N=1_H=56_W=56_C=256_K=128_R=1_S=1_STR=2_PAD=0_DIL=1 | 4685.767442 | 5698.19634 | 21.606469175684712% |
| N=1_H=28_W=28_C=128_K=128_R=3_S=3_STR=1_PAD=1_DIL=1 | 9872.787087 | 10404.60405 | 5.38669535070061% |
| N=1_H=28_W=28_C=128_K=512_R=1_S=1_STR=1_PAD=0_DIL=1 | 9974.281496 | 10073.31657 | 0.9929043414276753% |
| N=1_H=28_W=28_C=512_K=128_R=1_S=1_STR=1_PAD=0_DIL=1 | 7075.866932 | 8564.572712 | 21.039199780135142% |
| N=1_H=28_W=28_C=512_K=256_R=1_S=1_STR=2_PAD=0_DIL=1 | 3648.330914 | 4021.923142 | 10.240086132713124% |
| N=1_H=14_W=14_C=256_K=256_R=3_S=3_STR=1_PAD=1_DIL=1 | 8192.954618 | 9160.182054 | 11.805599824451525% |
| N=1_H=14_W=14_C=256_K=1024_R=1_S=1_STR=1_PAD=0_DIL=1 | 8008.870153 | 9362.825279 | 16.90569456283206% |
| N=1_H=14_W=14_C=1024_K=256_R=1_S=1_STR=1_PAD=0_DIL=1 | 5210.062241 | 6051.208379 | 16.144646629759908% |
| N=1_H=14_W=14_C=1024_K=512_R=1_S=1_STR=2_PAD=0_DIL=1 | 2550.787202 | 3587.902938 | 40.65865373586739% |
| N=1_H=7_W=7_C=512_K=512_R=3_S=3_STR=1_PAD=1_DIL=1 | 4350.626084 | 5432.788068 | 24.873706981617943% |
| N=1_H=7_W=7_C=512_K=2048_R=1_S=1_STR=1_PAD=0_DIL=1 | 6672.068026 | 7663.725217 | 14.862815953549454% |
| N=1_H=7_W=7_C=2048_K=512_R=1_S=1_STR=1_PAD=0_DIL=1 | 3142.564263 | 4297.988014 | 36.766909259541826% |

Workload: GEMM NN

| Shape | Mainline TVM | Mainline TVM with Async | Performance Boost |
| --- | --- | --- | --- |
| M=512_N=256_K=640 | 8678.46 | 10607.37 | 22.226408832903555% |
| M=512_N=384_K=256 | 8109.13 | 10290.72 | 26.902886006267003% |
| M=512_N=512_K=512 | 11419.83 | 14000.86 | 22.601299669084398% |
| M=512_N=3072_K=768 | 19709.39 | 18351.61 | -6.8890006235606425% |
| M=512_N=768_K=3072 | 12844.59 | 13730.88 | 6.90010346768561% |
| M=896_N=896_K=896 | 16149.91 | 16131.39 | -0.11467556165947945% |
| M=1024_N=1024_K=1024 | 18842.11 | 19662.8 | 4.355616223448428% |
| M=1152_N=1152_K=1152 | 15386.79 | 16736.1 | 8.769275462913303% |
| M=1536_N=1536_K=1536 | 18522.67 | 18872.06 | 1.88628313304725% |
| M=2048_N=2048_K=2048 | 19515.42 | 18874.85 | -3.282378754851291% |
| M=3072_N=3072_K=3072 | 19233.9 | 19291.42 | 0.2990553137948975% |
| M=4096_N=4096_K=4096 | 17122.17 | 19259.01 | 12.479960191961652% |
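
The Performance Boost column in both tables appears to be the relative change of the async result over the mainline result, in percent. This is inferred from the numbers themselves and is not stated by the benchmark. A minimal sketch that reproduces two GEMM NN rows under that assumption (the function name `performance_boost` is hypothetical and not part of the benchmark scripts):

```python
def performance_boost(mainline: float, with_async: float) -> float:
    """Relative change of the async result over mainline, in percent.

    Assumed formula, inferred from the table values.
    """
    return (with_async - mainline) / mainline * 100.0


# Two rows copied from the GEMM NN table: (Mainline TVM, Mainline TVM with Async).
rows = {
    "M=512_N=256_K=640": (8678.46, 10607.37),
    "M=512_N=3072_K=768": (19709.39, 18351.61),
}

boosts = {shape: performance_boost(m, a) for shape, (m, a) in rows.items()}

for shape, boost in boosts.items():
    print(f"{shape}: {boost:+.2f}%")
```

A negative value means the async variant measured lower than mainline for that shape, which matches the note about the stochastic search.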
