How to lower to matrix instructions? #569
Comments
xqdan changed the title from "How to lower to matrix instruction?" to "How to lower to matrix instructions?" on Oct 19, 2017
This is a planned feature that has yet to be officially announced, so please stay tuned.
I'm interested in this too. Do you have a rough timeline for developing this feature or the required infrastructure?
All the elements are in; what remains is mainly documentation and testing effort.
That's great!
Closing this for now; will update when more documentation gets in.
junrushao pushed a commit to junrushao/tvm that referenced this issue on Jan 6, 2022
junrushao added a commit to junrushao/tvm that referenced this issue on Jan 27, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)
[Meta Schedule][M3c] PostOrderApply (apache#486)
Fix Post Order Apply (apache#490)
[MetaSchedule] Relay Integration (apache#489)
[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)
Fix replay trace. (apache#493)
[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)
[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)
[Meta Schedule Refactor] Get child blocks (apache#500)
Read-at && Write-at (apache#497)
[M3c][Meta Schedule] Measure Callbacks (apache#498)
[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)
[MetaSchedule] Sample-Perfect-Tile (apache#501)
[MetaSchedule] TE Workloads (apache#502)
[TensorIR] GetProducer, GetConsumer (apache#506)
[MetaScheduleRefactor] Annotate&Unannotate (apache#505)
[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)
[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)
[Meta Schedule] Minor Fixes (apache#507)
[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)
[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)
[Meta Schedule] Add Helper Function & Minor Modification (apache#512)
[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513)
[Meta Schedule] Feature Extractor & Cost Model (apache#510)
Blockize & Tensorize (apache#514)
Layout Rewriting: Suggest-Index-Map (apache#520)
[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)
[Meta Schedule] Per-Store-Feature (apache#521)
Add traced schedule for blockize & tensorize (apache#526)
[Meta Schedule] Add XGBoost Model & Random Model (apache#519)
User-Interface: Tune-TIR (apache#525)
User-Interface: Tune-TE (apache#527)
[Minor] More logging on python (apache#528)
Get CUDA tuning working (apache#529)
[MetaSchedule] TensorRT BYOC (apache#518)
[BugFix] LocalBuilder API (apache#531)
[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)
[Bugfix] BuilderInput with default params (apache#532)
[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)
[Meta Schedule] Evolutionary Search (apache#522)
[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)
[Meta Schedule] Fix some bugs (apache#537)
Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)
[Meta Schedule] Tweak experiment scripts (apache#539)
[Meta Schedule] Initiate experiments on CUDA (apache#540)
[TIR][Schedule] Buffer transform (apache#523)
Auto Tensor Core (apache#524)
Working on Evo Search (apache#542)
[Meta Schedule] Add Replay Tuning Interface (apache#543)
Evolutionary Search on CPU (apache#544)
Misc improvement over the error message (apache#545)
[TIR][Schedule] Software pipelining (apache#533)
[Meta Schedule Refactor] fixing unit tests (apache#547)
[MetaSchedule] Mutator-Compute-Location (apache#548)
Misc Improvement of Evolutionary Search (apache#549)
Hotfix for software pipeline (apache#552)
Misc Improvement (apache#550)
[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)
Rule RFactor (apache#551)
[MemHammer] Rewrite Rules (apache#554)
[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)
[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)
[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)
[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)
Removing 2 unit tests for software pipelining (apache#562)
[MemHammer] Lower Pass + Unittests (apache#557)
Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)
Fix Sketch Generation Unittests (apache#565)
speed up VerifyGpuCode (apache#568)
[Performance Align] fixing codegen problems (apache#569)
[Meta schedule] improve search space (#1)
Hot fix for bound predicate (#3)
[Meta Schedule] Update Tune Relay (#4)
[Performance Align] fixing codegen problems (#5)
[PerfAlign] NRM & SFM on Raspi Aligned (#6)
[BugFix] Apply bound predicate directly to loops when possible (#12)
[BugFix] Fix CrossThreadReduction on CUDA (#13)
[MetaSchedule] Enable BertTuning with MetaScheduler (#11)
[Minor][MemHammer] Minor tweaks in code review (#14)
[Meta Schedule] Add customizable search space to PostOrderApply. (#16)
Fix cooperative fetching (#17)
Fixes for codegen (#18)
[Hotfix] A unittest (#19)
Fix for GRP sketch gen (#21)
Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (#20)
[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (#22)
[MemHammer][Refactor] Code Review (#15)
[Meta Schedule] Add Winograd Test for Customizable Search Space (#24)
Co-authored-by: Siyuan Feng
Co-authored-by: Bohan Hou
Co-authored-by: Hongyi Jin
Co-authored-by: Ruihang Lai
Co-authored-by: Junru Shao
Co-authored-by: Wuwei Lin
Co-authored-by: Sunghyun Park
Co-authored-by: Xiyou Zhou
Hzfengsy added a commit to Hzfengsy/tvm that referenced this issue on Feb 2, 2022
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue on Feb 12, 2022
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue on Feb 19, 2022
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue on Feb 19, 2022
junrushao added a commit to junrushao/tvm that referenced this issue on Feb 20, 2022
Same squashed commit message as the Jan 27, 2022 commit above, with these additional entries:
Import & Cache Mechanism (#26)
[BugFix] Fix Winograd Test Script (#25)
Add task extraction & caching (#27)
A few fixes for task extraction (#28)
junrushao
added a commit
to junrushao/tvm
that referenced
this issue
Feb 20, 2022
junrushao
added a commit
to junrushao/tvm
that referenced
this issue
Feb 20, 2022
zxybazh
added a commit
to zxybazh/tvm
that referenced
this issue
Feb 21, 2022
zxybazh
added a commit
to zxybazh/tvm
that referenced
this issue
Feb 21, 2022
zxybazh
added a commit
to zxybazh/tvm
that referenced
this issue
Feb 21, 2022
zxybazh
added a commit
to zxybazh/tvm
that referenced
this issue
Feb 22, 2022
zxybazh
added a commit
to zxybazh/tvm
that referenced
this issue
Feb 22, 2022
Hi all,
How can I lower tensor operations in TVM to matrix instructions, such as a 16x16 matrix/tensor multiply or add, rather than to two nested loops of scalar operations?
Thanks!
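To make the question concrete, here is a minimal sketch in plain Python of the two code shapes being contrasted: a scalar loop nest, and the same computation split into 16x16 tiles where each tile is handed to one block operation. It uses no TVM API. The names `matmul_scalar`, `matmul_tiled`, and `mma_16x16` are hypothetical, and `mma_16x16` only stands in for whatever matrix instruction the target hardware provides.

```python
# Conceptual illustration only: plain Python, not TVM code.
TILE = 16  # block size taken from the question (16x16)


def matmul_scalar(a, b, n):
    """Reference n x n matmul written as a scalar loop nest."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def mma_16x16(c, a, b, i0, j0, k0):
    """Hypothetical stand-in for a single hardware matrix instruction.

    Computes C[i0:i0+16, j0:j0+16] += A[i0:i0+16, k0:k0+16] @ B[k0:k0+16, j0:j0+16].
    On real hardware this whole body would be one instruction.
    """
    for i in range(TILE):
        for j in range(TILE):
            acc = 0.0
            for k in range(TILE):
                acc += a[i0 + i][k0 + k] * b[k0 + k][j0 + j]
            c[i0 + i][j0 + j] += acc


def matmul_tiled(a, b, n):
    """Same result, but the loops step over tiles and each tile is
    delegated to one block operation instead of scalar multiply-adds."""
    assert n % TILE == 0, "this sketch assumes n is a multiple of 16"
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, TILE):
        for j0 in range(0, n, TILE):
            for k0 in range(0, n, TILE):
                mma_16x16(c, a, b, i0, j0, k0)
    return c
```

Lowering to matrix instructions, in the sense asked here, means getting the compiler to emit the second shape, with the body of `mma_16x16` replaced by the hardware instruction.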