Upstream dmlc 20190311 #13

Merged
wweic merged 93 commits into unstable from upstream-dmlc-20190311
Mar 12, 2019

Conversation

@wweic wweic commented Mar 12, 2019

git cherry-pick aaad5f988fa729d456ef546b49242794fdc7eabd..master

jroesch and others added 30 commits March 11, 2019 17:33
… pass (apache#2646)

* check in

* fix typo

* fix typo

* change message

* change message

* typo

* lint
Recent pylint warnings about import renames with no effect. Remove them.
* Add arange op

* Update docs

* Fix bug

* add sanity check in relay and mxnet frontend mapping

* lint

* nits

* pylint

* don't allow empty output from arange

* Remove empty test for arange

* Fix bug and update doc
* i think it works for now?

* fix lint

* fix 2/3 compat

* fix py2 again

* fine, i gave up
* Add CONCATENATION to tflite frontend

* fix typo

* Fix codestyle

* Fix code style

* simplify convert map

* Update
ajtulloch and others added 17 commits March 11, 2019 17:37
…ache#2723)

* [Relay][Quantization] Speed-aware quantization scheme improvement

* Add comment

* Add use_stop_fusion to qconfig

* Update comment
apache#2720)

* tile and repeat operators added in relay

* fix pylint

* fix make warnings

* comments addressed

* fix lint error

* comment addressed
* [relay][frontend] TensorFlow saved model support

* Add Examples section

* keep one copy of tensorflow_parser in relay
@wweic wweic requested review from hcho3, yongwww and zhiics March 12, 2019 02:09
wweic commented Mar 12, 2019

CI completed with the same errors as before. @hcho3 @zhiics @yongwww

@yongwww yongwww left a comment

LGTM

@wweic wweic merged commit 717b019 into unstable Mar 12, 2019
@wweic wweic deleted the upstream-dmlc-20190311 branch March 12, 2019 03:23
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jul 27, 2020
…generating (apache#5962)

* Code migration Start (neo-ai#1)

* Init commit: Code migration Start

* Add loop_state.cc/h

* Add ComputeDAG basic test

* Split transform_step out & Update more UTs (neo-ai#3)

* Split transform_step out

* Update GetProducers & GetConsumers

* Update UTs

* Add UT for CacheReadWrite & Some bug fix

* Add search_task, measure and serialization (neo-ai#4)

* Add FollowSplit & FollowFusedSplit tests

* Update dag.InferBound & its UT

* Add search_task, measure and serialization

* Update Serialization UT

* Add MetaTileRewritePolicy (neo-ai#5)

* Add feature

* Add cost_model, meta_tile_rewrite_policy

* Add MetaTileRewritePolicy basic UT

* Basic Python API for State (neo-ai#6)

* Add Basic Python API for State

* Add UTs for State

* Add Python API: Measure & Task (neo-ai#7)

* Update the return value of state operation

* Add task

* Copy measure.py & utils.py

* Fix LocalBuilder

* Fix LocalRunner

* Add ansor.auto_schedule() API; First AutoSchedule working version(neo-ai#8)

* Add basic Python support for ansor.auto_schedule

* Update AutoSchedule API

* Bug fix for getting the attach point of a fused iter

* Update UT after infer bug fix

* Bug fix & Add python serialization API (neo-ai#10)

* Delete C++ UT hack since Python is ready

* Add ndarray.non_empty

* Update Serialization python API

* Improve code style, python wrapper and test cases (neo-ai#11)

* Update c++ code style and unit test

* Update python State wrapper and test cases

* fix unit tests

* Add RPCRunner & OpenCL/CUDA test (neo-ai#12)

* Add RPCRunner & OpenCL search test

* Add CUDA search test

* Add RPCRunner test

* rebase to upstream/master

* Add Ansor basic tutorial (neo-ai#13)

* Add basic tutorial

* migrate feature extraction (neo-ai#14)

* Add XGBModel & RPCRunnerWarpper (neo-ai#15)

* Add XGBModel & RPCRunnerWarpper

* Revert "Add Parallel Granularity Mutation"

* Migrate workload_registry.py (neo-ai#16)

* add workload registry

* update

* update

* add task scheduler (neo-ai#17)

* Add conv2d cuda tutorial with workload registry (neo-ai#18)

* add tune_test.py (the old tune_wkl.py) (neo-ai#19)

* add tune_test.py (the old tune_wkl.py)

* update

* fix measure

* fix for gpu

* Code refine for tune_test.py & Add a pre load callback (neo-ai#20)

* Bug fix for tutorials

* Add PreLoadMeasuredStates

* Add search_callback support for task tuner

* Code refine for tune_test.py

* Update

* Update

* Update

* Update

* Bug fix

* Add python custom sketch rule (neo-ai#21)

* Add custom sketch rule

* Bug fix

* Ansor Relay Integration (without layout rewrite) (neo-ai#22)

* relay integration

* Add tune_op_subgraph.py & Some code clean for tune_network.py (neo-ai#23)

* Add single op tune scripts

* Add tune subgraph support

* Merge all op & all subgraph to one file

* Rename file

* add explicit_unroll_max_extent (neo-ai#25)

* Add Index simplification & API update (neo-ai#26)

* Add vectorized cooperative_fetching test

* Update math simplify for vectorized CF

* File rename

* Update tune_network

* API update

* Update PreLoadMeasuredStates & Some bug fix (neo-ai#27)

* Add a threading wrapper to fix the test bug

* Set default TVM_USE_AUTO_SCHEDULER to false

* Update PreLoadMeasuredStates callback

* Add tensorize step for loop_state (neo-ai#31)

* Add tensorize step

* State python api update (neo-ai#33)

* Start to update api

* Add compute_dag to state

* API update

* kernel layout rewrite (neo-ai#28)

* kernel layout rewrite

* remove some hacks

* add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass

* set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite

* [cache flush] port cache flush to ansor (neo-ai#32)

* Improve relay integration (neo-ai#34)

* tmp checkpoint

* Improve relay integration

* Improve relay integration

* Fix xgb error & Simplify dispatcher (neo-ai#35)

* Rename "MetaTileRewritePolicy" to "SketchPolicy". (neo-ai#36)

* Rename "MetaTileRewritePolicy" to "SketchPolicy".

* Add a new class for auto_unroll_max_step, storage_offset in StageNode

* fix tune_op_subgraph.py

* rebase

* Migrate all node::make to noderef's construct function (neo-ai#37)

* Start to move xxxnode::make to noderef()

* Update

* Update

* Finish transform_step

* Finish compute dag & auto schedule

* Update

* Update

* Update

* Update

* Update

* Code refine

* Code refine

* Code refine

* Update

* Update

* Some lint fix & Recover the double constructor of tvm::PrimExpr (neo-ai#39)

* lint fix

* clang-format-fix

* pylint fix

* Update

* Recover the double constructor of tvm::PrimExpr

* Fix pylint

* pylint fix

* pylint fix

* Add MutateComputeLocation and MutateParallel in evolutionary search (neo-ai#40)

* Add MutateComputeLocation and MutateParallel in evolutionary search

* fix lint

* Improve loop state python API (stage_tensors -> stage_ops) (neo-ai#41)

* improve loop state python API (stage_tensors -> stage_ops)

* fix

* ComputeDAG bug fix & Add Custom TensorCore Matmul Example (neo-ai#42)

* Bug Fix

* Sample example of Custom TensorCore Matmul

* Revert commits, start to build minimum Ansor system

* Code clean for minimum Ansor system

* Bug fix & Delete AccessAnalyzer

* Delete attachmap & Code clean

* Doc update

Update statenode::stages from vector to Array

* Headfile update & Python doc update

* clang-format fix

* pylint fix

* Update

* Doc update

* Update

* Bug fix after code merge to the new master

* clang-format fix

* Update

* Update

* Update std::vector to Array; Update verbosity setting; Some comments addressed

* std::vector->Array & std::string->String

* Add init_state to ComputeDAG

* Update

* Update some unordered_map to Map

* clang-format fix

* Comments addressed
Delete ReplayAndInferBound
Delete ReplaySteps & InferBoundCommon

* Lint fix

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Rename ansor namespace to auto_schedule

* Update

* Rename ThreadPool to ParallelFor

* Add parallel_for

* Remove ThreadPool

* Update python/tvm/auto_schedule/auto_schedule.py

* trigger CI

Co-authored-by: Lianmin Zheng
Co-authored-by: Minmin Sun (孙敏敏)
Co-authored-by: Zhao Wu
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Aug 26, 2020
…generating (apache#5962)

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Aug 26, 2020
…generating (apache#5962)

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Sep 2, 2020
…generating (apache#5962)

trevor-m pushed a commit that referenced this pull request Sep 3, 2020
…generating (apache#5962)

* Code migration Start (#1)

* Init commit: Code migration Start

* Add loop_state.cc/h

* Add ComputeDAG basic test

* Split transform_step out & Update more UTs (#3)

* Split transform_step out

* Update GetProducers & GetConsumers

* Update UTs

* Add UT for CacheReadWrite & Some bug fix

* Add search_task, measure and serialization (#4)

* Add FollowSplit & FollowFusedSplit tests

* Update dag.InferBound & its UT

* Add search_task, measure and serialization

* Update Serialization UT

* Add MetaTileRewritePolicy (#5)

* Add feature

* Add cost_model, meta_tile_rewrite_policy

* Add MetaTileRewritePolicy basic UT

* Basic Python API for State (#6)

* Add Basic Python API for State

* Add UTs for State

* Add Python API: Measure & Task (#7)

* Update the return value of state operation

* Add task

* Copy measure.py & utils.py

* Fix LocalBuilder

* Fix LocalRunner

* Add ansor.auto_schedule() API; First AutoSchedule working version(#8)

* Add basic Python support for ansor.auto_schedule

* Update AutoSchedule API

* Bug fix for get the attach point of a fused iter

* Update UT after infer bug fix

* Bug fix & Add python serialization API (#10)

* Delete C++ UT hack since Python is ready

* Add ndarray.non_empty

* Update Serialization python API

* Improve code style, python wrapper and test cases (#11)

* Update c++ code style and unit test

* Update python State wrapper and test cases

* fix unit tests

* Add RPCRunner & OpenCL/CUDA test (#12)

* Add RPCRunner & OpenCL search test

* Add CUDA search test

* Add RPCRunner test

* rebase to upstream/master

* Add Ansor basic tutorial (#13)

* Add basic tutorial

* migrate feature extraction (#14)

* Add XGBModel & RPCRunnerWarpper (#15)

* Add XGBModel & RPCRunnerWarpper

* Revert "Add Parallel Granularity Mutation"

* Migrate workload_registry.py (#16)

* add workload registry

* update

* update

* add task scheduler (#17)

* Add conv2d cuda tutorial with workload registry (#18)

* add tune_test.py (the old tune_wkl.py) (#19)

* add tune_test.py (the old tune_wkl.py)

* update

* fix measure

* fix for gpu

* Code refine for tune_test.py & Add a pre load callback (#20)

* Bug fix for tutorials

* Add PreLoadMeasuredStates

* Add search_callback support for task tuner

* Code refine for tune_test.py

* Update

* Update

* Update

* Update

* Bug fix

* Add python custom sketch rule (#21)

* Add custom sketch rule

* Bug fix

* Ansor Relay Integration (without layout rewrite) (#22)

* relay integration

* Add tune_op_subgraph.py & Some code clean for tune_network.py (#23)

* Add single op tune scripts

* Add tune subgraph support

* Merge all op & all subgraph to one file

* Rename file

* add explicit_unroll_max_extent (#25)

* Add Index simplification & API update (#26)

* Add vectorized cooperative_fetching test

* Update math simplify for vectorized CF

* File rename

* Update tune_network

* API update

* Update PreLoadMeasuredStates & Some bug fix (#27)

* Add a threading wrapper to fix the test bug

* Set default TVM_USE_AUTO_SCHEDULER to false

* Update PreLoadMeasuredStates callback

* Add tensorize step for loop_state (#31)

* Add tensorize step

* State python api update (#33)

* Start to update api

* Add compute_dag to state

* API update

* kernel layout rewrite (#28)

* kernel layout rewrite

* remove some hacks

* add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass

* set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite

* [cache flush] port cache flush to ansor (#32)

* Improve relay integration (#34)

* tmp checkpoint

* Improve relay integration

* Improve relay integration

* Fix xgb error & Simplify dispatcher (#35)

* Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36)

* Rename "MetaTileRewritePolicy" to "SketchPolicy".

* Add a new class for auto_unroll_max_step, storage_offset in StageNode

* fix tune_op_subgraph.py

* rebase

* Migrate all node::make to noderef's construct function (#37)

* Start to move xxxnode::make to noderef()

* Update

* Update

* Finish transform_step

* Finish compute dag & auto schedule

* Update

* Update

* Update

* Update

* Update

* Code refine

* Code refine

* Code refine

* Update

* Update

* Some lint fix & Recover the double constructor of tvm::PrimExpr (#39)

* lint fix

* clang-format-fix

* pylint fix

* Update

* Recover the double constructor of tvm::PrimExpr

* Fix pylint

* pylint fix

* pylint fix

* Add MutateComputeLocation and MutateParallel in evolutionary search (#40)

* Add MutateComputeLocation and MutateParallel in evolutionary search

* fix lint

* Improve loop state python API (stage_tensors -> stage_ops) (#41)

* improve loop state python API (stage_tensors -> stage_ops)

* fix

* ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42)

* Bug Fix

* Sample example of Custom TensorCore Matmul

* Revert Commits, Start to build minimum Ansor system

* Code clean for minimum Ansor system

* Bug fix & Delete AccessAnalyzer

* Delete attachmap & Code clean

* Doc update

Update statenode::stages from vector to Array

* Header file update & Python doc update

* clang-format fix

* pylint fix

* Update

* Doc update

* Update

* Bug fix after code merge to the new master

* clang-format fix

* Update

* Update

* Update std::vector to Array; Update verbosity setting; Some comments addressed

* std::vector->Array & std::string->String

* Add init_state to ComputeDAG

* Update

* Update some unordered_map to Map

* clang-format fix

* Comments addressed
Delete ReplayAndInferBound
Delete ReplaySteps & InferBoundCommon

* Lint fix

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Rename ansor namespace to auto_schedule

* Update

* Rename ThreadPool to ParallelFor

* Add parallel_for

* Remove ThreadPool

* Update python/tvm/auto_schedule/auto_schedule.py

* trigger CI

Co-authored-by: Lianmin Zheng
Co-authored-by: Minmin Sun (孙敏敏)
Co-authored-by: Zhao Wu
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jul 13, 2021
Duplicate the CompileEngine interface.

Refactor the graph_runtime_codegen to invoke the new LowerTE pass

More changes

Things appear to be working

Some tracing to get Relay code to flow through too.

Disable some assertions as exp.

Tweak printing for now

Fix a few bugs: (neo-ai#13)

1. Don't add relay main function to list of lowered TIR functions
2. Don't skip visiting call to relay function in graph runtime codegen

Remove debug prints.

Start refactoring

Split out shared data structures

Fix implicit duplicate decl of IsDynamic

Clean up handling of name + global prim fn

Clean up the code and debug issue introduced by previous hack

Clean up the debugging

Do C++ lint clean up

Update src/relay/backend/graph_executor_codegen.cc

Co-authored-by: Chris Sullivan

Clean up handling of external functions

Add more error messages

More clean up

Update src/runtime/graph_executor/graph_executor.cc

Co-authored-by: Chris Sullivan

Update src/runtime/graph_executor/graph_executor.cc

Co-authored-by: Chris Sullivan

Update src/relay/backend/te_compiler.h

Co-authored-by: Haichen Shen

Update src/relay/backend/te_compiler.h

Co-authored-by: Haichen Shen

Fix

CR

More CR

Format

Fix lowering path for C++

Fix tests

Remove unnecessary change

Clean up a few more things

CI fix

Fix the default context

Fix

Fix broken test cases

Update

Fix

WIP

Clean up storage data structures

WIP

WIP

Fix build errors

Remove TVMLower

Fix lint

Lint again

fix black

Move UpdateMainWorkspaceSize into te_compiler.cc

Fix link errors

Formatting

Change UpdateMainWorkspaceSize to return Map<String, FunctionInfo>

Workaround for GCC 5 error caused by enums in maps (GCC 5 is on i386 CI)

Testing how functions should be named

Lint

Change how function metadata is updated

Attempt to update aot_executor_codegen to use new StaticMemoryPlan instead of storage_device_map

Pass memory plan through LowerTE into UpdateMainWorkspaceSize so that we don't need to run GraphPlanMemory an extra time

Fix return in UpdateMainWorkspaceSize

Lint

Try to fix UpdateMainWorkspaceSize

Fix construction of static memory plan

Clean up code while debugging

Adding UpdateWorkspaceSize back

Add closure + call to UpdateFunctionMetadata (WIP)

UpdateFunctionMetadata builds; weird error with device ctx map though. Not sure if it came from this change or something else

Add some debugging of UpdateMainWorkspaceSize

Starting to move UpdateFunctionMetadata call to use process_fn infra

What target should be passed to UpdateFunctionMetadata?

UpdateFunctionMetadata is not working

Added some comments about UpdateFunctionMetadata for Jared

Fix the creation of function metadata

Try another stab at cleaning up the information

Fix

Port StorageInfo and StaticMemoryPlan data structure (apache#8297)

Restoring reshape opt

Fix tests

Caught a nasty typo from Lily, Map::Set does not mutate

Format

Disable stupid Google style warning

Rebase cleanup

Formatting

Add docstring for storage info

Black

Post rebase fix

Remove prints

Disable assert that doesn't make sense for now

Fix lint

Add copying attrs from relay node to graph node; still need to figure out how to do this in the case of global vars

Work with Lily to fix graph attrs

Try to figure out where extra arguments are coming from; fix merge

passes the profiling test

Clean up

Fix profile test

Remove debugging

Add attributes for BYOC uTVM case

Format

Dumb typo

Another fix for byoc

Format

Fix last 3 failing tests

Format

Fix final two test cases

Format

Fix lint

Fix again

Fix

Fix auto scheduler code

Fix issue

Address CR comment

Format

Co-authored-by: Jared Roesch