Upstream dmlc 20190311 #13

wweic · 2019-03-12T00:41:04Z

git cherry-pick aaad5f988fa729d456ef546b49242794fdc7eabd..master
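The range `aaad5f988fa729d456ef546b49242794fdc7eabd..master` selects every commit reachable from `master` but not from the starting commit, so the starting commit itself is not replayed. Below is a minimal sketch of that behaviour in a throwaway repository. The file names, commit messages, identity and the extra local commit are made up for illustration; only the branch names `master` and `unstable` come from this PR.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
git config user.email "demo@example.com"  # hypothetical identity
git config user.name "Demo"
git config commit.gpgsign false           # avoid depending on a signing setup

# master history: base -> A -> B -> C
for name in base A B C; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done

# Pretend A was the last upstream commit already synced.
last_synced=$(git rev-parse master~2)

# "unstable" forked at A and has gained one local commit since.
git checkout -q -b unstable "$last_synced"
echo local > local.txt
git add local.txt
git commit -q -m "local"

# Replay everything after A, up to and including the tip of master.
git cherry-pick "$last_synced..master" >/dev/null

# Subjects of the commits on unstable after A, oldest first.
picked=$(git log --reverse --format=%s "$last_synced..unstable" | paste -sd' ' -)
echo "$picked"
```

In the sketch, A is excluded from the range, so only B and C are replayed on top of the local commit.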

wweic · 2019-03-12T02:34:49Z

CI completed with the same errors as before. @hcho3 @zhiics @yongwww

yongwww

LGTM

…generating (apache#5962) * Code migration Start (neo-ai#1) * Init commit: Code migration Start * Add loop_state.cc/h * Add ComputeDAG basic test * Split transform_step out & Update more UTs (neo-ai#3) * Split transform_step out * Update GetProducers & GetConsumers * Update UTs * Add UT for CacheReadWrite & Some bug fix * Add search_task, measure and serialization (neo-ai#4) * Add FollowSplit & FollowFusedSplit tests * Update dag.InferBound & its UT * Add search_task, measure and serialization * Update Serialization UT * Add MetaTileRewritePolicy (neo-ai#5) * Add feature * Add cost_model, meta_tile_rewrite_policy * Add MetaTileRewritePolicy basic UT * Basic Python API for State (neo-ai#6) * Add Basic Python API for State * Add UTs for State * Add Python API: Measure & Task (neo-ai#7) * Update the return value of state operation * Add task * Copy measure.py & utils.py * Fix LocalBuilder * Fix LocalRunner * Add ansor.auto_schedule() API; First AutoSchedule working version(neo-ai#8) * Add basic Python support for ansor.auto_schedule * Update AutoSchedule API * Bug fix for get the attach point of a fused iter * Update UT after infer bug fix * Bug fix & Add python serialization API (neo-ai#10) * Delete C++ UT hack since Python is ready * Add ndarray.non_empty * Update Serialization python API * Improve code style, python wrapper and test cases (neo-ai#11) * Update c++ code style and unit test * Update python State wrapper and test cases * fix unit tests * Add RPCRunner & OpenCL/CUDA test (neo-ai#12) * Add RPCRunner & OpenCL search test * Add CUDA search test * Add RPCRunner test * rebase to upstream/master * Add Ansor basic tutorial (neo-ai#13) * Add basic tutorial * migrate feature extraction (neo-ai#14) * Add XGBModel & RPCRunnerWarpper (neo-ai#15) * Add XGBModel & RPCRunnerWarpper * Revert "Add Parallel Granularity Mutation" * Migrate workload_registry.py (neo-ai#16) * add workload registry * update * update * add task scheduler (neo-ai#17) * Add conv2d cuda tutorial 
with workload registry (neo-ai#18) * add tune_test.py (the old tune_wkl.py) (neo-ai#19) * add tune_test.py (the old tune_wkl.py) * update * fix measure * fix for gpu * Code refine for tune_test.py & Add a pre load callback (neo-ai#20) * Bug fix for tutorials * Add PreLoadMeasuredStates * Add search_callback support for task tuner * Code refine for tune_test.py * Update * Update * Update * Update * Bug fix * Add python custom sketch rule (neo-ai#21) * Add custom sketch rule * Bug fix * Ansor Relay Integration (without layout rewrite) (neo-ai#22) * relay integration * Add tune_op_subgraph.py & Some code clean for tune_network.py (neo-ai#23) * Add single op tune scripts * Add tune subgraph support * Merge all op & all subgraph to one file * Rename file * add explicit_unroll_max_extent (neo-ai#25) * Add Index simplification & API update (neo-ai#26) * Add vectorized cooperative_fetching test * Update math simplify for vectorized CF * File rename * Update tune_network * API update * Update PreLoadMeasuredStates & Some bug fix (neo-ai#27) * Add a threading wrapper to fix the test bug * Set default TVM_USE_AUTO_SCHEDULER to false * Update PreLoadMeasuredStates callback * Add tensorize step for loop_state (neo-ai#31) * Add tensorize step * State python api update (neo-ai#33) * Start to update api * Add compute_dag to state * API update * kernel layout rewrite (neo-ai#28) * kernel layout rewrite * remove some hacks * add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass * set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite * [cache flush] port cache flush to ansor (neo-ai#32) * Improve relay integration (neo-ai#34) * tmp checkpoint * Improve relay integration * Improve relay integration * Fix xgb error & Simplify dispatcher (neo-ai#35) * Rename "MetaTileRewritePolicy" to "SketchPolicy". (neo-ai#36) * Rename "MetaTileRewritePolicy" to "SketchPolicy". 
* Add a new class for auto_unroll_max_step, storage_offset in StageNode * fix tune_op_subgraph.py * rebase * Migrate all node::make to noderef's construct function (neo-ai#37) * Start to move xxxnode::make to noderef() * Update * Update * Finish transform_step * Finish comute dag & auto schedule * Update * Update * Update * Update * Update * Code refine * Code refine * Code refine * Update * Update * Some lint fix & Recover the double constructor of tvm::PrimExpr (neo-ai#39) * lint fix * clang-format-fix * pylint fix * Update * Recover the double constructor of tvm::PrimExpr * Fix pylint * pylint fix * pylint fix * Add MutateComputeLocation and MutateParallel in evolutionary search (neo-ai#40) * Add MutateComputeLocation and MutateParallel in evolutionary search * fix lint * Improve loop state python API (stage_tensors -> stage_ops) (neo-ai#41) * improve loop state python API (stage_tensors -> stage_ops) * fix * ComputeDAG bug fix & Add Custom TensorCore Matmul Example (neo-ai#42) * Bug Fix * Sample example of Custom TensorCore Matmul * Rever Commits, Start to build minimum Ansor system * Code clean for minimum Ansor system * Bug fix & Delete AccessAnalyzer * Delete attachmap & Code clean * Doc update Update statenode::stages from vector to Array * Headfile update & Python doc update * clang-format fix * pylint fix * Update * Doc update * Update * Bug fix after code merge to the new master * clang-format fix * Update * Update * Update std::vector to Array; Update verbosity setting; Some commemts addressed * std::vector->Array & std::string->String * Add init_state to ComputeDAG * Update * Update some unordered_map to Map * clang-format fix * Comments addressed Delete ReplayAndInferBound Delete ReplaySteps & InferBoundCommon * Lint fix * Update * Update * Update * Update * Update * Update * Update * Update * Update * Rename ansor namespace to auto_schedule * Update * Rename ThreadPool to ParallelFor * Add parallel_for * Remove ThreadPool * Update 
python/tvm/auto_schedule/auto_schedule.py * trigger CI Co-authored-by: Lianmin Zheng <[email protected]> Co-authored-by: Minmin Sun (孙敏敏) <[email protected]> Co-authored-by: Zhao Wu <[email protected]>

Duplicate the CompileEngine interface. Refactor the graph_runtime_codegen to invoke the new LowerTE pass More changes Things appear to be working Some tracing to get Relay code to flow through too. Disable some assertions as exp. Tweak printing for now Fix a few bugs: (neo-ai#13) 1. Don't add relay main function to list of lowered TIR functions 2. Don't skip visiting call to relay function in graph runtime codegen Remove debug prints. Start refactoring Split out shared data structures Fix implicit duplicate decl of IsDynamic Clean up handling of name + global prim fn Clean up the code and debug issue introduced by previous hack Clean up the debugging Do C++ lint clean up Update src/relay/backend/graph_executor_codegen.cc Co-authored-by: Chris Sullivan <[email protected]> Clean up handling of external functions Add more error messages More clean up Update src/runtime/graph_executor/graph_executor.cc Co-authored-by: Chris Sullivan <[email protected]> Update src/runtime/graph_executor/graph_executor.cc Co-authored-by: Chris Sullivan <[email protected]> Update src/relay/backend/te_compiler.h Co-authored-by: Haichen Shen <[email protected]> Update src/relay/backend/te_compiler.h Co-authored-by: Haichen Shen <[email protected]> Fix CR More CR Format Fix lowering path for C++ Fix tests Remove uncessary change Clean up a few more things CI fix Fix the default context Fix Fix broken test cases Update Fix WIP Clean up storage data structures WIP WIP Fix build errors Remove TVMLower Fix lint Lint again fix black Move UpdateMainWorkspaceSize into te_compiler.cc Fix link errors Formatting Change UpdateMainWorkspaceSize to return Map<String, FunctionInfo> Workaround for GCC 5 error caused by enums in maps (GCC 5 is on i386 CI) Testing how functions should be named Lint Change how function metadata is updated Attempt to update aot_executor_codegen to use new StaticMemoryPlan instead of storage_device_map Pass memory plan through LowerTE into UpdateMainWorkspaceSize so that we 
don't need to run GraphPlanMemory an extra time Fix return in UpdateMainWorkspaceSize Lint Try to fix UpdateMainWorkspaceSize Fix construction of static memory plan Clean up code while debugging Adding UpdateWorkspaceSize back Add closure + call to UpdateFunctionMetadata (WIP) UpdateFunctionMetadata builds; weird error with device ctx map though. Not sure if it came from this change or something else Add some debugging of UpdateMainWorkspaceSize Starting to move UpdateFunctionMetadata call to use process_fn infra UWhat target should be passed to UpdateFunctionMetadata? UpdateFunctionMetadata is not workinggg Added some comments about UpdateFunctionMetadata for Jared Fix the creation of function metadata Try another stab at cleaning up the information Fix Port StorageInfo and StaticMemoryPlan data structure (apache#8297) Restoring reshape opt Fix tests Caught a nasty typo from Lily, Map::Set does not mutate Format Disable stupid Google style warning Rebase cleanup Formatting Add docstring for storage info Black Post rebase fix Remove prints Disable assert that doesn't make sense for now Fix lint Add copying attrs from relay node to graph node; still need to figure out how to do this in the case of global vars Work with Lily to fix graph attrs Try to figure out where extra arguments are coming from; fix merge passes the profiling test Clean up Fix profile test Remove debugging Add attributes for BYOC uTVM case Format Dumb typo Another fix for byoc Format Fix last 3 failing tests Format Fix final two test cases Format Fix lint Fix again Fix Fix auto scheduler code Fix issue Address CR comment Format Co-authored-by: Jared Roesch <[email protected]>

jroesch and others added 30 commits March 11, 2019 17:33

Fix fusion bug when call symbol that is not an operator. (apache#2630)

421b927

[RUNTIME][NDArray] Allowing External Libraries to Subclass NDArrays (apache#2613)

f409b69

Fix pylint 2.2.2 gripes. (apache#2642)

a1c2c43

add MXNet converter for where operator for both NNVM and Relay (apache#2647)

9284d6e

[Quantization][RELAY] Add check against NCHWc ops in the quantization pass (apache#2646)

b84379a

* check in * fix typo * fix typo * change message * change message * typo * lint

Stop pylint complaining about useless import alias. (apache#2655)

8bb160a

Recent pylint warnings about import renames with no effect. Remove them.

Explicitly disable pylint warning subprocess-popen-preexec-fn (apache#2656)

3adb276

[RELAY][PASS]use attribute registration style in the mac count pass (apache#2645)

5876fc9

[Relay] fix anf for reference and pattern matching (apache#2637)

b199401

fix lint (apache#2649)

f1adf2c

[RELAY/OP] Gradient of relay level1 ops (apache#2633)

e0ec87d

Update community.rst

21f4f2d

[Relay] GNF (apache#2492)

239face

add committer (apache#2661)

0516127

[Relay/TOPI][OP] Add arange op in Relay and TOPI (apache#2621)

e457cd7

* Add arange op * Update docs * Fix bug * add sanity check in relay and mxnet frontend mapping * lint * nits * pylint * don't allow empty output from arange * Remove empty test for arange * Fix bug and update doc

Fix -Wreturn-std-move and -Wself-assign-overloaded (apache#2669)

d555b88

[Relay] add more function to prelude (apache#2660)

5f1e59d

[BUILD] Simplify after bind device type (apache#2670)

722bcc8

[Hybrid Script] Add max_num_threads (apache#2672)

16623e7

* i think it works for now? * fix lint * fix 2/3 compat * fix py2 again * fine, i gave up

fix (apache#2674)

f713ba0

[Relay] fix error in ANF (too agressively inline atomic expression and create free variable). (apache#2665)

526e692

Add CONCATENATION to tflite frontend, support Inception V3 (apache#2643)

8d66e4d

* Add CONCATENATION to tflite frontend * fix typo * Fix codestyle * Fix code style * simplify convert map * Update

[AUTOTVM][Bugfix] Fix history loader for heterogeneous execution

85dd805

[Graph Runtime] Run_individual for benchmarking individual layers (apache#2569)

a39f27a

REGION op removed from topi and added in darkent frontend (apache#2275)

1b70315

yolo reorg op for relay (apache#1941)

d85e780

[Relay] Ensure nested higher-order functions are treated correctly (apache#2676)

8332af8

[Relay] add more descriptive error for checked_type (apache#2652)

abad345

[Relay] Port param dict save/load from NNVM (apache#2620)

38794e1

add converter for MXNet slice in nnvm and relay (apache#2662)

d5f6064

ajtulloch and others added 17 commits March 11, 2019 17:37

Fix vmlal.s16 code generation for int8 x int8 -> int32 (apache#2748)

be89cc1

revert PR#2420 nms changes (apache#2747)

90197ba

[Relay][Quantization] Speed-aware quantization scheme improvement (apache#2723)

534818c

* [Relay][Quantization] Speed-aware quantization scheme improvement * Add comment * Add use_stop_fusion to qconfig * Update comment

[RUNTIME][OPENCL] set type_key even when platform is not available (apache#2741)

829c179

[DLPACK] fix flaky ctypes support (apache#2759)

274c401

Improvements to the conda build (apache#2742)

daf9e80

[COMMUNITY] @kevinthesun -> committer (apache#2760)

6f94a1a

[WIN] Fix a bug in find_llvm when specify llvm-config (apache#2758)

f197307

fix typo in backend interpreter (apache#2752)

0c343c2

[ARITH] Analyzer RewriteSimplifier: add/sub/mul/div/mod (apache#2722)

c8eb7d9

Add the new logical operators to the doc. (apache#2761)

c96bd9a

update relay python api doc (apache#2766)

2fb9f51

[Relay/TOPI][Frontend] Add tile and repeat operators in Relay and TOPI (apache#2720)

145698e

* tile and repeat operator added in rely * fix pylint * fix make warnings * comments addressed * fix lint error * comment addressed

[relay][frontend] TensorFlow saved model support (apache#2586)

d3a8aa9

* [relay][frontend] TensorFlow saved model support * Add Examples section * keep one copy of tensorflow_parser in relay

[Object Detection] Gluoncv SSD support on CPU (apache#2353)

5aa6faa

Implement flop support for int8 models (apache#2776)

0128af8

[Relay] Improve more operator mxnet frontend importer (apache#2772)

cc12f7d

wweic requested review from hcho3, yongwww and zhiics March 12, 2019 02:09

yongwww approved these changes Mar 12, 2019

wweic merged commit 717b019 into unstable Mar 12, 2019

wweic deleted the upstream-dmlc-20190311 branch March 12, 2019 03:23
