Pattern Language, Matcher, Rewriter, and Function Partitioner #5231

mbrookhart · 2020-04-03T15:14:12Z

C++ Implementation for

https://discuss.tvm.ai/t/rfc-relay-program-matching-for-relay-pt-1-a-pattern-language/5833

Includes the language, a matcher with limited associative/commutative support, a pattern-based rewriter, and a pattern-based graph partitioner, along with some documentation, tests, and examples in the tests, such as fusing a batchnorm and writing an algebraic simplifier with the infrastructure.
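
To make the matcher/rewriter idea concrete, here is a toy sketch in plain Python. It is not the C++ implementation in this PR and does not use TVM's API: the tuple-based expressions and the names `WILDCARD`, `match`, `rewrite`, and `drop_zero` are all hypothetical and exist only for illustration. It shows a matcher that retries with swapped operands for commutative ops, and a bottom-up rewriter used as a one-rule algebraic simplifier (`x + 0 -> x`).

```python
# Toy model (an assumption for illustration, not TVM's data structures):
# binary expressions are ("op", lhs, rhs) tuples, variables are strings,
# and constants are numbers.

WILDCARD = object()  # pattern leaf that matches any subexpression
COMMUTATIVE = {"add", "multiply"}


def match(pattern, expr):
    """Return True if expr matches pattern; commutative ops also try swapped operands."""
    if pattern is WILDCARD:
        return True
    if isinstance(pattern, tuple):
        if not (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == pattern[0]):
            return False
        op, p_lhs, p_rhs = pattern
        _, e_lhs, e_rhs = expr
        if match(p_lhs, e_lhs) and match(p_rhs, e_rhs):
            return True
        # Limited commutative support: retry with the operands swapped.
        return op in COMMUTATIVE and match(p_lhs, e_rhs) and match(p_rhs, e_lhs)
    return pattern == expr  # non-wildcard leaves must be equal


def rewrite(pattern, callback, expr):
    """Rewrite children first, then apply callback wherever pattern matches."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite(pattern, callback, arg) for arg in expr[1:])
    return callback(expr) if match(pattern, expr) else expr


def drop_zero(expr):
    """Callback for the add-zero pattern: keep the operand that is not the zero."""
    _, lhs, rhs = expr
    return rhs if lhs == 0 else lhs


add_zero = ("add", WILDCARD, 0)
simplified = rewrite(add_zero, drop_zero, ("multiply", ("add", 0, "x"), ("add", "y", 0)))
```

Both `0 + x` and `y + 0` are simplified even though the pattern is written only as `wildcard + 0`, which is the point of handling commutativity in the matcher rather than in every pattern.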

jroesch · 2020-04-03T19:35:35Z

cc @jonso4 you asked and it is delivered 😆

jroesch · 2020-04-03T19:35:58Z

cc @zhiics and @icemelon9

src/relay/ir/dataflow_matcher.cc

tests/python/relay/test_df_pattern.py

include/tvm/relay/dataflow_functor.h

include/tvm/relay/dataflow_pattern.h


mbrookhart · 2020-04-15T20:44:07Z

Thank you for the detailed updates, @masahi ! I'm super grateful someone takes time to catch the little typos in the comments I miss.


masahi · 2020-04-15T21:22:31Z

> I'm super grateful someone takes time to catch the little typos in the comments I miss.

Catching typos takes zero effort, so no problem :)

Currently I'm reading the visit on the dominator pattern. Since I like the diamond problem (#1548), I'd like to ask: are arbitrary diamond shapes supported? What about a diamond nested within another diamond? The current fusion algo can deal with them.

mbrookhart · 2020-04-15T22:59:06Z

Yes! Fuzzy diamond matching works as long as all of the nodes between the parent and the child match the path pattern.

Just added this unit test, will upstream it with the refactor:

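    # Assumed setup, not shown in the original comment
    from tvm import relay
    from tvm.relay.dataflow_pattern import dominates, is_op, wildcard

    K_ELEMWISE = 0  # assumed value of the elementwise TOpPattern kind

    # Fuzzy path/nested Diamond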
    is_conv2d = is_op('nn.conv2d')(wildcard(), wildcard())
    is_unary_elemwise = (wildcard().has_attr("TOpPattern", K_ELEMWISE))(wildcard()) | is_op('add')(wildcard(), wildcard())
    reduction = is_op('add')(wildcard(), wildcard())
    diamond = dominates(is_conv2d, is_unary_elemwise, reduction)

    inp = relay.var('input')
    weight = relay.var('weight')
    conv2d = relay.op.nn.conv2d(inp, weight)
    relu = relay.op.nn.relu(conv2d)
    relu = relu + relu
    tanh = relay.op.tanh(relu)
    leaky_relu = relay.op.nn.leaky_relu(conv2d, alpha=0)
    out = tanh + leaky_relu

    assert diamond.match(out)
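
The dominator condition above can also be sketched without TVM. The following is a toy model in plain Python, not the matcher from this PR: `Node`, `matches_diamond`, and the predicate names are hypothetical, and it assumes that "dominates" means every branch walked upward from the child must pass only through path nodes and end at one shared parent node.

```python
class Node:
    """Toy dataflow node: an op name plus its input nodes (hypothetical, not relay.Expr)."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs


def matches_diamond(is_parent, is_path, is_child, out):
    """Check that `out` is a child whose every input branch reaches one shared parent
    while passing only through nodes accepted by `is_path`."""
    if not is_child(out):
        return False
    parents = set()

    def walk(node):
        if is_parent(node):
            parents.add(id(node))
            return True
        # A path node must have inputs, and all of them must lead back to the parent.
        return is_path(node) and bool(node.inputs) and all(walk(i) for i in node.inputs)

    return all(walk(i) for i in out.inputs) and len(parents) == 1


# Same shape as the unit test above: a nested diamond between conv2d and the final add.
inp, weight = Node("var"), Node("var")
conv2d = Node("conv2d", inp, weight)
relu = Node("relu", conv2d)
relu = Node("add", relu, relu)
tanh = Node("tanh", relu)
leaky_relu = Node("leaky_relu", conv2d)
out = Node("add", tanh, leaky_relu)

is_conv2d = lambda n: n.op == "conv2d"
is_elemwise_or_add = lambda n: n.op in {"relu", "tanh", "leaky_relu", "add"}
is_add = lambda n: n.op == "add"

nested_ok = matches_diamond(is_conv2d, is_elemwise_or_add, is_add, out)
# A branch that bypasses conv2d (here the raw input) breaks the dominance condition.
bypass_ok = matches_diamond(is_conv2d, is_elemwise_or_add, is_add, Node("add", tanh, inp))
```

The walk revisits shared subgraphs, so it is exponential in the worst case; a real implementation would memoize visited nodes.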

Any patterns in particular you want to see tested?

masahi · 2020-04-15T23:33:38Z

Nice!

> Any patterns in particular you want to see tested?

No, I don't know if these complicated patterns can come up in practice, but it is great to be future-proof :) Also it is a prereq if we want to replace the current fusion impl with a pattern matching based one.


tqchen · 2020-05-15T19:43:08Z

Thanks @mbrookhart @jroesch @mbaret @masahi @yzhliu !

* [TOPI,RELAY][TFLITE] Sparse to dense operator (#5447) * [Relay][Frontend][TFLite] Add parser support for shape and range Signed-off-by: Dhruva Ray <[email protected]> * [TOPI,RELAY][TFLITE] Sparse to dense operator Signed-off-by: Dhruva Ray <[email protected]> * use param name in documentation Signed-off-by: Dhruva Ray <[email protected]> * sphinx doc errors fixed Signed-off-by: Dhruva Ray <[email protected]> * incorporated review comments Signed-off-by: Dhruva Ray <[email protected]> * Missing a blank line... Signed-off-by: Dhruva Ray <[email protected]> * use get_tensor_expr Signed-off-by: Dhruva Ray <[email protected]> * Accidently removed this function in the rebase... Signed-off-by: Dhruva Ray <[email protected]> * support default value for default_value Signed-off-by: Dhruva Ray <[email protected]> * clang format fixes Signed-off-by: Dhruva Ray <[email protected]> * topi pylint fixes Signed-off-by: Dhruva Ray <[email protected]> * [Frontend][TFLite] Add parser support for shape and range (#5329) * [Relay][Frontend][TFLite] Add parser support for shape and range Signed-off-by: Dhruva Ray <[email protected]> * Incorporated review comments and used new functions Signed-off-by: Dhruva Ray <[email protected]> * Few cosmetic changes Signed-off-by: Dhruva Ray <[email protected]> * Removed an extra line added by rebase... Signed-off-by: Dhruva Ray <[email protected]> * [REFACTOR] Separate ArgTypeCode from DLDataTypeCode (#5730) We use a single enum(TypeCode) to represent ArgTypeCode and DLDataTypeCode. However, as we start to expand more data types, it is clear that argument type code(in the FFI convention) and data type code needs to evolve separately. So that we can add first class for data types without having changing the FFI ABI. This PR makes the distinction clear and refactored the code to separate the two. 
- [PY] Separate ArgTypeCode from DataTypeCode - [WEB] Separate ArgTypeCode from DataTypeCode - [JAVA] Separate ArgTypeCode from DataTypeCode * [ONNX]MaxRoiPool, Mod & Xor op support added (#5729) * ROCm: Add warp shuffles and enable reductions (#5727) Thank you @masahi and @wpan11nv for the feedback * Change 'delete's in Relay VM Instruction dtor to 'delete[]'s (#5735) * Fix reshape usage in ARM Winograd (#5732) * [TEST] Fix flaky topi/tests/python/test_topi_pooling.py:test_adaptive_pool (#5736) * Fix the values for test_fmod since it fails way too often otherwise (#5723) * fix small bug about dense_grad (#5695) * [REFACTOR][ARITH] Remove legacy compute_expr.h (#5738) Replaces most of the ComptuteReduce using foldl. * Add some docs on downstream consistency (#5742) https://github.com/apache/incubator-tvm/pull/5730#issuecomment-639567636 * sequential cpp test (#5745) * [REFACTOR][TE][TIR] Call::Halide => ProducerLoad, DSL/TIR decouple. (#5743) In the HalideIR's design, DSL components and IR are mixed together. For example, Call::Halide can containa reference to a function which is constructed in the tensor expression language. While this coupled design simplifies certain aspect of the DSL construction, it prevents the TIR to evolve as a clean standalone IR: - The additional tensor expression provided in the function is opaque to the IR and may become obsolete as we transform them. - The duplication of the information in the DSL tensor and IR makes it hard to design a stand-alone text format (when there are elements shared in the tensor expression and normal statements). This PR aims to clearly de-couple the TIR from high-level DSL structures(tensor expression), while still provide clear extensions to build DSLs on top of the TIR. We introduce a DataProducer as a base class for high level tensor expressions objects that produce data. 
We then introduce ProducerLoad to replace the Call::Halide usage, so that the Call node can always be self contained and used for low-level calls. The high-level tensor expression DSL can still generate a PrimExpr that contains a ProducerLoad. These PrimExprs contains fragments of information that can be combined together to generate a low-level TIR PrimFunc. We also state clearly that DataProducer **should not** appear in any TIR PrimFunc. Instead, the high-level DSL layer should lowered DataProducers to Buffers and TIR statements that produces these buffers. We can further provide verifications to validate such invariance. Changes: - Introduce DataProducer to serve as a base class for Tensor in tensor expressions. - Migrate use of Call::Halide to ProducerLoad - Migrate the other usages of Calls. We will also create follow-up PRs to migrate the remaining two DSL related IR nodes(Realize/Provide) to use the DataProducer. * Don't add cast for TF batch norm when type isn't changing (#5731) * [ARITH][BACKPORT-0.6] fix a min/max simplify bug (#5749) * fix a min/max simplify bug * fix cpplint * turn into oposite when c1val<0 and add more case * fix c1=0 Co-authored-by: xqdan <[email protected]> * [TOPI][Relay][OP] support dynamic NMS(Non Maximum Suppression), symbolic begin, end, and strides for strided_slice (#4312) * [TOPI][Relay][OP] Dynamic NMS and strided_slice * Incorporate comments * fix nnvm compatibility issues * fix InferCorrectLayout * Minor fix * fix for fuse * Workaround to pass batch_size into hybrid function to handle dynamic shape * Seperate rearrange * fix lint * fix ci, comments * change attr to Optional<T> * clang format * remove empty lines * partial ignore for end of strided_slice * pylint * add out_indices for gpu get_valid_counts * change to slice_mode * clang-format, fix comments * fix comment * change slice_mode to string * fix CI * update docstring Co-authored-by: Yao Wang <[email protected]> * Update dmlc_tvm_commit_id.txt * Update TRT 
Integration to reflect upstream changes * Sync submodules * Fix jenkinsfile * git-clang-format against origin/dev instead of origin/master * Fix formatting. * Remove is_empty in export_lib (used for old trt) * Disable test_forward_qnn_mobilenet_v2_net * Add Scatter to Topi/Relay/ONNX via hybrid script (#5619) * I can construct scatter but not embed it in a Relay Graph * working 1-4 dimension scatter * add scatter to ONNX fix lint * isolate tests to cpu backend * Fix i386 test * fix gpu tolerance * use elemwise_shape_func for scatter * fix incorrect rebase * [Minor][Test] Clean WASM environment before build (#5759) * [Bugfix] Fix reshape (#5739) * Fix reshape * fix doc warning * fix ci * address comments * [REFACTOR][TIR] Provide->ProducerStore, Realize->ProducerRealize. (#5750) This PR finishes up the final step for DSL/TIR de-coupling to refactor Provide/Realize to use the DataProducer. As in the case of ProducerLoad, ProducerStore/Realize are not supposed to appear in a valid TIR function and are only used by high-level DSLs as intermediate structures. 
* [Rust] Second stage of Rust Refactor (#5527) * Add tvm-rt crate * Backport changes from frontend branch * Format * Add ASF headers * Address self-code review * Replace with helper * Fix lint * Fix * Clean up repro debugging * WIP * Remove global resgistry to fix one memory issue * Fix * Format * Format * Update rust/tvm-rt/README.md Co-authored-by: Jason Knight <[email protected]> * Format * Duplicate TVM macros * Split macros * Restore old macro for old crates * Repair macros * Fix format * Format Co-authored-by: Jason Knight <[email protected]> * [topi] block sparse dense on cuda (#5746) * [Relay] Fix for recursive let (#5757) * Make let processing iterative * Try again * Fix pretty printer overflow * cleanup * fix lint * Fix text printer Co-authored-by: Jared Roesch <[email protected]> Co-authored-by: Jared Roesch <[email protected]> * [TOPI][RELAY][PYTORCH]Conv3d_transpose op support added (#5737) * [TOPI][RELAY][PYTORCH]Conv3d_transpose op support added * Test cases in topi/relay * conv3d_transpose_ncdhw_python added * Review comments fixed * Fix gelu in PyTorch frontend, tighten numerical checks (#5763) Previously, the PyTorch frontend approximated gelu with fastgelu. To provide a more faithful conversion, we implement gelu instead. We also tighten the numerical comparisons between PyTorch and TVM-from-PyTorch to 1e-5. The object detection models need an increased tolerance of 1e-4 to pass. I had to throw in a few fixes for missing conversions (probably due to working with very new PyTorch). I must admit the GoogLeNet/NasNet test didn't run on my machine, probably due to problems at my end. * Add ShapePattern and DataTypePattern (#5760) * Make batch matrix multiplication on GPU tunable (#5752) This is primarily aimed at the AMD GPU backend and done as part of a project for AMD, but should work for all users of the GPU schedule. * [TIR][REFACTOR][API-Change] Migrate the tvm/tir/expr.h to construct style. 
(#5773) This PR migrate tvm/tir/expr.h to the new constructor style that is consistent with the rest of the codebase and changes the affected files accordingly. * [TIR][REFACTOR][API-Change] Migrate tir/stmt.h to use constructor. (#5778) This PR migrate tvm/tir/stmt.h to the new constructor style that is consistent with the rest of the codebase and changes the affected files accordingly. * [Frontend][TensorFlow] Improve Control Flow and TensorArray (#5699) * Improve TF parser control flow and tensor array * Fix tf tensor array scatter * Add ssd test * Add back static ta test * Minor fix for frontend and test_forward * SplitRel for dynamic shape * Fix test ssd * Fix loop var naming issue * Minor improve * Fix format * Fix clang format * Fix tensor array in pytorch frontend * Fix stack size issue for ssd test * Address comments * Fix slice size * Fix build * Rebase * [DOC][FIX] Fix some typos in git-clang-format.sh (#5786) * fix #5686: remove a overstrict assert in MakeAllreduce (#5686) (#5785) * [RUNTIME] Add compile_shared option to linux compile utility fn (#5751) * feat: Add compile_shared option to linux compile fn * feat: Add compile_shared option for linux compile util fn * fix: Fix minrpc testcase use executable compilation * fix: Fix binutil case where call create_shared to create executable Co-authored-by: baoxinqi <[email protected]> * [REFACTOR][API-Change] Migrate all Object construction to constructor. (#5784) This PR migrates all the remaining object constructions to the new constructor style that is consistent with the rest of the codebase and changes the affected files accordingly. Other changes: - ThreadScope::make -> ThreadScope::Create - StorageScope::make -> StorageScope::Create * [Topi] pass-by-value -> pass-by-const-reference (#5783) * [topi][relay] Add operation gather to relay. 
(#5716) * [CODEGEN][CONTRIB] CoreML codegen (#5634) * [CODEGEN][CONTRIB] CoreML codegen * import coremltools only when it is necessary * fix pylint errors * don't import contrib.coreml when using runtime lib * skip coreml codegen test in CI * don't register relay.ext.coremlcompiler in __init__.py * move tvm/contrib/coreml.py to tvm/contrib/target/coreml.py * use existing transformers for graph partitioning * skip test only when coremltools is not available * add check for annotation * move _register_coreml_op to python/tvm/relay/op/contrib/coreml.py * skip compile when xcode is unavailable * relay.op.Op -> tvm.ir.Op * set USE_COREML on * refine test * fix calibration pass to support multiple functions (#5768) Co-authored-by: Ubuntu <[email protected]> * [cmake] update vulkan rules (#5777) * Add ignore storage_order attribute to onnx pooling parser. (#5781) * [BYOC][FIX] Infer types in MergeComposite (#5766) If InferType isn't run between partitioning passes, function calls are inserted which don't have a type. This can result in failures for patterns which want to check types. This works around it simply by running InferType after every partitioning. Change-Id: Ie0887f0564a41eb0913bfe42a362e8effe9681b9 * [FRONTEND]Darknet support batch size for yolo (#5688) Fix the issue reported in https://discuss.tvm.ai/t/yolov3-tiny-batch-input-test-failed/6796 * Update dmlc_tvm_commid_id.txt * Skip tflite test_forward_mediapipe_hand_landmark * Increase stack limit for failing tflite tests. Skip TF tests which require TF 1.x * [PYTORCH]aten::norm support added (#5776) * [TENSORFLOW]Conv3d Transpose OP added (#5775) * [TENSORFLOW]Conv3d Transpose OP added * Testcase updated, tf cpu supports only ndhwc * [TF] Support symbolic inputs of Fill (#5762) * [TF] Support symbolic inputs of Fill * Rebase and simplify. 
Value has been converted to constant if it is tf.Constant * [COMMUNITY] @wpan11nv -> Reviewer (#5790) * Edit onnx parser to infer values in post order (#5755) * edit onnx parser to infer values in post order to speed up onnx imports with many calls to infer_value * fix pylint * [TIR][REFACTOR] Cleanup unused classes (#5789) * Fix tf parser (#5794) * support aten::type_as in the pytorch frontend (#5787) * support aten::type_as in the pytorch frontend * use _convert_data_type to convert torch type to tvm type and add more types in the type_as test * [TIR][REFACTIR] Update TIR nodes std::string->String. (#5793) This PR updates the remaining TIR node's member to use String instead of std::string. * [TEST] Temporary disable fp16 type_as test for PyTorch Frontend (#5799) * [ONNX] Skip multiply with 1.0f constant for GEMM import (#5800) * [ONNX] Skip ADD inside Gemm op when vector is zero * [ONNX] Skip multiply with 1.0f constant for GEMM import * [TIR][REFACTOR] Add tir prefix to type keys (#5802) * [QUANTIZE] Add config switch for nn.dense layer type. (#5801) * [topi] fix sparse dense schedule on cuda (#5803) * Allow RPCWrappedFunc to rewrite runtime::String as std::string (#5796) * [topi] fix strategy for sparse dense cuda (#5782) * [CI] Move cpu-only frontend tests to a CPU stage (#5807) * [MXNET]conv3d and conv3d_transpose addedx (#5814) * Pin hand landmark network to version 0.7.4. (#5813) * Versions above 0.7.4 are broken due to changes in the quantization operations in the model, which are current not supported by TVM. Fixes #5774. 
* [CI] Limit number of threads in all jobs (#5815) * Update dmlc_tvm_commit_id.txt * Disable tensorflow.test_forward_sdd because stack limit of 100mb is exceeded by WellFormedChecker Co-authored-by: Samuel <[email protected]> Co-authored-by: ANSHUMAN TRIPATHY <[email protected]> Co-authored-by: wsl-inspur <[email protected]> Co-authored-by: Krzysztof Parzyszek <[email protected]> Co-authored-by: Matthew Brookhart <[email protected]> Co-authored-by: Mahesh Ambule <[email protected]> Co-authored-by: Tianqi Chen <[email protected]> Co-authored-by: Animesh Jain <[email protected]> Co-authored-by: Ubuntu <[email protected]> Co-authored-by: Thierry Moreau <[email protected]> Co-authored-by: tobe <[email protected]> Co-authored-by: Jared Roesch <[email protected]> Co-authored-by: Nick Hynes <[email protected]> Co-authored-by: Tang, Shizhi <[email protected]> Co-authored-by: Bohan Hou <[email protected]> Co-authored-by: Wei Pan <[email protected]> Co-authored-by: Tom Gall <[email protected]> Co-authored-by: MORITA Kazutaka <[email protected]> Co-authored-by: masahi <[email protected]> Co-authored-by: Haichen Shen <[email protected]> Co-authored-by: Ramana Radhakrishnan <[email protected]> Co-authored-by: Menooker <[email protected]> Co-authored-by: Josh Fromm <[email protected]> Co-authored-by: lixiaoquan <[email protected]> Co-authored-by: Li Xiaoquan <[email protected]> Co-authored-by: Candy <[email protected]> Co-authored-by: LiangLiu <[email protected]> Co-authored-by: lhutton1 <[email protected]> Co-authored-by: Giuseppe Rossini <[email protected]> Co-authored-by: Andrew Reusch <[email protected]> Co-authored-by: Liangfu Chen <[email protected]> Co-authored-by: Michal Piszczek <[email protected]> Co-authored-by: Zhi Chen <[email protected]> Co-authored-by: Zhi <[email protected]> Co-authored-by: Dhruva Ray <[email protected]> Co-authored-by: Liyong Zeng <[email protected]> Co-authored-by: Zeng Liyong <[email protected]> Co-authored-by: Yao Wang <[email protected]> 
Co-authored-by: windclarion <[email protected]> Co-authored-by: manupa-arm <[email protected]> Co-authored-by: Wuwei Lin <[email protected]> Co-authored-by: Yi Wang <[email protected]> Co-authored-by: Cody Yu <[email protected]> Co-authored-by: Junru Shao <[email protected]> Co-authored-by: mbaret <[email protected]> Co-authored-by: hlu1 <[email protected]> Co-authored-by: Philip Hyunsu Cho <[email protected]> Co-authored-by: Zhao Wu <[email protected]> Co-authored-by: Mei Ye <[email protected]> Co-authored-by: Neo Chien <[email protected]> Co-authored-by: notoraptor <[email protected]> Co-authored-by: Balint Cristian <[email protected]> Co-authored-by: Rand Xie <[email protected]> Co-authored-by: abergeron <[email protected]> Co-authored-by: Deepak <[email protected]> Co-authored-by: Prashant Sail <[email protected]> Co-authored-by: maheshambule <[email protected]> Co-authored-by: Thomas Viehmann <[email protected]> Co-authored-by: akosik-anyvision <[email protected]> Co-authored-by: handar423 <[email protected]> Co-authored-by: xqdan <[email protected]> Co-authored-by: xqdan <[email protected]> Co-authored-by: Yong Wu <[email protected]> Co-authored-by: Jason Knight <[email protected]> Co-authored-by: Zijing Gu <[email protected]> Co-authored-by: Jared Roesch <[email protected]> Co-authored-by: majiang31312 <[email protected]> Co-authored-by: wrongtest <[email protected]> Co-authored-by: baoxinqi <[email protected]> Co-authored-by: Yi-Hsiang (Sean) Lai <[email protected]> Co-authored-by: Ubuntu <[email protected]> Co-authored-by: Bing Xu <[email protected]> Co-authored-by: Leandro Nunes <[email protected]>

…5231)

The structure is similar to Relay's pattern matcher (apache/tvm#5231). The main difference is that the pattern types are adapted to be Relax-compatible. Relay pattern types, some less-used patterns (IfPattern), and df-topological patterns (DominatorPattern) are omitted (some of them will be added later). The implementation splits matching into two parts:

- **Match an Expression**: match an expression syntactically (`MatchExprPattern`, i.e., `DFPatternMatcher`);
- **Match a Graph**: match a graph (across multiple `VarBinding`s) topologically (`MatchGraphPattern`).
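To make the expression/graph split concrete, here is a minimal, self-contained Python sketch. It is not the Relax or Relay API: `Var`, `Call`, `Wildcard`, `CallPattern`, `match_expr`, and `match_graph` are all hypothetical names invented for illustration, and the "graph" is simply an ordered list of `(variable, value)` bindings. The sketch shows why a purely syntactic match is not enough once intermediate results are bound to variables.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    op: str
    args: Tuple[object, ...]


@dataclass(frozen=True)
class Wildcard:
    """Hypothetical pattern that matches any expression."""


@dataclass(frozen=True)
class CallPattern:
    """Hypothetical pattern: a call to `op` whose arguments match `args`."""
    op: str
    args: Tuple[object, ...]


def match_expr(pattern, expr) -> bool:
    """Syntactic match of a single expression tree against a pattern."""
    if isinstance(pattern, Wildcard):
        return True
    if isinstance(pattern, CallPattern):
        return (
            isinstance(expr, Call)
            and expr.op == pattern.op
            and len(expr.args) == len(pattern.args)
            and all(match_expr(p, a) for p, a in zip(pattern.args, expr.args))
        )
    return False


def match_graph(producer, consumer, bindings) -> List[Tuple[str, str]]:
    """Topological match across bindings.

    Returns (producer_var, consumer_var) name pairs where a binding whose
    value matches `consumer` uses a variable that was bound earlier to a
    value matching `producer`.
    """
    defs = {}
    matches = []
    for var, value in bindings:
        if isinstance(value, Call) and match_expr(consumer, value):
            for arg in value.args:
                if isinstance(arg, Var) and arg in defs and match_expr(producer, defs[arg]):
                    matches.append((arg.name, var.name))
        defs[var] = value
    return matches


# Each intermediate result is bound to its own variable, as in a dataflow block.
bindings = [
    (Var("lv0"), Call("conv2d", (Var("x"), Var("w")))),
    (Var("lv1"), Call("relu", (Var("lv0"),))),
    (Var("lv2"), Call("add", (Var("lv1"), Var("b")))),
]

conv = CallPattern("conv2d", (Wildcard(), Wildcard()))
relu = CallPattern("relu", (Wildcard(),))

# Expression-level: relu(<anything>) matches the second binding's value.
print(match_expr(relu, bindings[1][1]))
# A nested relu(conv2d(...)) pattern does not match syntactically, because
# the argument of relu is the variable lv0, not the conv2d call itself.
print(match_expr(CallPattern("relu", (conv,)), bindings[1][1]))
# Graph-level: following lv0 back to its binding recovers the conv2d -> relu edge.
print(match_graph(conv, relu, bindings))
```

In this toy setting, the nested pattern fails at the expression level but the producer/consumer pair is found at the graph level, which is the motivation for keeping the two kinds of matching separate.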

commit 5bf9c8acf12dfba9865ac9f8480341298131dec4 Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 16:10:16 2023 +0900 clean up commit 5506d92ed9a4c48c63f192ddcb576c9665d4ad5b Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 15:39:39 2023 +0900 link and run compiled cutlass code, result correct commit 81d39f84ebb1a7bcfe5c2fa9f97ce2130f932dbb Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 15:13:41 2023 +0900 compile generated cutlass code commit c2a68e14575c2711497347d5fc93d15b88c6c79b Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 07:47:31 2023 +0900 codegen working commit ba26344f85ebe43f88852c8c18b754bf03df1ce1 Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 19:41:47 2023 +0900 wip commit ed3ac6d632a4798e411573f30d1a090bc05a96fc Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 17:53:10 2023 +0900 wip commit 47e09e54a0d405a14a602d7a6d31c49399c5662f Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 17:32:58 2023 +0900 wip commit b9e5df768b188de3dda1ef0d0f3db3fd592535d9 Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 17:25:37 2023 +0900 copy codegen_c base function commit fe20e653ecf548f07432f06cd17395b554e6faa5 Author: Masahiro Masuda <[email protected]> Date: Sat Jan 14 08:43:57 2023 +0900 add cutlass stub commit 990eec78b58ca259bc067bb32e4020f28d88b7c8 Author: Masahiro Masuda <[email protected]> Date: Sat Jan 14 08:18:57 2023 +0900 updated cutlass revision commit 591a8f1ba62d9f8e923f2dcc1702e7e7590e92e2 Author: Masahiro Masuda <[email protected]> Date: Sat Jan 14 08:02:01 2023 +0900 conv2d + relu DNNL offload works commit 1365402079626eab5bf99bad96dbfa4abd750175 Author: Masahiro Masuda <[email protected]> Date: Fri Jan 13 16:35:49 2023 +0900 starting DNNL codegen commit 4a72e7810b0df31a4fb13856b5b6320ced4e978e Author: Masahiro Masuda <[email protected]> Date: Thu Jan 12 14:02:19 2023 +0900 clean up commit 61cc55e94123f3064e0d1200c70f33b4a537c4ad Author: Masahiro 
Masuda <[email protected]> Date: Tue Jan 10 16:26:31 2023 +0900 pattern based partitioning working commit 2433733c5458302cbe05e534d6c99bec13fb6d36 Author: Masahiro Masuda <[email protected]> Date: Tue Jan 10 08:30:20 2023 +0900 add conv2d match & run test commit 360429440acb7068fdfd982d597523ebe032eb20 Author: Ruihang Lai <[email protected]> Date: Mon Jan 9 17:20:05 2023 -0500 [Op][O2e] Indexing and datatype operators (#338) commit e45bdb73824d120bb3b848d4fdaa54f88211b509 Author: Tianqi Chen <[email protected]> Date: Mon Jan 9 14:59:26 2023 -0500 [VM] Supporting "compiled" exec mode. (#331) * [VM] Supporting "compiled" exec mode. This PR adds support of "compiled" mode to the VM. The compiled mode translate the relax function into TIR function and drive it through the TIR function. It is different from the micro AOT codegen, which generate TIR code that targets the micro C runtime environment and useful for resource limited settings with smaller set of features. Both leverages the low-level TIR build that is also shared with TensorIR. The current implementation targets full TVM (VM) runtime, that comes with PackedFunc, object, tuple, closure and all kinds of rich structure support. This also mean that we can leverage the full runtime support to handle things like allocation, dynamic shape, easy plugins and python interaction, which are not available in more limited runtime. The user directly use the same API to load the generated code regardless of compiled mode or bytecode. And just need to change one line ```python ex = relax.vm.build(mod, target, exec_mode="compiled") ``` Most of the codegen features are lifted before the codegen phase, so the overall implementation would be around 500 loc for each exec mode and can be further cut down with future introduction of PrimValue. The simplicity is thanks to the TVM runtime archiecture that allows us to compose things together in objects. 
The only difference is how the PackedFunc for high-level driving is provided. In the case of bytecode it is normal interpretation, and in the case of compiled mode it is TIR. It is a complete implementation. Unit test cases are added. All codegen build tests are updated to include both exec_modes and have passed locally. The only exception is that we skipped some special PackedFunc handling (printing), because it can be further simplified after we introduce PrimValue. Co-authored-by: Junru Shao <[email protected]> * Address review comments Co-authored-by: Junru Shao <[email protected]> commit 32c2bf74eda5ff9cb958e6d54a29c324d53f2869 Author: Ruihang Lai <[email protected]> Date: Mon Jan 9 13:45:14 2023 -0500 [Op][O2d] Manipulation operators (#337) As tracked by #332, this PR is the O2d milestone of the high-level operator introduction plan. This PR introduces a few manipulation operators: * broadcast_to * concat * expand_dims * flatten * permute_dims * reshape * split * squeeze These operators are all well-tested. 
commit b39d11a37c899a1625ecee0ffdacc5ef5444365f Author: Ruihang Lai <[email protected]> Date: Mon Jan 9 10:57:19 2023 -0500 [O2h] Neural network and linear algebra operators (#343) commit 1d6d897ec223cc07768e0382c3e21a196ffdfac8 Author: Ruihang Lai <[email protected]> Date: Sun Jan 8 20:21:50 2023 -0500 [O2g] Convolution, pooling and image operators (#341) commit 95f784ece1d61676b88b5455be3dab5e3ddbc75a Author: Ruihang Lai <[email protected]> Date: Sun Jan 8 16:53:10 2023 -0500 [Op][O2f] Set and searching operators (#339) commit be1c32d817bbbbd56329378d6d929dce79ecb0f8 Author: Siyuan Feng <[email protected]> Date: Mon Jan 9 03:38:20 2023 +0800 simple fix jupyter error reporting (#345) commit da11e4bf373349ce4142949099e29d11655aa88b Author: Siyuan Feng <[email protected]> Date: Sun Jan 8 23:09:22 2023 +0800 [TVMScript] Symbolic shape computing (#342) commit 80808fbf9a02480abf337b8a5edffe34c963feec Author: Ruihang Lai <[email protected]> Date: Sat Jan 7 18:31:00 2023 -0500 [Op][O2c] Creation operators (#336) commit 5efc8f7224f83766875e74669e139ec82119a504 Author: Ruihang Lai <[email protected]> Date: Sat Jan 7 11:14:23 2023 -0500 [TIR] Create Layout with specified axis dtype (apache/tvm#13663) (#340) commit ae71be06c8252c211642abb9d5b3e4583bdb6f6a Author: Ruihang Lai <[email protected]> Date: Fri Jan 6 16:41:18 2023 -0500 [Op][O2b] Statistical operators (#334) commit 8220df74e339cdb6dab38a803b80edc3cd6b92e2 Author: Ruihang Lai <[email protected]> Date: Thu Jan 5 18:31:48 2023 -0500 [Op][O1][O2a] Utility, arithmetic and comparison operators (#333) As tracked by #332, this PR is the kickoff part of high-level operator introduction in Relax. This PR is about the milestone O1 and O2a. Specifically, this PR * introduces some of common utility functions that the registration and StructInfo inference of each operator will often use. * introduces unary arithmetic operators: cos, log, negative, sigmoid, sin, sqrt, tanh. 
* refactors and introduces binary arithmetic operators: add, divide, floor_divide, multiply, subtract. * introduces binary comparative operators: equal, greater, greater_equal, less, less_equal, not_equal. These operators are well tested from three perspective: P1. the op getter can get correct op by name P2. their StructInfo inference result are as expected under all kinds of cases P3. Relax TVMScript parser can parse the scripts with the op inside For operators in O2a, most operators share almost the same StructInfo inference logic. Therefore, for tests in P2, in each category, not every op is tested in every case. For each case, it is good to have only part of op in this category tested. This is intended not to make overlarge testing file. commit f1cab0a05f05829c4c35e2a7e613bd69f2a17fae Author: Siyuan Feng <[email protected]> Date: Thu Jan 5 20:43:28 2023 +0800 [TVMScript] Ensure consistent struct info between assign lhs and rhs with sinfo annotation (#328) * [TVMScript] Ensure consistent struct info between assign lhs and rhs with sinfo annotation * fix * fix commit dc7072efe290d7e8c69d8e216311510981fc82e1 Author: Tianqi Chen <[email protected]> Date: Wed Jan 4 10:13:08 2023 -0500 [REFACTOR] Hide VM Impl, Improve execution logic. (#326) * [REFACTOR] Hide VM Impl, Improve execution logic. This PR refactors VM by hiding most of the VM implementations and improve the overall execution logic. - Unifies PackedFunc and Closure Table. - Update Closure mechanism to no longer depend on string. - Update VMMemoryLower to VMBuiltinLower to incorporate more VM intrinsic lowering, move some of the codegen intrinsic to this phase. - Allow directly pass in function index as VM instruction. * Address comment commit 2449d8c205f0b6e2c346132695b56039b07e9a10 Author: Steven S. 
Lyubomirsky <[email protected]> Date: Tue Jan 3 22:04:16 2023 -0500 [IR][ASTPrinter] Tweaks to AST printer's handling of struct info (#330) commit 2d352807090ba1b7e898fbdcb83d6d9427c762cf Author: Siyuan Feng <[email protected]> Date: Tue Jan 3 23:20:47 2023 +0800 [TVMScript] Enforce `I.DeclareFunc` to have function signature (#329) commit dcae50e836a0c2999f52d96a372fc7de584951f4 Author: Tianqi Chen <[email protected]> Date: Mon Jan 2 15:21:49 2023 -0500 [BACKEND] Refactor and introduce full match-cast support. (#324) * [BACKEND] Refactor and introduce full match-cast support. This PR refactors VMShapeLower to introduce full match-cast support that enables nested tuples, type checks at argument boundary and symbolic shape computation. Along the way we also refactors cleans up some of vm codegen logic and adding unit-tests for different stages. * address comments commit a36920bf672d22e1d31e1e6f81d0447fd7a55806 Author: Siyuan Feng <[email protected]> Date: Mon Jan 2 23:31:04 2023 +0800 [TVMScript] Fix empty TupleStructInfo (#327) commit 80710a826bda66532eeda978668ed157b471b186 Author: Tianqi Chen <[email protected]> Date: Fri Dec 30 15:57:50 2022 -0500 [CONTAINER] Hash/Equal/JSON support for ShapeTuple (#325) This PR add hash/equal/json support for shape tuple. commit 343a1e7e2174612031c70ba8547577c7d21839e4 Author: Tianqi Chen <[email protected]> Date: Thu Dec 29 18:33:17 2022 -0500 [REFACTOR] StructInfo M3: MatchShape=>MatchCast (#323) * Introduce match cast, and code changes along * add match_cast parser support (#9) * Match cast support for VMShapeLower CanonicalizeBinding * Remove `match_shape` (#12) * Refactor ExprVisitor/Mutator to consider Expr in StructInfo. 
Co-authored-by: Siyuan Feng <[email protected]> commit e332285559d61db1c5033b8d50cd9d4af6c6b6f4 Author: Tianqi Chen <[email protected]> Date: Thu Dec 29 01:28:09 2022 -0500 [REFACTOR] StructInfo M2: Cleanups on legacy shape related items (#320) * [REFACTOR] Remove shape function * [WIP] Remove shape_, runtime_dep shape * Remove shape_ pass Compile * Remove RuntimeDepShape (#11) * BlockBuilder: remove CanProveShapeEqual, consolidate binding emit to EmitNormalize * Remove DimType, make get_shape_of API different from op.shape_of Changes the init importing to direct import so the VSCode nagivator can directly jump to the defintion point. * Apply suggestions from code review Co-authored-by: Ruihang Lai <[email protected]> * Clarify cases where struct info can be determinstically derived * Fix remaining testcases * Remove InferShape/Type per comment. Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> commit edadf247551f526188c0a08b3812ffc0a1f9d8bd Author: Ruihang Lai <[email protected]> Date: Fri Dec 23 14:46:07 2022 -0500 [Analysis] Optionally check structure info in well-formedness check (#321) With the introduction of structure info in #314, the well-formedness check will report malformed whenever an Expr doesn’t have defined structure info. However, when writing tests for well-formedness check and normalizer, usually we will manually construct the Exprs, which means their structure info are not defined most of the time. As a consequence, the well-formedness check will always complain “the Expr xxx doesn’t have structure info populated.” Therefore, when the checker fails to complain about the original reason of malformed, which means the checker is not working, the tests will still pass and we won’t be able to realize there is something wrong with the checker. Thus, in this PR we add an optional flag to the well-formedness check. 
In well-formedness tests, we will turn off the structure info check so that the original reason of being malformed will be revealed correctly. --- This PR also cleans up the DiagnosticContext parameter in the WellFormed API - the diag_ctx has been unused since the merge of #99. commit d548459a1736378398ab773dce413d90d49376cf Author: Ruihang Lai <[email protected]> Date: Fri Dec 23 07:33:25 2022 -0500 [Op] Enforce int64 output shape in CallTIR (#322) commit 10a87a455bbb84b0a0d20b22bd31784b9f4b9774 Author: Chaosfan <[email protected]> Date: Fri Dec 23 08:03:48 2022 +0800 [Bugfix] Handle function name properly in Relax TVMScript printer (#317) * remove relax_func_name_ and change logic * well_formed check for globalvar and gsymbol consistency * revise the logic in well_formed and update test * Remove `global_symbol` in test_function_attr.py * Update docs Co-authored-by: Ruihang Lai <[email protected]> commit 29aebb9d24cbf52ab21fd98996633534301ef34d Author: Tianqi Chen <[email protected]> Date: Wed Dec 21 20:21:57 2022 -0500 [REFACTOR] M1: Change parser/printer to only depend on struct info (#319) * [REFACTOR] StructInfo M1: Parser/printer/Var/Function to only depend on struct info field * Update src/relax/backend/vm/vm_shape_lower.cc Co-authored-by: Ruihang Lai <[email protected]> * Address comments * Allow function to have default value Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> commit e6173430f491c1d88d2ab77ce0ab43a8c602df30 Author: Tianqi Chen <[email protected]> Date: Wed Dec 21 00:42:29 2022 -0500 [REFACTOR][ARCH] Introduce StructInfo M0 (#314) * [IR] Introduce StructInfo * StructInfoFunctor and Analysis Support * [TVMScript] Parse type/shape annotation with StructInfo * remove runtime type assign * Remove type/shape during parsing (#2) * Normalizer prep: simple checks and legacy function renaming. * Struct info deduction in BlockBuilder. 
* Two TODOs * StructInfo Normalizer Fixes (#3) * StructInfo AST Fix * Fix Extern Func Deduction and shape mutator. * Update VoidStructInfo & globalvar (#4) * Fix passes and proper sinfo propagation. * Refactor EraseToWellDefined to Enable Remapping * [WIP] First stab at symbolic param tracking * Update EraseToWellDefined to support symbolic shape return (#5) * fix R.shape with ndim (#6) * Remove update shape/type * Address review comment, AnnotateTypeShape=>AnnotateStructInfo * Update include/tvm/script/ir_builder/relax/frame.h Co-authored-by: Ruihang Lai <[email protected]> * Address comments * Update printer to use structinfo (#7) * Update Error mechanism to prep for obj loc based reporting * Symbolic shape aware function call return value derivation. The main flow works as follows: - Match and populate shape_var_map and var_map by visit each pair of param and call arguments. - Call EraseToWellDefined to map the ret parameter to new result. * [ANALYSIS] Refactor well-form to only look at struct info. * Update comments according to reviews. * Update include/tvm/relax/struct_info.h Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Tianqi Chen <tqchen> Co-authored-by: Ruihang Lai <[email protected]> commit 151701740fac3a53b35799a82c85d86f91b720ee Author: Tianqi Chen <[email protected]> Date: Fri Dec 16 17:48:26 2022 -0500 Update relay_translator.py commit ad0f3179a84b3bc167f91c3eb082cb996b1d04e2 Author: Ruihang Lai <[email protected]> Date: Fri Dec 16 17:37:00 2022 -0500 [Translator] Remove global symbol and follow-up fix for #262 (#316) This PR removes the `global_symbol` linkage added by Relay Translator. It also fixes unaddressed comments of #262. All tests can pass locally and I believe it is safe to merge this PR directly. 
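For readers unfamiliar with the symbolic-shape return derivation described in #314 above, the two-step flow (populate a shape-variable map from each parameter/argument pair, then map the declared return shape to a result) can be sketched in plain Python. This is a simplified illustration and not the TVM implementation: `bind_shape_vars` and `derive_ret_shape` are hypothetical names, shapes are modelled as tuples of ints and symbol names, and unbound symbols are erased per dimension to `None`, which is only an assumption about how `EraseToWellDefined` treats them.

```python
from typing import Dict, List, Optional, Sequence, Union

# An int is a static extent; a str is a symbolic shape variable such as "n".
Dim = Union[int, str]


def bind_shape_vars(
    param_shapes: Sequence[Sequence[Dim]],
    arg_shapes: Sequence[Sequence[int]],
) -> Dict[str, int]:
    """Step 1 (hypothetical helper): visit each (param, arg) pair and
    record the concrete extent bound to every symbolic variable."""
    shape_var_map: Dict[str, int] = {}
    for p_shape, a_shape in zip(param_shapes, arg_shapes):
        if len(p_shape) != len(a_shape):
            raise ValueError("rank mismatch between parameter and argument")
        for p_dim, a_dim in zip(p_shape, a_shape):
            if isinstance(p_dim, str):
                # The first occurrence binds the symbol; later ones must agree.
                if shape_var_map.setdefault(p_dim, a_dim) != a_dim:
                    raise ValueError(f"conflicting binding for {p_dim!r}")
            elif p_dim != a_dim:
                raise ValueError("static dimension mismatch")
    return shape_var_map


def derive_ret_shape(
    ret_shape: Sequence[Dim], shape_var_map: Dict[str, int]
) -> List[Optional[int]]:
    """Step 2 (hypothetical helper): substitute bound symbols into the
    declared return shape; symbols with no binding become None."""
    return [d if isinstance(d, int) else shape_var_map.get(d) for d in ret_shape]


# Usage: two parameters sharing the symbol "n", one introducing "m".
var_map = bind_shape_vars([("n", 4), ("n", "m")], [(2, 4), (2, 8)])
ret = derive_ret_shape(("n", "m", "k", 3), var_map)
```

Here `"k"` never appears in a parameter, so the sketch leaves that dimension unknown instead of leaking an undefined symbol into the caller's scope.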
commit 850deded1201001d833ac65991fb1a4c6509cb1b Author: Ruihang Lai <[email protected]> Date: Fri Dec 16 16:19:48 2022 -0500 [Translator] Support translating op calls with Tuple input (#262) Previously, when a Relay function contained a Call that directly used Tuples as arguments (the example below), ``` %25 = (%23, %24) /* ty=(Tensor[(1, 160), float32], Tensor[(1, 160), float32]) */; %26 = concatenate(%25, axis=-1) /* ty=Tensor[(1, 320), float32] */; ``` our Relay translator was unable to generate the corresponding CallTIR, because the translator always assumed that an argument of a Call maps to a single tensor (see the code snippet below: the translator directly passes the Relax variable `new_args[-1]` to function `te_tensors`, which translates a Var to a single tensor). https://github.com/tlc-pack/relax/blob/60e9a01cdfdd013945790fc03d5abad29b8a7c0b/python/tvm/relax/testing/relay_translator.py#L124 https://github.com/tlc-pack/relax/blob/60e9a01cdfdd013945790fc03d5abad29b8a7c0b/src/relax/ir/emit_te.h#L56-L61 But in fact, the Relax variable may correspond to a Tuple of tensors, which was not taken into consideration before. Such a case can lead to an error in `TETensor` when creating tensors. Therefore, this PR fixes the issue by examining the Relax variable before the tensor creation of Relay Call arguments. If an argument has shape Tuple and type TupleType, we break down the tuple variable and emit a TupleGetItem for each field, and meanwhile create a tensor for each field. commit 54a0ff551adb90937073675b4fb3d5439b814398 Author: Siyuan Feng <[email protected]> Date: Fri Dec 16 21:02:13 2022 +0800 Remove relax parser_v1 (#313) commit b363dd48aced8fb939880db8cf595ed65b7ecc77 Author: Steven S. 
Lyubomirsky <[email protected]> Date: Wed Dec 14 22:51:38 2022 -0500 [Debugging][Arch] Expose `shape_` fields for `TupleGetItem` and `If` nodes, fix AST printer accordingly (#311) * Make the shape of If and TupleGetItem nodes accessible in Python * Remove order-dependency from AST printer tests * Trailing whitespace commit 4bb01fe4eccdd59614cc264838a389b21dd40388 Author: Yuchen Jin <[email protected]> Date: Wed Dec 14 08:11:47 2022 -0800 [IR] Dedicated Relax Call, Constant, Tuple, TupleGetItem, If (#306) * relax.Constant. * Add callnode; * Tuple, tuplegetitem, If * mypy. * lint * rebase & fix printer. * rebase & remove virtual_device_ * address comments & leave todos. * address comments. * address comments. * tuple index. * type anno. commit 4cda8a5881fd4cd2473258b35244fc4129b6110c Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Dec 14 09:09:03 2022 -0500 [BlockBuilder][Refactor] Normalize nested `SeqExpr`s (#310) Co-authored-by: Ruihang Lai <[email protected]> commit 5aab150f322526c1a7bfe6cea0f4d7a7543a7f46 Author: Ruihang Lai <[email protected]> Date: Tue Dec 13 17:06:06 2022 -0500 [ExprMutator] No prologue in VisitWithNewScope when input is SeqExpr (#305) commit 0bf1f1b784f19298117e36016a2e522f58c143fc Author: Tianqi Chen <[email protected]> Date: Tue Dec 13 15:27:05 2022 -0500 [REFACTOR] Refactor BlockBuilder (#308) commit 28d598b6a7c55f95f8f9c2ccd5c860ba5451232d Author: Siyuan Feng <[email protected]> Date: Sun Dec 11 01:28:56 2022 +0800 [Normalizer] Combine Nearby Blocks in SeqExprs (#298) commit e152c50e368454afab75425fcb0863b1c328bf4c Author: Tianqi Chen <[email protected]> Date: Thu Dec 8 19:33:18 2022 -0500 [ARCH] Add VisitBinding second-level dispatcher in Expr type. (#301) commit fed6b8fc88b824ec68260417793447dbe524c4c3 Author: Yuchen Jin <[email protected]> Date: Wed Dec 7 16:55:40 2022 -0800 [Linkage] Cleanup global_symbol attachment and linkage. (#300) * Cleanup global_symbol attachment and linkage. 
* lint * Add global_symbol to the main function in translation. commit e0907d4fd03af1731310647d3d0547bdff2cfaf6 Author: Tianqi Chen <[email protected]> Date: Tue Dec 6 21:35:20 2022 -0500 [ARCH] Introduce NestedMsg to robustly handle nested-tuple analysis (#295) commit 2eb99975dc1b40b83db7dcbb96b748503dcb3319 Author: Siyuan Feng <[email protected]> Date: Mon Dec 5 21:57:21 2022 +0800 [TVMScript] Update script printer to enable roundtrip tests (#291) commit f8ab9890e14c2533c401969ebf11dd591beff592 Author: Hongyi Jin <[email protected]> Date: Sun Nov 27 09:59:26 2022 -0500 [RUNTIME] Correctly handling export_module when exporting modules of different type (#13489) commit 9009840e654a9900009f7776a19e26f29b1e3f85 Author: Steven S. Lyubomirsky <[email protected]> Date: Fri Dec 2 18:33:50 2022 -0500 [Debugging] Support PackedFuncType in the AST Printer (#289) commit bda0e42f05eaba657c40a850486e55c39924f3bf Author: Steven S. Lyubomirsky <[email protected]> Date: Fri Dec 2 18:31:39 2022 -0500 [IR][Bugfix] Improvements to the normalizer and well-formed checker (#288) commit d5fe87b21546995c7a88905bd04b4e944d28a0f4 Author: Yong Wu <[email protected]> Date: Thu Dec 1 20:00:38 2022 -0800 Enforce i64 index in ShapeExpr (#281) commit 9c9eb5585501a5da0f25ca38d7d3ac8269b6714c Author: Yuchen Jin <[email protected]> Date: Thu Dec 1 11:00:47 2022 -0800 [Parser] Register memory operators to new parser. (#279) commit 28c3f68cc51d2c22936c5496debcb8c2de54040b Author: Yong Wu <[email protected]> Date: Thu Dec 1 08:55:31 2022 -0800 [TVMScript] enable the closure test (#280) * [TVMScript] enable the closure tests. commit eb9d531b2565cdd000f46e5ecae2c45b9f589abe Author: Yuchen Jin <[email protected]> Date: Thu Dec 1 05:47:05 2022 -0800 [Normalizer] Enforce all Expr have checked_type_ invariance after normalization. (#287) commit 43f81ddf4afc2f4fdb214c9f994e844f53126cdb Author: Steven S. 
Lyubomirsky <[email protected]> Date: Mon Nov 21 19:25:43 2022 -0500 [Debugging][Bugfix] Debug printer improvements: Print `shape_` and `checked_type_` for all nodes and handle non-binding `MatchShape`s (#261) The initial AST printer only included the `shape_` and `checked_type_` fields for variables because of the potential for infinite recursion (`shape_` nodes can contain other expressions, which in turn have `shape_` nodes). This PR cuts off the potential recursion to allow for printing these fields for all Relax expressions, which should be more useful for debugging. This PR also fixes a bug: The AST printer previously did not handle `MatchShape` bindings that did not bind a new variable. commit 304048c33956dddb5027fec26541d57f903d8ca2 Author: YuchenJin <[email protected]> Date: Thu Nov 17 17:02:11 2022 -0800 Fix after rebase, and reorganize the TVMScript folder structure. Co-authored-by: Junru Shao <[email protected]> Co-authored-by: Siyuan Feng <[email protected]> commit e7277460f0a2c7c980be9323cdf7919dc38153e2 Author: Siyuan Feng <[email protected]> Date: Thu Nov 17 00:31:32 2022 +0800 [TVMScript] Switch to the new parser (#276) * [TVMScript] Support cross-function call for relax function This PR adds support for cross-function call for relax function, by declaring a function signature (i.e. an empty function that contains params and return type/shape but w/o body.) However, the PR meets the issue of block_builder shape deduction, which does not use function `ret_shape` to infer the shape of GlobalVar Calls. commit 7152175762613130e3ba647c77cc9818312a5b06 Author: Yuchen Jin <[email protected]> Date: Sat Nov 5 16:45:33 2022 -0500 [CI] Enable Mypy type checking for Relax; Fix typing errors to pass Mypy checking. 
(#270) commit 6f8f6da505b835345d7709d06bdfd8dddce7e85b Author: Lesheng Jin <[email protected]> Date: Thu Nov 3 08:16:35 2022 -0700 Introduce memory primitives (#255) Introduce the memory primitives, including `relax.memory.{alloc_storage, alloc_tensor, kill_storage, kill_tensor}`. commit 48b7c158cc01532f9019a2e615f2d94766a9464c Author: Siyuan Feng <[email protected]> Date: Thu Oct 20 08:30:47 2022 +0800 [TVMScript] Update Type Annotation Behavior of the Parser (#269) This commit changes the behavior of the parser to allow type annotations, as suggested by the community. The current behavior: - Use the more refined type/shape between user annotated and deduced type/shape. The updated behavior: - Always use user annotations - Only checks if the type/shape is valid. commit 5c3079bb6e1e4eeb4dc2d9b740facb2686c67519 Author: sung <[email protected]> Date: Mon Oct 17 19:07:01 2022 -0700 Reenable autotvm silencer; fix e2e_auto_tir.py; fix lint. Co-authored-by: YuchenJin <[email protected]> commit 85b81292626ab6f23caf2b61095a6f957b61b21c Author: sung <[email protected]> Date: Mon Oct 17 18:09:34 2022 -0700 Recover: [Bugfix] Couple of bug fixes to run TVM-gen code together with BYOC (#249) commit c46ae8566582f1fcd8fcda1479943d3abb95b3b0 Author: sung <[email protected]> Date: Mon Oct 17 17:16:01 2022 -0700 Recover: [Pass] Separate ApplyHistoryBest from tuning passes (#226) commit 83bc7cb144643d5823bf06220186528923835667 Author: Junru Shao <[email protected]> Date: Sun Oct 16 22:52:56 2022 -0700 Enable Hexagon tests commit f9f4f7904ec5468a725b2ba924a619a7c5ed4e43 Author: Junru Shao <[email protected]> Date: Sat Oct 15 15:25:56 2022 -0700 Recover dropped commits [TVMScript] B4: If branch support (#263) B8: Local Function Support (#258) [TVMScript] B3: Type annotation checks (#256) [TVMScript][Parser] B1: Dataflow block (#252) [TVMScript] B2: match shape support (#251) [TVMScript] B6/B7: Symbolic shape and var shadowing (#245) [TVMScript] B5: Support relax op (#244) [TVMScript] 
B0: Call_tir support (#243) enhance parser error reporting (#242) [TVMScript] A1: Relax Parser infra (#240) update ci image versions. (#241) [TVMScript] B2-4: TIR IRBuilder (#239) [TVMScript] A0: Relax IRBuilder infra (#235) [TVMScript] B5-6: TIR IRBuilder (#231) [TVMScript] B1: IRBuilder (#228) [TVMScript] New Parser: Part C (#218) [TVMScript] New Parser: Part A (#221) [TVMScript] New Parser: Part B (#217) Not recovered: [Pass] Separate ApplyHistoryBest from tuning passes (#226) [Bugfix] Couple of bug fixes to run TVM-gen code together with BYOC (#249) co-authored-by: Yuchen Jin <[email protected]> co-authored-by: Siyuan Feng <[email protected]> co-authored-by: Ruihang Lai <[email protected]> commit 65a53034bc0bee9877a1bdf363c2eadcde35f226 Author: Steven S. Lyubomirsky <[email protected]> Date: Thu Oct 13 23:06:55 2022 -0400 [Op][Debugging] Add `assert` operator (#260) It was brought up that Relay lacks an assert operator, so we may as well have one in Relax for debugging. One issue is that we can't name it "`assert`" because Python will treat it as a syntax error to have it as a field name for the "`relax`" module, i.e., `relax.assert` is a syntax error. Thus the op is named "`assert_op`," which is not ideal but serves its purpose. commit 71d96e6c0a314936fa49fd7bc1ea79069027ab12 Author: Yuchen Jin <[email protected]> Date: Wed Oct 12 05:07:33 2022 -0700 [Pass] Support Function and If in Normalize pass. (#268) * Support Function and If in Normalize pass. * Use structural equality for expr_memo_. * Change back to pointer equality for expr_memo_; Add more tests. * rebase. commit 312a344cdeec66b1330a80d34ca78556fb338e7c Author: Steven S. 
Lyubomirsky <[email protected]> Date: Tue Oct 11 18:25:29 2022 -0400 [Analysis] Expose analyses related to vars in Python (#265) Previously, analyses to gather up all variables, free variables, bound variables, all global variables, and all global variables that are called had been implemented in C++ but had not been exposed in Python or tested. This PR exposes these analyses and adds tests for them. Two further changes: * The analyses previously ignored variables bound in `MatchShape` nodes; these are now treated as bindings too. * `rec_global_vars` is renamed `called_global_vars`, since the analysis itself does not check recursion. commit 132702be7e7ed0256045d7a405e532c3d5beef6d Author: Steven S. Lyubomirsky <[email protected]> Date: Mon Oct 10 18:19:38 2022 -0400 [Expr] Allow annotating return shape on function nodes (#253) This PR adds a `ret_shape` field for specifying the shape of the function's return value. At present, we will not use this information, but by adding it into the AST, we will be able to parse the return shape and use it in the future. Parser V1 in this PR will just always list the `ret_shape` as `RuntimeDepShape`. commit 7276c9e2ee13a4754775491ca36a7aae2d55b827 Author: Steven S. Lyubomirsky <[email protected]> Date: Sat Sep 24 00:11:45 2022 -0400 [Bugfix][VM] Properly convert tensor inputs in `save_function` (#257) It was observed that closures saved using `save_function` would crash when used over RPC with the `time_evaluator`, whereas using `set_input` and `invoke_stateful` worked as normal. While I am not entirely sure why these failures happened over RPC only in `time_evaluator` (but not in other RPC trials), it became clear that `set_input` performs a conversion of input tensor values in `SetInputTensorWithIndex`, while `save_function` was not doing this. Adding this conversion fixed the observed bug. 
commit 7183c7ffbe896dd9b5f5742b62afe9c821dae682 Author: Josh Fromm <[email protected]> Date: Wed Sep 21 17:07:08 2022 -0700 [Call TIR] Fix bug when invoking call_tir with scalar values. (#254) This small PR changes a check in the tvmscript parser to support empty shape tuples which are used to represent scalars. I added a scalar addition test to make sure it works properly. commit 605ba8d1548efb90980f9b18ea94f1d53f9ec3ec Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Sep 14 17:27:03 2022 -0400 [Bugfix][Op] Register attributes for unique and print (#248) Attempting to use `dump_ast` on functions containing the operators `relax.unique` and `relax.print` previously crashed due to being unable to query their attributes' keys. It turned out that this was a problem with the operator attributes: They had not been registered on the Python side, so Python representation treated them as opaque TVM objects. This PR corrects this mistake. commit f4525dd8a3e61f572b50107555cef4b469c971f4 Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Sep 14 17:24:40 2022 -0400 [VM][Benchmarking] Add option for saving e2e results as CSV file (#247) This PR makes some small additions to the end-to-end AutoTIR script, namely eliminating a bug (it was incorrectly using the stateful API) and adding an option to save the test results as a CSV file for benchmarking purposes (the data can then be separately analyzed as needed). These changes also required a small extension to the save_function method in the VM, namely allowing it to take keyword arguments. commit f1ee4b6cd2c3ee0596cef6f5b7ff7e715fb4ae0d Author: Ruihang Lai <[email protected]> Date: Wed Sep 14 17:23:29 2022 -0400 [BugFix] Enable emit global MatchShape (#246) Fix an incorrect check which disables emitting global MatchShape outside a dataflow block and mistakenly enables emitting dataflow MatchShape outside a dataflow block. commit 0a7a0a9daf5f1a2fa06ee6cd6169a28d397821fa Author: Steven S. 
Lyubomirsky <[email protected]> Date: Thu Sep 8 09:49:05 2022 -0400 [Pass] Canonicalizing Bindings (#233) It may be useful for some passes to collapse chains of definitions, particularly after other compiler transformations that may reduce or simplify some expressions. This pass will take chains of definitions and replace references to later definitions with references to the original one. It works by checking `LookupBinding` for each var use-site and replacing the var with its definition if the definition was another var. (Note: This required updating `BlockBuilder` to also update its binding map for `MatchShape` nodes; that was arguably a bug.) Additionally, `MatchShape` bindings where the `LHS` and the `RHS` are guaranteed to match at compile time are canonicalized into ordinary `VarBinding`s. commit 7a6f91f7d4077eebf926aa1f19281404494b9362 Author: Prakalp Srivastava <[email protected]> Date: Thu Sep 1 07:02:57 2022 -0400 [Hexagon] Use uploaded path to load module. (#238) * Fixes a bug to use the uploaded file remote path for loading the module remotely. * Modifies the task_python_hexagon.sh script to only run passing tests on device. This is used by Jenkins CI. commit e50290140c204ae091e335b797a07f2f6567a163 Author: Lesheng Jin <[email protected]> Date: Thu Aug 18 21:51:35 2022 -0700 [Pass] New Python ExprVisitor/ExprMutator! (#190) Add decorators `visitor` and `mutator` to help users create `ExprVisitor` and `ExprMutator` in Python. Users can customize visit/rewrite/post-order-rewrite functions in Python. `PyExprVisitor` and `PyExprMutator` list the functions users can customize. 
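As a rough illustration of the binding canonicalization described in #233 above, the idea of collapsing chains of variable-to-variable definitions can be sketched without any TVM dependency. This is a minimal sketch under stated assumptions: `canonicalize_bindings` is a hypothetical name, bindings are modelled as `(var, value)` pairs, a `str` value stands for a reference to another variable, and a tuple `(op, *operands)` stands for any non-variable expression. The `lookup` dictionary plays the role that `LookupBinding` plays in the commit description.

```python
from typing import Dict, List, Tuple, Union

# A value is either another variable's name (str) or an opaque expression
# written as (op_name, operand, operand, ...). Both forms are assumptions
# made for this sketch only.
Value = Union[str, tuple]
Binding = Tuple[str, Value]


def canonicalize_bindings(bindings: List[Binding]) -> List[Binding]:
    """Rewrite every use of an alias so it refers to the original variable."""
    lookup: Dict[str, str] = {}  # alias -> original definition
    result: List[Binding] = []
    for var, value in bindings:
        if isinstance(value, str):
            # `var` is defined as another var: resolve through earlier aliases
            # so that chains such as z = y, y = x collapse to z = x.
            original = lookup.get(value, value)
            lookup[var] = original
            result.append((var, original))
        else:
            # Non-variable expression: rewrite each operand at its use-site.
            op, *operands = value
            rewritten = tuple(lookup.get(o, o) for o in operands)
            result.append((var, (op, *rewritten)))
    return result


# Usage: y and z are both aliases of x, so the add ends up using x twice.
example = [("x", ("input",)), ("y", "x"), ("z", "y"), ("w", ("add", "z", "y"))]
canonical = canonicalize_bindings(example)
```

The sketch keeps the now-unused alias bindings in place; removing them is left to a separate dead-code step, which is a design choice of this example and not a claim about the actual pass.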
commit 7313855476cc522bf3e8bdbe7a60b82cd725fe4c Author: Ruihang Lai <[email protected]> Date: Thu Aug 18 15:20:06 2022 -0400 [BugFix] Expose `relax.expr.Constant` to `relax.Constant` (#230) commit cdfd4e939f2d1e88c560a05d83ddf2f7afe70304 Author: Siyuan Feng <[email protected]> Date: Thu Aug 18 02:25:13 2022 +0800 [FIX] Fix windows build issue when allocating a dynamic array (#219) In the current codebase, kNumArgs is a runtime-dependent variable (i.e. its value depends on the input shape of Array). Allocating arrays with runtime values is not allowed during building on Windows (I'm surprised it can be compiled on Linux and macOS) commit 887762cd97686ae23a61609ca9ffc8d6a2c5178b Author: Yong Wu <[email protected]> Date: Mon Aug 15 08:00:31 2022 +0800 Update with rebase commit 5a23346bc437043b48866411e39dfcf066edda59 Author: Yuchen Jin <[email protected]> Date: Sun Aug 14 14:44:12 2022 -0700 [Bugfix][VM] Fix var binding to a ConstantNode; Force VM if.cond register to take an NDArray instead of POD. (#216) Fix the bug in #212. The cause of this bug is VM Codegen did not handle binding ConstantNode to variable (`x = relax.const([1, 2])`) and save the constant NDArray to the register. Previously the codegen only handles the case where ConstantNode as CallNode's arguments. Now it's fixed and unit test is added. Fix the bug in https://github.com/tlc-pack/relax/issues/214#issuecomment-1211411432, the issue was caused by the VM simply read the condition register of the If instruction, and expect it to be a POD int or bool. https://github.com/tlc-pack/relax/commit/811e877c289fa52f55886c8a3e8dce10ed84915f adds a `LoadScalarInt` function similar to the Relay VM to check the If.cond register stores an NDArray, and cast it to int_64. Since we haven't introduced PrimValue and PrimType (that represents POD values like int and bool) to the Relax language yet, let's enforce `If->cond` to be a Tensor (NDArray at runtime). 
commit 6c9d403503297a0d0e28318bafcba9fc9c99ae42 Author: Steven S. Lyubomirsky <[email protected]> Date: Fri Aug 12 13:53:28 2022 -0400 [VM][UX] Allow for saving closures to avoid extra dictionary lookups in timing trials (#208) This PR implements a function that allows for saving a `PackedFunc` in the VM's module that just calls an existing function with a specific set of arguments to address #179 and #178. The main use of this is for timing, to avoid some overhead in looking up functions. commit e172b40af31dc3384adbcf6e7b0bce7f31ce41ea Author: Jiawei Liu <[email protected]> Date: Thu Aug 11 19:55:57 2022 -0500 [Pass][UX] Statement rewriter for DataflowBlock (#210) - Implements a few APIs to quickly perform statement-level mutation: `add`/`remove_unused`/`remove_all_unused`/`replace_all_uses`. - Implemented `remove_all_unused` to remove dead statements inside `DataflowBlock` cc: @psrivas2 - Address minor issues (unnecessary headers and bad docstrings) in https://github.com/tlc-pack/relax/pull/163 commit 37791e0a5d4a495365fd647f2cecbed16f3a3785 Author: Jiawei Liu <[email protected]> Date: Thu Aug 11 13:50:56 2022 -0500 Clean warning messages by Clang and Pylint (#215) * refact: clean clang warning in relax * refact: fix pylint * fix cpplint and clangd suggestions * fix: no cpplint on virtual-override commit 0b00715dc634aa7f091e942a54a29ee9c802ccf9 Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Aug 10 11:47:37 2022 -0400 [VM][UX] Implement stateful API (#207) This PR implements the stateful API discussed in https://github.com/tlc-pack/relax/issues/179. It ensures that if you use `set_input` to set inputs, you must use `invoke_stateful` to run the function (otherwise failing) and must obtain the results using `get_output`. It handles nested tuple returns. commit ed7b77e040654582d1ab1b9535ebbc4da77da243 Author: Steven S. 
Lyubomirsky <[email protected]> Date: Tue Aug 9 17:07:52 2022 -0400 [Op][Debugging] Add a print operator (#201) * Attempt at adding a print operator * Fix the registration * Actually use the format string * Improve test * Fix comment placement * Improve the docstring for relax_print * Handle tuples too * Formatting :( * Correct commit message * Match attr name across Python and C++ * Make print variadic commit a9bd3053c1106d1926fce1dc5787fc8be27f3985 Author: Sunghyun Park <[email protected]> Date: Fri Aug 5 11:45:03 2022 -0400 [Pass] Implement legacy lowering pass that leverages relay op strategy (#189) This PR implements Relax Op lowering that leverages existing Relay Op Strategy (legacy). As ops like conv2d, matmul are relay-, relax- independent, this pass assumes that we can always find relay op equivalents for such relax ops and use their info to leverage the relay op strategy. commit 1a1bcf75d97b2e7e4f758b6cd08bd747b222ef36 Author: Sunghyun Park <[email protected]> Date: Thu Aug 4 17:56:17 2022 -0400 [Pass] Introduce metaschedule as a tuning pass (#188) This PR delivers MetaSchedule tuning as a tuning passes. We can either tune at IRModule level with relax.transform.MetaScheduleTuneIRMod or tune at primfunc level with relax.transform.MetaScheduleTuneTIR. commit 7144654633477ea0d2bff300ba753dc8bfdeae4d Author: Steven S. Lyubomirsky <[email protected]> Date: Thu Aug 4 14:34:10 2022 -0400 [Example][UX] Make the RPC timeout configurable in the `e2e_auto_tir` example (#186) Running the e2e_auto_tir example over RPC can run into issues due to timeouts because some models can take a long time to run on some machines. This PR makes the RPC timeout configurable to more easily address these issues. commit 81e565e5df90cfe12d22deb7b26845ea3aa13526 Author: Tianqi Chen <[email protected]> Date: Wed Aug 3 19:38:21 2022 -0400 Fix BlockBuilder Scope Recovery in Misuse (#199) This happens in interactive usecases. 
When function scope exit triggers an error, we need to recover the BlockBuilder.current properly so users can try again. commit 21b1e7dc35dc838214cd4b6f26fbc31492323b02 Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Aug 3 19:09:21 2022 -0400 [Testing][AST] Add a simple AST printer for debugging (#198) * Add ast printer * Print seq expr body * Match annotation field names to real AST * Handle call attrs and func ret types * Add more advanced test cases commit 89f55c8167a80b4b9c8751309b5db648fb4db047 Author: Jiawei Liu <[email protected]> Date: Wed Aug 3 09:59:47 2022 -0500 [UX] Adopt changes from tvm-main and render code with IPython.display (#192) Render code with IPython.display.HTML if possible to fix the ansi-escape 24-bit rendering issue in Colab. commit 0b52b558eb14b3f113a4b543c8f0a824baaa58bc Author: Jiawei Liu <[email protected]> Date: Mon Aug 1 11:59:24 2022 -0500 Dataflow Pattern Lang: Core Matching Features (#163) The structure is similar to Relay's pattern matcher (https://github.com/apache/tvm/pull/5231). The main difference is that those pattern types are adapted to be relax-compatible. Relay pattern types, some less used patterns (IfPattern) and df-topological patterns (DominatorPattern) are ignored (some of them will be brought in later). The implementation splits patterns into two parts: - **Match an Expression**: match an expression syntactically (`MatchExprPattern`, i.e., `DFPatternMatcher`); - **Match a Graph**: match a graph (across multiple `VarBinding`) topologically (`MatchGraphPattern`); commit 74371634e9a011e63650b734aba20546b016c524 Author: Jiawei Liu <[email protected]> Date: Tue Jul 26 20:06:25 2022 -0500 [UX] Highlight TVMScript with Pygments (#185) commit 15e54ef215950944ffd74858c12c30aabcb0dcce Author: Siyuan Feng <[email protected]> Date: Sat Jul 23 11:22:13 2022 +0800 [Pass] Enhance BindParams to take numpy dict as input (#184) commit cf2e3b97110c805597059c5ba8303a653417e080 Author: Steven S. 
Lyubomirsky <[email protected]> Date: Mon Jul 18 21:45:21 2022 -0400 [Bugfix][VM] Ensure set_input works over RPC by not returning an array of argument names (#183) Currently, attempting to use the VM's `set_input` method will fail over RPC because `set_input` calls `get_func_param_names`, which returns an array of parameter names. RPC does not support sending arrays. This PR corrects this issue by instead having `set_input` query the function arity and then query the argument names one by one, which is the approach taken by the Relay VM (accordingly, the names for the functions used to do this, `get_function_arity` and `get_function_param_name`, are taken from the Relay VM). This PR also adds a unit test over RPC on localhost. commit b0e57dbc0862499c3f2a7d91858354c41fcf5e95 Author: Yong Wu <[email protected]> Date: Fri Jul 15 11:50:29 2022 -0700 Fix after rebase commit 3494b7a47bf0f7c3219538b2e9064b825cf3258c Author: Sunghyun Park <[email protected]> Date: Mon Jul 18 00:38:41 2022 -0400 [Pass Infra] Tuning API serialization and database support (#168) * refactor tuning API to support serialization of Choice, Knob, Trace * Implement tuning api JSON database * Add comments * fix pylint * fix cpplint * reflect feedback * add minor comment for the future work commit 777549a6037cc97b698f53ed629cf65c33ae7eca Author: Siyuan Feng <[email protected]> Date: Mon Jul 18 00:05:14 2022 +0800 [Fix] fix windows build issue (#182) TVM_DEFINE_NOTNULLABLE_OBJECT_REF_METHODS is needed when we have a default-like constructor (e.g. (Span span = Span())) commit b81e6a9838f92ba412a0bd4951a46cc61a43a22d Author: Siyuan Feng <[email protected]> Date: Mon Jul 18 00:04:03 2022 +0800 fix print twice issue (#181) commit d4cc79ed664bbe34a4d9dab2923cd5a7a7c5b52c Author: Lesheng Jin <[email protected]> Date: Thu Jul 14 09:15:44 2022 -0700 [Pass] Python ExprMutatorBase/ExprMutator (#172) - Rewrite ExprFunctor in Python. New ExprMutatorBase and ExprMutator in Python. 
- Implement demo passes: RewriteFMA and FuseFMA with Python ExprMutator. - Expose some functions to ffi in block_builder.py commit 01cdc4d43258b1fb9dcc630f05f38f792e3bc513 Author: Prakalp Srivastava <[email protected]> Date: Tue Jul 12 19:25:51 2022 -0400 [VM] Deprecate API to save/load executable to file (#176) Executable `save_to_file` and `load_exec_from_file` API was used to save/load just the executable to/from file. This was confusing as it did not export the TensorIR kernels in the Relax Module, thus leading to bugs such as https://github.com/tlc-pack/relax/issues/175. Moreover, the API was only used in some tests, and not useful for end user. Deprecating this API to have a single uniform way of serializing/deserializing TVM IRModule using `export_library` and `tvm.runtime.load_module` API. commit 74b3d67e8ae74aed3446a5ae5a05b8f5586e2c3b Author: Yuchen Jin <[email protected]> Date: Fri Jul 1 09:31:30 2022 -0700 [Refactor] Generic dispatching for `IsBaseOf`; Simplify Type/Expr initializations; `relax` -> `R` in printer; Disallow local function in VMCodegen (#171) - Generic dispatching for `IsBaseOf`: `IsBaseOf` uses a bunch of if-else to check if the subtype relation between the base type and derived type, now it's changed to use a generic TypeFunctor to dispatch on the base class to do the check. - Simplify Type/Expr initializations: We had to write `RuntimeDepShape(Span()`), `ObjectType(Span())` to initialize several Types and Exprs, this is due to the `TVM_DEFINE_OBJECT_REF_METHODS` macro that sets the constructor with `= default`. By changing to use `TVM_DEFINE_NOTNULLABLE_OBJECT_REF_METHODS`, we can now just write `RuntimeDepShape()` without specifying an empty span. - `relax` -> `R` in printer: Change to print `R` rather than `relax` in TVMScript as the default behavior. This is consistent with our test cases and TIR convention: using `T` as shorthand. 
- Disallow generating code for local function in VMCodegen: these local functions should have been lifted in the lambda lifting pass before codegen. commit 8fdc3ba3eae0d1ffc535e240be251aaae5546eb8 Author: Prakalp Srivastava <[email protected]> Date: Thu Jun 30 15:14:40 2022 -0700 [Parser] Enable R.parser.pretty_print to print TIR PrimFunc (#174) This way we can have a uniform API to print IRModule, TensorIR function and Relax functions. commit ed0414540c9fbc063aa727cfc71bdee51a4bafdd Author: Prakalp Srivastava <[email protected]> Date: Wed Jun 29 08:20:17 2022 -0700 Update tests to use `set_input` for rpc calls. (#173) Fix relax-hexagon tests to use set_input api, which is the correct way to invoke a function over RPC. commit 1f962bda7a79d13fee1a4f9f4ad3ddde4f5467b2 Author: Sunghyun Park <[email protected]> Date: Tue Jun 28 20:49:33 2022 -0400 [BYOC][PASS] Prototype implementation of modular compilation w/ TensorRT (#164) This PR delivers the prototype of the followings: - Relax BYOC JSON codegen - Relax BYOC TensorRT codegen - Extension in Relax VM to support external modules - `RunCodegen` pass: run codegen for the annotated relax functions - Annotation (dispatch decision) will be done by earlier passes e.g., greedy heuristic, Collage - The generated runtime module and Codegen itself should be tvm object - Misc minor code improvement for other passes commit f25fe0c80670272582db3aa791901c7fa49fc59e Author: Prakalp Srivastava <[email protected]> Date: Tue Jun 28 12:47:07 2022 -0700 Run static/dynamic models over Hexagon using Relax VM RPC (#167) * Move Relax VM builtins to src/runtime. * This fixes a bug we encountered while loading the module for hexagon. Since it was building the minimal runtime it was missing definition of Relax VM builtins. * Mark Hexagon module as DSO exportable. 
* Load Relax VM Executable over RPC * Support allocation for shape heap on device Co-authored-by: Yuchen Jin <[email protected]> commit 25174be634b5e04f0468b48bd477f22b17e75f84 Author: Prakalp Srivastava <[email protected]> Date: Fri Jun 24 13:33:04 2022 -0700 [CI] Enable Hexagon CI in Jenkins. (#169) Running all Hexagon tests in simulator is very slow. So we only run Relax related hexagon tests `test_relax_integration.py`. This test file is empty right now and it would be populated as we push relax-hexagon related changes. commit 225aecdb5d7d33f2af048f3aef9c9a6ac758f4fd Author: Yuchen Jin <[email protected]> Date: Thu Jun 23 09:47:30 2022 -0700 [VM] Add set_input interface; Fix e2e tuning script. (#166) * Add set_input interface. * Address comments. commit 29a707cbd9be6e02dd8a3cd1961cfb53057eb51b Author: Lesheng Jin <[email protected]> Date: Thu Jun 16 09:07:45 2022 -0700 WellFormed Instrument (#165) * add conftest for test/python/relax * [Wellformed Check]: allow TupleType as Function parameters * move WellFromedInstrument to relax.ir.instrument * add header commit b4c3c4bb65b09db7c9b3ec114d6680d14f306d37 Author: Yong Wu <[email protected]> Date: Sat Jun 11 23:26:17 2022 -0700 Update after rebase commit 3c0e3c0ee08c78b17cc1ba0429727c199737403a Author: Yuchen Jin <[email protected]> Date: Sat Jun 11 18:42:29 2022 -0700 [Relay translator] Allow replacing default topi function with user-provided TIR PrimFunc. (#159) * Add replace_op_with_tir to translator. * came up with a better name * better doc. 
commit f250f93eed886dc2c3a1cb1f8a4ab2077c57080e Author: Yong Wu <[email protected]> Date: Sat Jun 11 15:20:21 2022 -0700 [Pass] Lambda Lifting (#99) commit b55fd31d4e11373b30a93f88412a3d6e2d21d3c1 Author: Siyuan Feng <[email protected]> Date: Tue Jun 7 10:07:17 2022 +0800 [E2E] End-to-End tuning e2e_script (#153) Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> commit d3f94e73ec7b9c9ac7b3675f962e9030e55fa603 Author: Prakalp Srivastava <[email protected]> Date: Thu Jun 2 08:19:18 2022 -0700 Fix shape lowering pass bug for non i64 dims. (#152) Prior to this change, VM Shape Lowering pass did not cast integer values to shape heap dtype (i64) which resulted in incorrect values when read from heap later. This PR adds a cast to i64 for such values. This also adds well-formed check to ensure shape dimensions are of integer types. commit 9cf777f48069d598eda276be0b9aabaf301acf0f Author: Yong Wu <[email protected]> Date: Wed Jun 1 17:52:40 2022 -0700 [Parser] Add FuncType support (#154) * [Parser] Add FuncType support * Address comments commit f99121d506df45870cd026e052f5b3c41d4bd982 Author: Sunghyun Park <[email protected]> Date: Wed Jun 1 09:01:40 2022 -0700 [PASS] Remove Unused Functions in IRModule (#151) commit a718e9f9e073ca0ea1790562254c09aaa863eaa4 Author: Sunghyun Park <[email protected]> Date: Tue May 31 15:15:28 2022 -0700 [Pass Infra] Tuning Pass API (#144) commit a485b7bdb45f8379daa45e8c923a47fd6871cbdf Author: Tianqi Chen <[email protected]> Date: Sun May 29 12:51:07 2022 -0400 [REFACTOR] Move TIR op kind analysis to relax as it is relax oriented (#155) This also keep TIR mostly independent from higher-level IR. 
commit abd20bdc9b87aa53e0c27e8c5c3fc195be5e8c91 Author: Siyuan Feng <[email protected]> Date: Sun May 29 23:31:05 2022 +0800 add test cases for FuseTIR (#156) commit de42ec3d5ae0f0304060460764619a5a16995a33 Author: Siyuan Feng <[email protected]> Date: Thu May 26 22:14:51 2022 +0800 [Pass] Relax Transform FuseTIR (#150) * [Pass] Relax Transform FuseTIR Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> commit 153d0cc8f2d39b23e63fcd6feaf9755a0eaf8c28 Author: Yuchen Jin <[email protected]> Date: Wed May 25 15:44:59 2022 -0700 [Mutator] Separate unnormalized-form and normal-form mutators (#148) commit dfa42c09a3087605e805526ab7db7b49d6752ca5 Author: Prakalp Srivastava <[email protected]> Date: Fri May 20 16:30:18 2022 -0700 Print/parse tir cast/max operations in Relax shape (#149) tir.cast and tir.max are commonly used operators in shape expression in Relax. These two operators often show up when importing Relay module with `Any` dims to Relax module. commit c7186fd44ad5865d84ac61fc2981a15c8af9be4c Author: Prakalp Srivastava <[email protected]> Date: Thu May 19 18:29:12 2022 -0700 Add support to import relay models with Any dim. (#146) Converts Relay Any dimension to symbolic dim in Relax. commit ef9cf6baba1c2f7215746459ad5a9193df6572c9 Author: Yuchen Jin <[email protected]> Date: Tue May 17 07:55:56 2022 -0700 Refactor shape lowering pass and Blockbuilder. 
(#145) commit 230def2284c21eaff520e58fa96a80313b6a7c8f Author: Yong Wu <[email protected]> Date: Fri May 13 14:30:05 2022 -0700 Support Closure (#140) commit 0e998988aabdeb8d913e2889eb5a9d72bee35ca2 Author: Lesheng Jin <[email protected]> Date: Thu May 12 17:13:15 2022 -0700 [Analysis] IRModule well-formed check (#142) commit 1bd4e685ffcc0c4b677af47ecc8609dbfacdfd9d Author: Yong Wu <[email protected]> Date: Wed May 11 09:31:13 2022 -0700 Change after rebase commit d0ad35b375449c7e067a1edada7502557a03dd26 Author: Siyuan Feng <[email protected]> Date: Tue May 10 08:44:22 2022 +0800 FuseOps for relax (#141) Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> commit ae7b5b79c40498203842b6c9193e91bcc1937bea Author: Prakalp Srivastava <[email protected]> Date: Wed May 4 20:52:16 2022 -0700 Add `relax.unique` operator in Relax. (#135) * Add Unique operator in Relax. This adds the functionality to register a packed function implementation of any operator using `FCallPacked` attribute. The relax operator would be lowered to a call to the registered packed function during codegen. For example, in this change relax.unique is lowered to `relax.run.unique` packed function which uses torch.unique under the hood. * Add support for integer constants in Relax VM. This adds serialization, deserialization, and print support for integer constants. commit 1ca18611ae59ab4d1667066ed9921690d2a5611c Author: Siyuan Feng <[email protected]> Date: Tue May 3 09:34:55 2022 +0800 Add ShapeType to ShapeExpr.checked_type during construction (#139) commit 6481d533ed259a080dede704f7443c4a2221a842 Author: Sunghyun Park <[email protected]> Date: Mon May 2 16:26:08 2022 -0700 Introduce Relax function attribute and drop name field in Relax function (#136) commit d735ebd719d89c804691b29ee0d881c785384fc6 Author: Yuchen Jin <[email protected]> Date: Sat Apr 30 18:45:14 2022 -0700 [BlockBuilder] Sub function call shape deduction: constant shape case. 
(#137) commit 10f8e56cbcb27beb373075e3c6e3a9728ffb5eb2 Author: Yuchen Jin <[email protected]> Date: Thu Apr 28 16:59:38 2022 -0700 [AST][Type] Introduce ObjectType; Infer the type of call_packed by type_args; Refactor InferType/InferShape. (#132) commit 7e2038a8b662659dd6ba2e2a86bedbc6c3891bfa Author: Yuchen Jin <[email protected]> Date: Mon Apr 25 17:20:19 2022 -0700 [AST][BlockBuilder] Normalize relax.Function; Refactor BlockBuilder to take optional input IRModule. (#133) commit f1eca6d74365c6b0665b64c86ececce86fd76df3 Author: Prakalp Srivastava <[email protected]> Date: Sun Apr 24 07:09:11 2022 -0700 [Printer][Parser] Modify Tensor annotation printing and parsing. (#128) commit 296876eaf1246ea7948c69d2111cfea2ca51ca0c Author: Lesheng Jin <[email protected]> Date: Fri Apr 22 08:05:13 2022 -0700 [Pass] Python pass decorator and ExprFunctor (#126) * Relax ExprFunctor in Python * fix the register bug * Expr_functor in relax * function/dataflowblock Pass in python * testcases * reformat * fix Tensor annotation() * add return type hint * type hint * new test * fix typo * remove memo commit 5199a206cc86cee9e43b0c8ddddf704acdc4b513 Author: Ruihang Lai <[email protected]> Date: Thu Apr 21 22:20:33 2022 +0800 [Relax][MS] Task extraction with proper weights (#129) * [Relax][MS] Task extraction with proper weights (hzfengsy#32) * Add a unit test * Update the deduplication mapping / Update the unit test * Update test for DummyDB reusing * Remove unnecessary args * Remove unused import commit badee2add6700f12671d3223e43875ca050f537a Author: Sunghyun Park <[email protected]> Date: Wed Apr 20 17:09:37 2022 -0700 [Relay Translator] Use OpStrategy for lowering (#130) * [Relay Translator] Use OpStrategy for lowering * Reflect feedback and fix lint issue * Consider contexts for PassContext, Target, .. 
for both pass application and lowering commit 4454563d240c547fb762cec770502b1e09b195f0 Author: Prakalp Srivastava <[email protected]> Date: Wed Apr 13 21:00:54 2022 -0700 Deprecate `[]` in favor `()` in Tensor annotation. (#123) commit fab2d95697f7eecce90cb0ba12db2457caf4f2e3 Author: Yong Wu <[email protected]> Date: Tue Apr 12 21:15:38 2022 -0700 Add tune_relax to integrate with task scheduler (#127) commit 39bab0d25f3e5bb48adf52534f2318149047f617 Author: Yong Wu <[email protected]> Date: Tue Apr 12 16:22:33 2022 -0700 Update autotir integration after rebase commit caae30f06d237c3aebd00290802122bbfdb2ae26 Author: Yuchen Jin <[email protected]> Date: Tue Apr 12 08:23:32 2022 -0700 [VM] Support sub function call and recursion. (#125) * Sub function call and recursion. * Address comment. commit e7c7c15972f6aa29f30a167a794db17f74a6bdeb Author: Ruihang Lai <[email protected]> Date: Tue Apr 12 14:18:32 2022 +0800 [VM] Copy constant tensors to device (#124) * [VM] Copy constants to device (Hzfengsy#24) * [VM] Copy constants to device * Add unit tests * Specify shape and dtype for constant TE tensors in EmitTE commit ef0a3e689b3896fd30a392d094beaa8d68b6de07 Author: Lesheng Jin <[email protected]> Date: Wed Apr 6 11:59:33 2022 -0700 DataflowBlockPass (#114) * add DataflowBlockPass * update fma_rewrite * drop the skip function * update test_fma_rewrite with DataflowBlockPass * fix the format * fix name * rewrite test in tvm script * add non-dataflow Vars check * add fail testcases * module->IRModule * add docstring to DataflowBlockNode * remove unused pattern * Transform Pass->DataflowBlock Pass * rename global var to global scope var * remove print stmt * reformat tests * add docstring to DataflowBlockMutator * fix filename * minor fix commit 2607f3b9112197045e773b0fc7ceb9ae57e844f8 Author: Yuchen Jin <[email protected]> Date: Mon Apr 4 19:59:30 2022 -0700 Remove type annotation from Var. 
(#121) commit 969ffb4302f35344524ef36e74325c0d5e427b76 Author: Prakalp Srivastava <[email protected]> Date: Mon Apr 4 08:33:43 2022 -0700 Add a new Expr to represent runtime dependent shapes. (#117) This can be used to represent runtime dependent shapes such as output of `unique` operator. Having explicit runtime dependent shape expression helps to distinguish the following two cases in AST - (1) shape has not been deduced (`shape_ = nullptr`), and (2) shape is runtime dependent. Previously both cases were mapped to `shape_ = nullptr`. commit 1e2a11f6326c9b3fd3807bbe5d97e4a20ce9dadd Author: Hongyi Jin <[email protected]> Date: Sun Apr 3 00:42:38 2022 +0800 [PASS] Fold constant & Bind Params (#113) * fold constant and bind params * fix test * format * format * format * address comments * format * address comment * address comment * format * fix type bug commit d441f1d0f2104b51287f9f29d9ec9f0e87f4b9d9 Author: Tianqi Chen <[email protected]> Date: Sat Apr 2 00:00:19 2022 -0400 Temporary remove function type deduction in normalizer. (#119) * Temporary remove function type deduction in normalizer. This PR temporary removes the function type deduction in normalizer to unblock some of the followup passes that needs to check function type equality. Function's checked_type_ are left as nullptr for now. We should followup to add function type deduction from annotations. * revert the normalizer skip for now * comment out parser assert for now commit 159f599248e3c6faf969198d4e7cf03c4f3f6c70 Author: Yuchen Jin <[email protected]> Date: Fri Apr 1 09:18:33 2022 -0700 [BlockBuilder] Deduce and fill shape/type for Expr in Normalize. (#116) commit 96c8bbc53286a0ca90ddcb92346156f23ab9efe3 Author: Yuchen Jin <[email protected]> Date: Wed Mar 30 11:46:50 2022 -0700 [CI] Enable GPU tests; Add AutoTIR cuda test. (#115) * Add gpu ci. * Update autotir gpu test. 
commit 1e5c2dac7b01f73c7e3e1a8b092eb0f2b6cc5e28 Author: Tianqi Chen <[email protected]> Date: Mon Mar 28 19:12:59 2022 -0400 [FIX] Fix structure equal hash for MatchShape (#112) The pattern field of the match shape can define variables; as a result, we need to add DefEqual and Hash here. Added a regression testcase. Lesson: we would benefit from more testcases with check_save_roundtrip checks (like this one) for more relax examples. Additional change: - Redirected TVMScript printer to be able to print relax fragments, which is useful for debugging. commit 8e466be1d1fa65b9df119e0563ef58c38e8562f2 Author: Siyuan Feng <[email protected]> Date: Tue Mar 29 01:30:07 2022 +0800 introduce blockbuilder call_te (#110) commit 6ff1614ac3c9e63ea5b615a072a1d26a197b58f9 Author: Siyuan Feng <[email protected]> Date: Sun Mar 27 00:02:53 2022 +0800 [FIX] fix structural_equal_hash (#107) * fix structural_equal_hash (cherry picked from commit e7e962634999739a32129378f61cc95f58335447) * address comment & pass the ci commit 31ed53c92192c74a3f55009e718b8ae0527ce078 Author: Yuchen Jin <[email protected]> Date: Fri Mar 25 10:49:00 2022 -0700 [Bugfix] Fix call_tir parsing bug (#109) * Fix call_tir parsing bug. * update. commit 3c7ff5a272d4b004b9b86b79e0f10c33635cea05 Author: Yuchen Jin <[email protected]> Date: Thu Mar 24 19:50:27 2022 -0700 [VM] Fix hardcoded device type in memory lowering (#106) * Add is_device field to attr. * Update. * Address comment. * update. * Update. 
commit 6bcdcf8d02809dbbafbbd9515ea7ada17bb00077 Author: Ruihang Lai <[email protected]> Date: Thu Mar 24 23:04:11 2022 +0800 [VM] Initialize VM through packed function (#101) commit cfc779e732933eb43cb0bca6448c51fac51dc39f Author: Yong Wu <[email protected]> Date: Tue Mar 22 19:44:37 2022 -0700 Fix after rebase commit c368324831d378033d9b0f6621f3ee3b366624e6 Author: Lesheng Jin <[email protected]> Date: Tue Mar 22 18:51:40 2022 -0700 Improve printer for DynTensorType and ShapeExpr (#97) * improve Printer for DynTensorType & ShapeExpr * add testcases commit a861f2eeadc3ded5a98aa2947a6b17f077e29dc2 Author: Ruihang Lai <[email protected]> Date: Tue Mar 22 23:16:33 2022 +0800 [VM][Refactor] Move VM files to TVM runtime directory (#98) commit d96806093e9ff50aaf4d46a89d1003f87385bf7e Author: Tianqi Chen <[email protected]> Date: Mon Mar 21 12:03:59 2022 -0400 [VM] Refactor and improve vm. (#96) * [VM] Refactor and improve vm. - Have a separate function for RunInstCall. - Cache func_index lookup by table to avoid repetitive lookup by str. - Move PackedFunc call arg stack to Frame to increase locality and avoid re-allocation in repetitive calls. - Make frame stack of unique_ptr to avoid frame re-allocation and copy during frame.resize. - Pass…
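
The "cache func_index lookup by table" item in the VM refactor commit above can be pictured with a small sketch. This is only an illustration of the general idea (resolve a name to an integer index once, then dispatch by index); the `FuncTable` class and its methods are hypothetical names invented for this example and are not the Relax VM's actual data structures.

```python
from typing import Callable, Dict, List


class FuncTable:
    """Hypothetical function table: names are resolved to indices once."""

    def __init__(self) -> None:
        self._funcs: List[Callable[..., object]] = []
        self._index_by_name: Dict[str, int] = {}

    def register(self, name: str, func: Callable[..., object]) -> int:
        """Store func and return its index; later calls use the index only."""
        if name in self._index_by_name:
            raise ValueError(f"function {name!r} is already registered")
        index = len(self._funcs)
        self._funcs.append(func)
        self._index_by_name[name] = index
        return index

    def lookup(self, name: str) -> int:
        """One-time string lookup, e.g. when an instruction is first decoded."""
        return self._index_by_name[name]

    def call(self, index: int, *args: object) -> object:
        """Hot path: list indexing instead of a string-keyed lookup."""
        return self._funcs[index](*args)


table = FuncTable()
add_index = table.register("add", lambda a, b: a + b)
mul_index = table.register("mul", lambda a, b: a * b)
```

A caller would resolve `table.lookup("add")` once and keep the integer around, so repeated calls go through `table.call(index, ...)` without touching the name again.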

commit 86f1cc147255da43569d331997591fb9994229fe Author: Masahiro Masuda <[email protected]> Date: Sat Jan 21 15:22:36 2023 +0900 properly handle binding order when remapping tuple output commit dc6f3184fa8a8a3e36cebf31109ef3b5e755152b Author: Masahiro Masuda <[email protected]> Date: Sat Jan 21 05:23:47 2023 +0900 Improve merging algorithm following MergeCompilerRegion commit cf8eefbf705f0e6ba248b0cc68b52200129bdc3b Author: Masahiro Masuda <[email protected]> Date: Fri Jan 20 19:51:44 2023 +0900 more update from upstream commit 50c2b8195ed85aff83437abec801a019dcb767e6 Author: Masahiro Masuda <[email protected]> Date: Fri Jan 20 19:18:13 2023 +0900 remove WrapCompositeFunction commit ff9de42e07f7e2ff0735afbeb2c9baa3243914e8 Author: Masahiro Masuda <[email protected]> Date: Fri Jan 20 19:17:23 2023 +0900 fix commit a5f2203ceae4c2037c3667b6e02495ba065b75a4 Author: Masahiro Masuda <[email protected]> Date: Fri Jan 20 14:14:08 2023 +0900 update from upstream commit 7cc344f0aface006835c0426d56aad2a62e4ef18 Author: Masahiro Masuda <[email protected]> Date: Thu Jan 19 19:54:58 2023 +0900 Add FuseCompositeFunctions pass commit 2348c8e4387730a3a3cf15eb61c0e193984dd42f Author: Masahiro Masuda <[email protected]> Date: Thu Jan 19 19:23:58 2023 +0900 clean std::vector<JSONGraphNodeEntry> commit e61b5daa13e93e21d2a8647042d6df6bc87adea2 Author: Masahiro Masuda <[email protected]> Date: Thu Jan 19 19:09:45 2023 +0900 update commit 7abb4bbec9f472624047ed5182740b9103829cb6 Author: Masahiro Masuda <[email protected]> Date: Thu Jan 19 08:01:04 2023 +0900 simplify RunCodegen interface commit 111c5512dd57d732c60a3205a2d948e3b9d4a1c0 Author: Masahiro Masuda <[email protected]> Date: Wed Jan 18 20:20:18 2023 +0900 compile all functions at once in cutlass to support caching commit 93c5bbe0acbd3c172ea2e0e3664d2c9bae2c810f Author: Masahiro Masuda <[email protected]> Date: Wed Jan 18 19:34:56 2023 +0900 send all extern functions to codegen in one go commit 
119dfdc6bf73c44a2f63a31eb15e402e1ab60eb7 Author: Masahiro Masuda <[email protected]> Date: Wed Jan 18 19:15:49 2023 +0900 refactor RunCodegen commit d4defec923ba808a054bac9381a6e599fe10cc89 Author: Masahiro Masuda <[email protected]> Date: Wed Jan 18 17:29:14 2023 +0900 extract attributes from relax func commit 4e5ef524883afbe242a0061092b2ef2e3de6b87c Author: Masahiro Masuda <[email protected]> Date: Wed Jan 18 16:09:04 2023 +0900 introduce contrib.cutlass.tune_relax_function to get annotations commit 8fe0c4611605908f051dd28262a1ca98dd369e3e Author: Masahiro Masuda <[email protected]> Date: Wed Jan 18 15:34:42 2023 +0900 thread through compile options in RunCodegen commit 6ea5190ee8a08218beaf5fd1b0d0c7773afd8788 Author: Masahiro Masuda <[email protected]> Date: Wed Jan 18 12:59:08 2023 +0900 Add WrapCompositeFunction pass commit 9fa6b44a7df2d515ee529820d2d322e858cdeae0 Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 22:03:03 2023 +0900 properly handle multiple patterns (not tested) commit 37715e03879eac8523a3b2e331dde40f4eb22e71 Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 21:52:04 2023 +0900 attach name to pattern commit 20e5ac0900656cb7769343291b17d84cb99dd87c Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 19:31:57 2023 +0900 clean up commit c0070146895d102ac2ee910afa56229077974828 Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 17:01:40 2023 +0900 (Rebase) Squashed commit of the following: commit 5bf9c8acf12dfba9865ac9f8480341298131dec4 Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 16:10:16 2023 +0900 clean up commit 5506d92ed9a4c48c63f192ddcb576c9665d4ad5b Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 15:39:39 2023 +0900 link and run compiled cutlass code, result correct commit 81d39f84ebb1a7bcfe5c2fa9f97ce2130f932dbb Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 15:13:41 2023 +0900 compile generated cutlass code commit c2a68e14575c2711497347d5fc93d15b88c6c79b 
Author: Masahiro Masuda <[email protected]> Date: Tue Jan 17 07:47:31 2023 +0900 codegen working commit ba26344f85ebe43f88852c8c18b754bf03df1ce1 Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 19:41:47 2023 +0900 wip commit ed3ac6d632a4798e411573f30d1a090bc05a96fc Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 17:53:10 2023 +0900 wip commit 47e09e54a0d405a14a602d7a6d31c49399c5662f Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 17:32:58 2023 +0900 wip commit b9e5df768b188de3dda1ef0d0f3db3fd592535d9 Author: Masahiro Masuda <[email protected]> Date: Mon Jan 16 17:25:37 2023 +0900 copy codegen_c base function commit fe20e653ecf548f07432f06cd17395b554e6faa5 Author: Masahiro Masuda <[email protected]> Date: Sat Jan 14 08:43:57 2023 +0900 add cutlass stub commit 990eec78b58ca259bc067bb32e4020f28d88b7c8 Author: Masahiro Masuda <[email protected]> Date: Sat Jan 14 08:18:57 2023 +0900 updated cutlass revision commit 591a8f1ba62d9f8e923f2dcc1702e7e7590e92e2 Author: Masahiro Masuda <[email protected]> Date: Sat Jan 14 08:02:01 2023 +0900 conv2d + relu DNNL offload works commit 1365402079626eab5bf99bad96dbfa4abd750175 Author: Masahiro Masuda <[email protected]> Date: Fri Jan 13 16:35:49 2023 +0900 starting DNNL codegen commit 4a72e7810b0df31a4fb13856b5b6320ced4e978e Author: Masahiro Masuda <[email protected]> Date: Thu Jan 12 14:02:19 2023 +0900 clean up commit 61cc55e94123f3064e0d1200c70f33b4a537c4ad Author: Masahiro Masuda <[email protected]> Date: Tue Jan 10 16:26:31 2023 +0900 pattern based partitioning working commit 2433733c5458302cbe05e534d6c99bec13fb6d36 Author: Masahiro Masuda <[email protected]> Date: Tue Jan 10 08:30:20 2023 +0900 add conv2d match & run test commit 360429440acb7068fdfd982d597523ebe032eb20 Author: Ruihang Lai <[email protected]> Date: Mon Jan 9 17:20:05 2023 -0500 [Op][O2e] Indexing and datatype operators (#338) commit e45bdb73824d120bb3b848d4fdaa54f88211b509 Author: Tianqi Chen <[email protected]> Date: Mon 
Jan 9 14:59:26 2023 -0500 [VM] Supporting "compiled" exec mode. (#331) * [VM] Supporting "compiled" exec mode. This PR adds support for a "compiled" mode to the VM. The compiled mode translates the Relax function into a TIR function and drives execution through that TIR function. It is different from the micro AOT codegen, which generates TIR code that targets the micro C runtime environment and is useful for resource-limited settings with a smaller set of features. Both leverage the low-level TIR build that is also shared with TensorIR. The current implementation targets the full TVM (VM) runtime, which comes with PackedFunc, object, tuple, closure and all kinds of rich structure support. This also means that we can leverage the full runtime support to handle things like allocation, dynamic shape, easy plugins and Python interaction, which are not available in a more limited runtime. The user uses the same API to load the generated code regardless of compiled mode or bytecode, and only needs to change one line:

```python
ex = relax.vm.build(mod, target, exec_mode="compiled")
```

Most of the codegen features are lifted before the codegen phase, so the overall implementation is around 500 LOC for each exec mode and can be further cut down with the future introduction of PrimValue. The simplicity is thanks to the TVM runtime architecture, which allows us to compose things together in objects. The only difference is how the PackedFunc for high-level driving is provided. In the case of bytecode it is normal interpretation, and in the case of compiled mode it is TIR. It is a complete implementation. Unit test cases are added. All codegen build tests are updated to include both exec_modes and have passed locally. The only exception is that we skipped some special PackedFunc handling (printing), because it can be further simplified after we introduce PrimValue.
Co-authored-by: Junru Shao <[email protected]> * Address review comments Co-authored-by: Junru Shao <[email protected]> commit 32c2bf74eda5ff9cb958e6d54a29c324d53f2869 Author: Ruihang Lai <[email protected]> Date: Mon Jan 9 13:45:14 2023 -0500 [Op][O2d] Manipulation operators (#337) As tracked by #332, this PR is the O2d milestone of the high-level operator introduction plan. This PR introduces a few manipulation operators: * broadcast_to * concat * expand_dims * flatten * permute_dims * reshape * split * squeeze These operators are all well-tested. commit b39d11a37c899a1625ecee0ffdacc5ef5444365f Author: Ruihang Lai <[email protected]> Date: Mon Jan 9 10:57:19 2023 -0500 [O2h] Neural network and linear algebra operators (#343) commit 1d6d897ec223cc07768e0382c3e21a196ffdfac8 Author: Ruihang Lai <[email protected]> Date: Sun Jan 8 20:21:50 2023 -0500 [O2g] Convolution, pooling and image operators (#341) commit 95f784ece1d61676b88b5455be3dab5e3ddbc75a Author: Ruihang Lai <[email protected]> Date: Sun Jan 8 16:53:10 2023 -0500 [Op][O2f] Set and searching operators (#339) commit be1c32d817bbbbd56329378d6d929dce79ecb0f8 Author: Siyuan Feng <[email protected]> Date: Mon Jan 9 03:38:20 2023 +0800 simple fix jupyter error reporting (#345) commit da11e4bf373349ce4142949099e29d11655aa88b Author: Siyuan Feng <[email protected]> Date: Sun Jan 8 23:09:22 2023 +0800 [TVMScript] Symbolic shape computing (#342) commit 80808fbf9a02480abf337b8a5edffe34c963feec Author: Ruihang Lai <[email protected]> Date: Sat Jan 7 18:31:00 2023 -0500 [Op][O2c] Creation operators (#336) commit 5efc8f7224f83766875e74669e139ec82119a504 Author: Ruihang Lai <[email protected]> Date: Sat Jan 7 11:14:23 2023 -0500 [TIR] Create Layout with specified axis dtype (apache/tvm#13663) (#340) commit ae71be06c8252c211642abb9d5b3e4583bdb6f6a Author: Ruihang Lai <[email protected]> Date: Fri Jan 6 16:41:18 2023 -0500 [Op][O2b] Statistical operators (#334) commit 8220df74e339cdb6dab38a803b80edc3cd6b92e2 Author: 
Ruihang Lai <[email protected]> Date: Thu Jan 5 18:31:48 2023 -0500 [Op][O1][O2a] Utility, arithmetic and comparison operators (#333) As tracked by #332, this PR is the kickoff part of high-level operator introduction in Relax. This PR covers milestones O1 and O2a. Specifically, this PR * introduces some common utility functions that the registration and StructInfo inference of each operator will often use. * introduces unary arithmetic operators: cos, log, negative, sigmoid, sin, sqrt, tanh. * refactors and introduces binary arithmetic operators: add, divide, floor_divide, multiply, subtract. * introduces binary comparative operators: equal, greater, greater_equal, less, less_equal, not_equal. These operators are well tested from three perspectives: P1. the op getter can get the correct op by name P2. their StructInfo inference results are as expected under all kinds of cases P3. the Relax TVMScript parser can parse scripts with the op inside For operators in O2a, most operators share almost the same StructInfo inference logic. Therefore, for tests in P2, in each category, not every op is tested in every case. For each case, it is enough to have only some of the ops in the category tested. This is intended to keep the test file from growing too large. commit f1cab0a05f05829c4c35e2a7e613bd69f2a17fae Author: Siyuan Feng <[email protected]> Date: Thu Jan 5 20:43:28 2023 +0800 [TVMScript] Ensure consistent struct info between assign lhs and rhs with sinfo annotation (#328) * [TVMScript] Ensure consistent struct info between assign lhs and rhs with sinfo annotation * fix * fix commit dc7072efe290d7e8c69d8e216311510981fc82e1 Author: Tianqi Chen <[email protected]> Date: Wed Jan 4 10:13:08 2023 -0500 [REFACTOR] Hide VM Impl, Improve execution logic. (#326) * [REFACTOR] Hide VM Impl, Improve execution logic. This PR refactors the VM by hiding most of the VM implementation and improving the overall execution logic. - Unifies PackedFunc and Closure Table. 
- Update Closure mechanism to no longer depend on string. - Update VMMemoryLower to VMBuiltinLower to incorporate more VM intrinsic lowering, move some of the codegen intrinsic to this phase. - Allow directly pass in function index as VM instruction. * Address comment commit 2449d8c205f0b6e2c346132695b56039b07e9a10 Author: Steven S. Lyubomirsky <[email protected]> Date: Tue Jan 3 22:04:16 2023 -0500 [IR][ASTPrinter] Tweaks to AST printer's handling of struct info (#330) commit 2d352807090ba1b7e898fbdcb83d6d9427c762cf Author: Siyuan Feng <[email protected]> Date: Tue Jan 3 23:20:47 2023 +0800 [TVMScript] Enforce `I.DeclareFunc` to have function signature (#329) commit dcae50e836a0c2999f52d96a372fc7de584951f4 Author: Tianqi Chen <[email protected]> Date: Mon Jan 2 15:21:49 2023 -0500 [BACKEND] Refactor and introduce full match-cast support. (#324) * [BACKEND] Refactor and introduce full match-cast support. This PR refactors VMShapeLower to introduce full match-cast support that enables nested tuples, type checks at argument boundary and symbolic shape computation. Along the way we also refactors cleans up some of vm codegen logic and adding unit-tests for different stages. * address comments commit a36920bf672d22e1d31e1e6f81d0447fd7a55806 Author: Siyuan Feng <[email protected]> Date: Mon Jan 2 23:31:04 2023 +0800 [TVMScript] Fix empty TupleStructInfo (#327) commit 80710a826bda66532eeda978668ed157b471b186 Author: Tianqi Chen <[email protected]> Date: Fri Dec 30 15:57:50 2022 -0500 [CONTAINER] Hash/Equal/JSON support for ShapeTuple (#325) This PR add hash/equal/json support for shape tuple. 
commit 343a1e7e2174612031c70ba8547577c7d21839e4 Author: Tianqi Chen <[email protected]> Date: Thu Dec 29 18:33:17 2022 -0500 [REFACTOR] StructInfo M3: MatchShape=>MatchCast (#323) * Introduce match cast, and code changes along * add match_cast parser support (#9) * Match cast support for VMShapeLower CanonicalizeBinding * Remove `match_shape` (#12) * Refactor ExprVisitor/Mutator to consider Expr in StructInfo. Co-authored-by: Siyuan Feng <[email protected]> commit e332285559d61db1c5033b8d50cd9d4af6c6b6f4 Author: Tianqi Chen <[email protected]> Date: Thu Dec 29 01:28:09 2022 -0500 [REFACTOR] StructInfo M2: Cleanups on legacy shape related items (#320) * [REFACTOR] Remove shape function * [WIP] Remove shape_, runtime_dep shape * Remove shape_ pass Compile * Remove RuntimeDepShape (#11) * BlockBuilder: remove CanProveShapeEqual, consolidate binding emit to EmitNormalize * Remove DimType, make get_shape_of API different from op.shape_of Changes the init importing to direct import so the VSCode navigator can directly jump to the definition point. * Apply suggestions from code review Co-authored-by: Ruihang Lai <[email protected]> * Clarify cases where struct info can be deterministically derived * Fix remaining testcases * Remove InferShape/Type per comment. Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> commit edadf247551f526188c0a08b3812ffc0a1f9d8bd Author: Ruihang Lai <[email protected]> Date: Fri Dec 23 14:46:07 2022 -0500 [Analysis] Optionally check structure info in well-formedness check (#321) With the introduction of structure info in #314, the well-formedness check will report malformed whenever an Expr doesn’t have defined structure info. However, when writing tests for the well-formedness check and the normalizer, we usually construct the Exprs manually, which means their structure info is not defined most of the time. 
As a consequence, the well-formedness check will always complain “the Expr xxx doesn’t have structure info populated.” Therefore, when the checker fails to complain about the original reason of malformed, which means the checker is not working, the tests will still pass and we won’t be able to realize there is something wrong with the checker. Thus, in this PR we add an optional flag to the well-formedness check. In well-formedness tests, we will turn off the structure info check so that the original reason of being malformed will be revealed correctly. --- This PR also cleans up the DiagnosticContext parameter in the WellFormed API - the diag_ctx has been unused since the merge of #99. commit d548459a1736378398ab773dce413d90d49376cf Author: Ruihang Lai <[email protected]> Date: Fri Dec 23 07:33:25 2022 -0500 [Op] Enforce int64 output shape in CallTIR (#322) commit 10a87a455bbb84b0a0d20b22bd31784b9f4b9774 Author: Chaosfan <[email protected]> Date: Fri Dec 23 08:03:48 2022 +0800 [Bugfix] Handle function name properly in Relax TVMScript printer (#317) * remove relax_func_name_ and change logic * well_formed check for globalvar and gsymbol consistency * revise the logic in well_formed and update test * Remove `global_symbol` in test_function_attr.py * Update docs Co-authored-by: Ruihang Lai <[email protected]> commit 29aebb9d24cbf52ab21fd98996633534301ef34d Author: Tianqi Chen <[email protected]> Date: Wed Dec 21 20:21:57 2022 -0500 [REFACTOR] M1: Change parser/printer to only depend on struct info (#319) * [REFACTOR] StructInfo M1: Parser/printer/Var/Function to only depend on struct info field * Update src/relax/backend/vm/vm_shape_lower.cc Co-authored-by: Ruihang Lai <[email protected]> * Address comments * Allow function to have default value Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> commit e6173430f491c1d88d2ab77ce0ab43a8c602df30 Author: Tianqi Chen <[email protected]> Date: Wed Dec 21 00:42:29 2022 -0500 
[REFACTOR][ARCH] Introduce StructInfo M0 (#314) * [IR] Introduce StructInfo * StructInfoFunctor and Analysis Support * [TVMScript] Parse type/shape annotation with StructInfo * remove runtime type assign * Remove type/shape during parsing (#2) * Normalizer prep: simple checks and legacy function renaming. * Struct info deduction in BlockBuilder. * Two TODOs * StructInfo Normalizer Fixes (#3) * StructInfo AST Fix * Fix Extern Func Deduction and shape mutator. * Update VoidStructInfo & globalvar (#4) * Fix passes and proper sinfo propagation. * Refactor EraseToWellDefined to Enable Remapping * [WIP] First stab at symbolic param tracking * Update EraseToWellDefined to support symbolic shape return (#5) * fix R.shape with ndim (#6) * Remove update shape/type * Address review comment, AnnotateTypeShape=>AnnotateStructInfo * Update include/tvm/script/ir_builder/relax/frame.h Co-authored-by: Ruihang Lai <[email protected]> * Address comments * Update printer to use structinfo (#7) * Update Error mechanism to prep for obj loc based reporting * Symbolic shape aware function call return value derivation. The main flow works as follows: - Match and populate shape_var_map and var_map by visit each pair of param and call arguments. - Call EraseToWellDefined to map the ret parameter to new result. * [ANALYSIS] Refactor well-form to only look at struct info. * Update comments according to reviews. 
* Update include/tvm/relax/struct_info.h Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Tianqi Chen <tqchen> Co-authored-by: Ruihang Lai <[email protected]> commit 151701740fac3a53b35799a82c85d86f91b720ee Author: Tianqi Chen <[email protected]> Date: Fri Dec 16 17:48:26 2022 -0500 Update relay_translator.py commit ad0f3179a84b3bc167f91c3eb082cb996b1d04e2 Author: Ruihang Lai <[email protected]> Date: Fri Dec 16 17:37:00 2022 -0500 [Translator] Remove global symbol and follow-up fix for #262 (#316) This PR removes the `global_symbol` linkage added by Relay Translator. It also fixes unaddressed comments of #262. All tests can pass locally and I believe it is safe to merge this PR directly. commit 850deded1201001d833ac65991fb1a4c6509cb1b Author: Ruihang Lai <[email protected]> Date: Fri Dec 16 16:19:48 2022 -0500 [Translator] Support translating op calls with Tuple input (#262) Previously, when a Relay function contains a Call which directly uses Tuples as arguments (the example below), ``` %25 = (%23, %24) /* ty=(Tensor[(1, 160), float32], Tensor[(1, 160), float32]) */; %26 = concatenate(%25, axis=-1) /* ty=Tensor[(1, 320), float32] */; ``` our Relay-translator is unable to generate corresponding CallTIR, because the translator always assumes a argument of a Call is mapped to a single tensor (see the code snippet below: the translator directly passes the Relax variable `new_args[-1]` to function `te_tensors`, which translate a Var to a single tensor). https://github.com/tlc-pack/relax/blob/60e9a01cdfdd013945790fc03d5abad29b8a7c0b/python/tvm/relax/testing/relay_translator.py#L124 https://github.com/tlc-pack/relax/blob/60e9a01cdfdd013945790fc03d5abad29b8a7c0b/src/relax/ir/emit_te.h#L56-L61 But in fact, the Relax variable may correspond to a Tuple of tensors, which wasn’t taken into consideration before. And such case can lead to error in `TETensor`, when creating tensors. 
Therefore, this PR fixes the issue by examining the Relax variable before the tensor creation of Relay Call arguments. If an argument has shape Tuple and type TupleType, we break down the tuple Variable and emit a TupleGetItem for each field, and meanwhile create a tensor for each field. commit 54a0ff551adb90937073675b4fb3d5439b814398 Author: Siyuan Feng <[email protected]> Date: Fri Dec 16 21:02:13 2022 +0800 Remove relax parser_v1 (#313) commit b363dd48aced8fb939880db8cf595ed65b7ecc77 Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Dec 14 22:51:38 2022 -0500 [Debugging][Arch] Expose `shape_` fields for `TupleGetItem` and `If` nodes, fix AST printer accordingly (#311) * Make the shape of If and TupleGetItem nodes accessible in Python * Remove order-dependency from AST printer tests * Trailing whitespace commit 4bb01fe4eccdd59614cc264838a389b21dd40388 Author: Yuchen Jin <[email protected]> Date: Wed Dec 14 08:11:47 2022 -0800 [IR] Dedicated Relax Call, Constant, Tuple, TupleGetItem, If (#306) * relax.Constant. * Add callnode; * Tuple, tuplegetitem, If * mypy. * lint * rebase & fix printer. * rebase & remove virtual_device_ * address comments & leave todos. * address comments. * address comments. * tuple index. * type anno. commit 4cda8a5881fd4cd2473258b35244fc4129b6110c Author: Steven S.
Lyubomirsky <[email protected]> Date: Wed Dec 14 09:09:03 2022 -0500 [BlockBuilder][Refactor] Normalize nested `SeqExpr`s (#310) Co-authored-by: Ruihang Lai <[email protected]> commit 5aab150f322526c1a7bfe6cea0f4d7a7543a7f46 Author: Ruihang Lai <[email protected]> Date: Tue Dec 13 17:06:06 2022 -0500 [ExprMutator] No prologue in VisitWithNewScope when input is SeqExpr (#305) commit 0bf1f1b784f19298117e36016a2e522f58c143fc Author: Tianqi Chen <[email protected]> Date: Tue Dec 13 15:27:05 2022 -0500 [REFACTOR] Refactor BlockBuilder (#308) commit 28d598b6a7c55f95f8f9c2ccd5c860ba5451232d Author: Siyuan Feng <[email protected]> Date: Sun Dec 11 01:28:56 2022 +0800 [Normalizer] Combine Nearby Blocks in SeqExprs (#298) commit e152c50e368454afab75425fcb0863b1c328bf4c Author: Tianqi Chen <[email protected]> Date: Thu Dec 8 19:33:18 2022 -0500 [ARCH] Add VisitBinding second-level dispatcher in Expr type. (#301) commit fed6b8fc88b824ec68260417793447dbe524c4c3 Author: Yuchen Jin <[email protected]> Date: Wed Dec 7 16:55:40 2022 -0800 [Linkage] Cleanup global_symbol attachment and linkage. (#300) * Cleanup global_symbol attachment and linkage. * lint * Add global_symbol to the main function in translation. commit e0907d4fd03af1731310647d3d0547bdff2cfaf6 Author: Tianqi Chen <[email protected]> Date: Tue Dec 6 21:35:20 2022 -0500 [ARCH] Introduce NestedMsg to robustly handle nested-tuple analysis (#295) commit 2eb99975dc1b40b83db7dcbb96b748503dcb3319 Author: Siyuan Feng <[email protected]> Date: Mon Dec 5 21:57:21 2022 +0800 [TVMScript] Update script printer to enable roundtrip tests (#291) commit f8ab9890e14c2533c401969ebf11dd591beff592 Author: Hongyi Jin <[email protected]> Date: Sun Nov 27 09:59:26 2022 -0500 [RUNTIME] Correctly handling export_module when exporting modules of different type (#13489) commit 9009840e654a9900009f7776a19e26f29b1e3f85 Author: Steven S.
Lyubomirsky <[email protected]> Date: Fri Dec 2 18:33:50 2022 -0500 [Debugging] Support PackedFuncType in the AST Printer (#289) commit bda0e42f05eaba657c40a850486e55c39924f3bf Author: Steven S. Lyubomirsky <[email protected]> Date: Fri Dec 2 18:31:39 2022 -0500 [IR][Bugfix] Improvements to the normalizer and well-formed checker (#288) commit d5fe87b21546995c7a88905bd04b4e944d28a0f4 Author: Yong Wu <[email protected]> Date: Thu Dec 1 20:00:38 2022 -0800 Enforce i64 index in ShapeExpr (#281) commit 9c9eb5585501a5da0f25ca38d7d3ac8269b6714c Author: Yuchen Jin <[email protected]> Date: Thu Dec 1 11:00:47 2022 -0800 [Parser] Register memory operators to new parser. (#279) commit 28c3f68cc51d2c22936c5496debcb8c2de54040b Author: Yong Wu <[email protected]> Date: Thu Dec 1 08:55:31 2022 -0800 [TVMScript] enable the closure test (#280) * [TVMScript] enable the closure tests. commit eb9d531b2565cdd000f46e5ecae2c45b9f589abe Author: Yuchen Jin <[email protected]> Date: Thu Dec 1 05:47:05 2022 -0800 [Normalizer] Enforce all Expr have checked_type_ invariance after normalization. (#287) commit 43f81ddf4afc2f4fdb214c9f994e844f53126cdb Author: Steven S. Lyubomirsky <[email protected]> Date: Mon Nov 21 19:25:43 2022 -0500 [Debugging][Bugfix] Debug printer improvements: Print `shape_` and `checked_type_` for all nodes and handle non-binding `MatchShape`s (#261) The initial AST printer only included the `shape_` and `checked_type_` fields for variables because of the potential for infinite recursion (`shape_` nodes can contain other expressions, which in turn have `shape_` nodes). This PR cuts off the potential recursion to allow for printing these fields for all Relax expressions, which should be more useful for debugging. This PR also fixes a bug: The AST printer previously did not handle `MatchShape` bindings that did not bind a new variable. 
commit 304048c33956dddb5027fec26541d57f903d8ca2 Author: YuchenJin <[email protected]> Date: Thu Nov 17 17:02:11 2022 -0800 Fix after rebase, and reorganize the TVMScript folder structure. Co-authored-by: Junru Shao <[email protected]> Co-authored-by: Siyuan Feng <[email protected]> commit e7277460f0a2c7c980be9323cdf7919dc38153e2 Author: Siyuan Feng <[email protected]> Date: Thu Nov 17 00:31:32 2022 +0800 [TVMScript] Switch to the new parser (#276) * [TVMScript] Support cross-function call for relax function This PR adds support for cross-function call for relax function, by declaring a function signature (i.e. an empty function that contains params and return type/shape but w/o body.) However, the PR meets the issue of block_builder shape deduction, which does not use function `ret_shape` to infer the shape of GlobalVar Calls. commit 7152175762613130e3ba647c77cc9818312a5b06 Author: Yuchen Jin <[email protected]> Date: Sat Nov 5 16:45:33 2022 -0500 [CI] Enable Mypy type checking for Relax; Fix typing errors to pass Mypy checking. (#270) commit 6f8f6da505b835345d7709d06bdfd8dddce7e85b Author: Lesheng Jin <[email protected]> Date: Thu Nov 3 08:16:35 2022 -0700 Introduce memory primitives (#255) Introduce the memory primitives, including `relax.memory.{alloc_storage, alloc_tensor, kill_storage, kill_tensor}`. commit 48b7c158cc01532f9019a2e615f2d94766a9464c Author: Siyuan Feng <[email protected]> Date: Thu Oct 20 08:30:47 2022 +0800 [TVMScript] Update Type Annotation Behavior of the Parser (#269) This commit changes the behavior of the parser to allow type annotations, as suggested by the community. The current behavior: - Use the more refined type/shape between user annotated and deduced type/shape. The updated behavior: - Always use user annotations - Only checks if the type/shape is valid. commit 5c3079bb6e1e4eeb4dc2d9b740facb2686c67519 Author: sung <[email protected]> Date: Mon Oct 17 19:07:01 2022 -0700 Reenable autotvm silencer; fix e2e_auto_tir.py; fix lint. 
Co-authored-by: YuchenJin <[email protected]> commit 85b81292626ab6f23caf2b61095a6f957b61b21c Author: sung <[email protected]> Date: Mon Oct 17 18:09:34 2022 -0700 Recover: [Bugfix] Couple of bug fixes to run TVM-gen code together with BYOC (#249) commit c46ae8566582f1fcd8fcda1479943d3abb95b3b0 Author: sung <[email protected]> Date: Mon Oct 17 17:16:01 2022 -0700 Recover: [Pass] Separate ApplyHistoryBest from tuning passes (#226) commit 83bc7cb144643d5823bf06220186528923835667 Author: Junru Shao <[email protected]> Date: Sun Oct 16 22:52:56 2022 -0700 Enable Hexagon tests commit f9f4f7904ec5468a725b2ba924a619a7c5ed4e43 Author: Junru Shao <[email protected]> Date: Sat Oct 15 15:25:56 2022 -0700 Recover dropped commits [TVMScript] B4: If branch support (#263) B8: Local Function Support (#258) [TVMScript] B3: Type annotation checks (#256) [TVMScript][Parser] B1: Dataflow block (#252) [TVMScript] B2: match shape support (#251) [TVMScript] B6/B7: Symbolic shape and var shadowing (#245) [TVMScript] B5: Support relax op (#244) [TVMScript] B0: Call_tir support (#243) enhance parser error reporting (#242) [TVMScript] A1: Relax Parser infra (#240) update ci image versions. (#241) [TVMScript] B2-4: TIR IRBuilder (#239) [TVMScript] A0: Relax IRBuilder infra (#235) [TVMScript] B5-6: TIR IRBuilder (#231) [TVMScript] B1: IRBuilder (#228) [TVMScript] New Parser: Part C (#218) [TVMScript] New Parser: Part A (#221) [TVMScript] New Parser: Part B (#217) Not recovered: [Pass] Separate ApplyHistoryBest from tuning passes (#226) [Bugfix] Couple of bug fixes to run TVM-gen code together with BYOC (#249) co-authored-by: Yuchen Jin <[email protected]> co-authored-by: Siyuan Feng <[email protected]> co-authored-by: Ruihang Lai <[email protected]> commit 65a53034bc0bee9877a1bdf363c2eadcde35f226 Author: Steven S. 
Lyubomirsky <[email protected]> Date: Thu Oct 13 23:06:55 2022 -0400 [Op][Debugging] Add `assert` operator (#260) It was brought up that Relay lacks an assert operator, so we may as well have one in Relax for debugging. One issue is that we can't name it "`assert`" because Python will treat it as a syntax error to have it as a field name for the "`relax`" module, i.e., `relax.assert` is a syntax error. Thus the op is named "`assert_op`," which is not ideal but serves its purpose. commit 71d96e6c0a314936fa49fd7bc1ea79069027ab12 Author: Yuchen Jin <[email protected]> Date: Wed Oct 12 05:07:33 2022 -0700 [Pass] Support Function and If in Normalize pass. (#268) * Support Function and If in Normalize pass. * Use structural equality for expr_memo_. * Change back to pointer equality for expr_memo_; Add more tests. * rebase. commit 312a344cdeec66b1330a80d34ca78556fb338e7c Author: Steven S. Lyubomirsky <[email protected]> Date: Tue Oct 11 18:25:29 2022 -0400 [Analysis] Expose analyses related to vars in Python (#265) Previously, analyses to gather up all variables, free variables, bound variables, all global variables, and all global variables that are called had been implemented in C++ but had not been exposed in Python or tested. This PR exposes these analyses and adds tests for them. Two further changes: * The analyses previously ignored variables bound in `MatchShape` nodes; these are now treated as bindings too. * `rec_global_vars` is renamed `called_global_vars`, since the analysis itself does not check recursion. commit 132702be7e7ed0256045d7a405e532c3d5beef6d Author: Steven S. Lyubomirsky <[email protected]> Date: Mon Oct 10 18:19:38 2022 -0400 [Expr] Allow annotating return shape on function nodes (#253) This PR adds a `ret_shape` field for specifying the shape of the function's return value. At present, we will not use this information, but by adding it into the AST, we will be able to parse the return shape and use it in the future. 
Parser V1 in this PR will just always list the `ret_shape` as `RuntimeDepShape`. commit 7276c9e2ee13a4754775491ca36a7aae2d55b827 Author: Steven S. Lyubomirsky <[email protected]> Date: Sat Sep 24 00:11:45 2022 -0400 [Bugfix][VM] Properly convert tensor inputs in `save_function` (#257) It was observed that closures saved using `save_function` would crash when used over RPC with the `time_evaluator`, whereas using `set_input` and `invoke_stateful` worked as normal. While I am not entirely sure why these failures happened over RPC only in `time_evaluator` (but not in other RPC trials), it became clear that `set_input` performs a conversion of input tensor values in `SetInputTensorWithIndex`, while `save_function` was not doing this. Adding this conversion fixed the observed bug. commit 7183c7ffbe896dd9b5f5742b62afe9c821dae682 Author: Josh Fromm <[email protected]> Date: Wed Sep 21 17:07:08 2022 -0700 [Call TIR] Fix bug when invoking call_tir with scalar values. (#254) This small PR changes a check in the tvmscript parser to support empty shape tuples which are used to represent scalars. I added a scalar addition test to make sure it works properly. commit 605ba8d1548efb90980f9b18ea94f1d53f9ec3ec Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Sep 14 17:27:03 2022 -0400 [Bugfix][Op] Register attributes for unique and print (#248) Attempting to use `dump_ast` on functions containing the operators `relax.unique` and `relax.print` previously crashed due to being unable to query their attributes' keys. It turned out that this was a problem with the operator attributes: They had not been registered on the Python side, so Python representation treated them as opaque TVM objects. This PR corrects this mistake. commit f4525dd8a3e61f572b50107555cef4b469c971f4 Author: Steven S. 
Lyubomirsky <[email protected]> Date: Wed Sep 14 17:24:40 2022 -0400 [VM][Benchmarking] Add option for saving e2e results as CSV file (#247) This PR makes some small additions to the end-to-end AutoTIR script, namely eliminating a bug (it was incorrectly using the stateful API) and adding an option to save the test results as a CSV file for benchmarking purposes (the data can then be separately analyzed as needed). These changes also required a small extension to the save_function method in the VM, namely allowing it to take keyword arguments. commit f1ee4b6cd2c3ee0596cef6f5b7ff7e715fb4ae0d Author: Ruihang Lai <[email protected]> Date: Wed Sep 14 17:23:29 2022 -0400 [BugFix] Enable emit global MatchShape (#246) Fix an incorrect check which disables emitting global MatchShape outside a dataflow block and mistakenly enables emitting dataflow MatchShape outside a dataflow block. commit 0a7a0a9daf5f1a2fa06ee6cd6169a28d397821fa Author: Steven S. Lyubomirsky <[email protected]> Date: Thu Sep 8 09:49:05 2022 -0400 [Pass] Canonicalizing Bindings (#233) It may be useful for some passes to collapse chains of definitions, particularly after other compiler transformations that may reduce or simplify some expressions. This pass will take chains of definitions and replace references to later definitions to the original one. It works by checking `LookupBinding` for each var use-site and replacing the var with its definition if the definition was another var. (Note: This required updating `BlockBuilder` to also update its binding map for `MatchShape` nodes; that was arguably a bug.) Additionally, `MatchShape` bindings where the `LHS` and the `RHS` are guaranteed to match at compile time are canonicalized into ordinary `VarBinding`s. commit 7a6f91f7d4077eebf926aa1f19281404494b9362 Author: Prakalp Srivastava <[email protected]> Date: Thu Sep 1 07:02:57 2022 -0400 [Hexgaon] Use uploaded path to load module. 
(#238) * Fixes a bug to use the uploaded file remote path for loading the module remotely. * Modifies the task_python_hexagon.sh script to only run passing test on device. This is used by Jenkins CI. commit e50290140c204ae091e335b797a07f2f6567a163 Author: Lesheng Jin <[email protected]> Date: Thu Aug 18 21:51:35 2022 -0700 [Pass] New Python ExprVisitor/ExprMutator! (#190) Add decorators `visitor` and `mutator` to help users create `ExprVisitor` and `ExprMutator` in Python. Users can customize visit/rewrite/post-order-rewrite function in Python. `PyExprVisitor` and `PyExprMutator` lists the functions users can customize. commit 7313855476cc522bf3e8bdbe7a60b82cd725fe4c Author: Ruihang Lai <[email protected]> Date: Thu Aug 18 15:20:06 2022 -0400 [BugFix] Expose `relax.expr.Constant` to `relax.Constant` (#230) commit cdfd4e939f2d1e88c560a05d83ddf2f7afe70304 Author: Siyuan Feng <[email protected]> Date: Thu Aug 18 02:25:13 2022 +0800 [FIX] Fix windows build issue when allocating a dynamic array (#219) In the current codebase, kNumArgs is a runtime-dependent variable (i.e. its value depends on the input shape of Array). Allocating arrays with runtime values is not allowed during building on Windows (I'm surprised it can be compiled on Linux and macOS) commit 887762cd97686ae23a61609ca9ffc8d6a2c5178b Author: Yong Wu <[email protected]> Date: Mon Aug 15 08:00:31 2022 +0800 Update with rebase commit 5a23346bc437043b48866411e39dfcf066edda59 Author: Yuchen Jin <[email protected]> Date: Sun Aug 14 14:44:12 2022 -0700 [Bugfix][VM] Fix var binding to a ConstantNode; Force VM if.cond register to take an NDArray instead of POD. (#216) Fix the bug in #212. The cause of this bug is VM Codegen did not handle binding ConstantNode to variable (`x = relax.const([1, 2])`) and save the constant NDArray to the register. Previously the codegen only handles the case where ConstantNode as CallNode's arguments. Now it's fixed and unit test is added. 
Fix the bug in https://github.com/tlc-pack/relax/issues/214#issuecomment-1211411432, the issue was caused by the VM simply read the condition register of the If instruction, and expect it to be a POD int or bool. https://github.com/tlc-pack/relax/commit/811e877c289fa52f55886c8a3e8dce10ed84915f adds a `LoadScalarInt` function similar to the Relay VM to check the If.cond register stores an NDArray, and cast it to int_64. Since we haven't introduced PrimValue and PrimType (that represents POD values like int and bool) to the Relax language yet, let's enforce `If->cond` to be a Tensor (NDArray at runtime). commit 6c9d403503297a0d0e28318bafcba9fc9c99ae42 Author: Steven S. Lyubomirsky <[email protected]> Date: Fri Aug 12 13:53:28 2022 -0400 [VM][UX] Allow for saving closures to avoid extra dictionary lookups in timing trials (#208) This PR implements a function that allows for saving a `PackedFunc` in the VM's module that just calls an existing function with a specific set of arguments to address #179 and #178. The main use of this is for timing, to avoid some overhead in looking up functions. commit e172b40af31dc3384adbcf6e7b0bce7f31ce41ea Author: Jiawei Liu <[email protected]> Date: Thu Aug 11 19:55:57 2022 -0500 [Pass][UX] Statement rewriter for DataflowBlock (#210) - Implements a few APIs to quickly perform statement-level mutation: `add`/`remove_unused`/`remove_all_unused`/`replace_all_uses`. - Implemented `remove_all_unused` to remove dead statements inside `DataflowBlock` cc: @psrivas2 - Address minor issues (unnecessary headers and bad docstrings) in https://github.com/tlc-pack/relax/pull/163 commit 37791e0a5d4a495365fd647f2cecbed16f3a3785 Author: Jiawei Liu <[email protected]> Date: Thu Aug 11 13:50:56 2022 -0500 Clean warning messages by Clang and Pylint (#215) * refact: clean clang warning in relax * refact: fix pylint * fix cpplint and clangd suggestions * fix: no cpplint on virtual-override commit 0b00715dc634aa7f091e942a54a29ee9c802ccf9 Author: Steven S. 
Lyubomirsky <[email protected]> Date: Wed Aug 10 11:47:37 2022 -0400 [VM][UX] Implement stateful API (#207) This PR implements the stateful API discussed in https://github.com/tlc-pack/relax/issues/179. It ensures that if you use `set_input` to set inputs, you must use `invoke_stateful` to run the function (otherwise failing) and must obtain the results using `get_output`. It handles nested tuple returns. commit ed7b77e040654582d1ab1b9535ebbc4da77da243 Author: Steven S. Lyubomirsky <[email protected]> Date: Tue Aug 9 17:07:52 2022 -0400 [Op][Debugging] Add a print operator (#201) * Attempt at adding a print operator * Fix the registration * Actually use the format string * Improve test * Fix comment placement * Improve the docstring for relax_print * Handle tuples too * Formatting :( * Correct commit message * Match attr name across Python and C++ * Make print variadic commit a9bd3053c1106d1926fce1dc5787fc8be27f3985 Author: Sunghyun Park <[email protected]> Date: Fri Aug 5 11:45:03 2022 -0400 [Pass] Implement legacy lowering pass that leverages relay op strategy (#189) This PR implements Relax Op lowering that leverages existing Relay Op Strategy (legacy). As ops like conv2d, matmul are relay-, relax- independent, this pass assumes that we can always find relay op equivalents for such relax ops and use their info to leverage the relay op strategy. commit 1a1bcf75d97b2e7e4f758b6cd08bd747b222ef36 Author: Sunghyun Park <[email protected]> Date: Thu Aug 4 17:56:17 2022 -0400 [Pass] Introduce metaschedule as a tuning pass (#188) This PR delivers MetaSchedule tuning as a tuning passes. We can either tune at IRModule level with relax.transform.MetaScheduleTuneIRMod or tune at primfunc level with relax.transform.MetaScheduleTuneTIR. commit 7144654633477ea0d2bff300ba753dc8bfdeae4d Author: Steven S. 
Lyubomirsky <[email protected]> Date: Thu Aug 4 14:34:10 2022 -0400 [Example][UX] Make the RPC timeout configurable in the `e2e_auto_tir` example (#186) Running the e2e_auto_tir example over RPC can run into issues due to timeouts because some models can take a long time to run on some machines. This PR makes the RPC timeout configurable to more easily address these issues. commit 81e565e5df90cfe12d22deb7b26845ea3aa13526 Author: Tianqi Chen <[email protected]> Date: Wed Aug 3 19:38:21 2022 -0400 Fix BlockBuilder Scope Recovery in Misuse (#199) This happens in interactive use cases. When function scope exit triggers an error, we need to recover the BlockBuilder.current properly so users can try again. commit 21b1e7dc35dc838214cd4b6f26fbc31492323b02 Author: Steven S. Lyubomirsky <[email protected]> Date: Wed Aug 3 19:09:21 2022 -0400 [Testing][AST] Add a simple AST printer for debugging (#198) * Add ast printer * Print seq expr body * Match annotation field names to real AST * Handle call attrs and func ret types * Add more advanced test cases commit 89f55c8167a80b4b9c8751309b5db648fb4db047 Author: Jiawei Liu <[email protected]> Date: Wed Aug 3 09:59:47 2022 -0500 [UX] Adopt changes from tvm-main and render code with IPython.display (#192) Render code with IPython.display.HTML if possible to fix the ansi-escape 24-bit rendering issue in Colab. commit 0b52b558eb14b3f113a4b543c8f0a824baaa58bc Author: Jiawei Liu <[email protected]> Date: Mon Aug 1 11:59:24 2022 -0500 Dataflow Pattern Lang: Core Matching Features (#163) The structure is similar to the Relay's pattern matcher (https://github.com/apache/tvm/pull/5231). The main difference is that those pattern types are adapted to be relax-compatible. Relay pattern types, some less used patterns (IfPattern) and df-topological patterns (DominatorPattern) are ignored (some of them will be brought later).
The implementation splits patterns into two parts: - **Match an Expression**: match an expression syntactically (`MatchExprPattern`, i.e., `DFPatternMatcher`); - **Match a Graph**: match a graph (cross multiple `VarBinding`) topologically (`MatchGraphPattern`); commit 74371634e9a011e63650b734aba20546b016c524 Author: Jiawei Liu <[email protected]> Date: Tue Jul 26 20:06:25 2022 -0500 [UX] Highlight TVMScript with Pygments (#185) commit 15e54ef215950944ffd74858c12c30aabcb0dcce Author: Siyuan Feng <[email protected]> Date: Sat Jul 23 11:22:13 2022 +0800 [Pass] Enhance BindParams to take numpy dict as input (#184) commit cf2e3b97110c805597059c5ba8303a653417e080 Author: Steven S. Lyubomirsky <[email protected]> Date: Mon Jul 18 21:45:21 2022 -0400 [Bugfix][VM] Ensure set_input works over RPC by not returning an array of argument names (#183) Currently, attempting to use the VM's `set_input` method will fail over RPC because `set_input` calls `get_func_param_names`, which returns an array of parameter names. RPC does not support sending arrays. This PR corrects this issue by instead having `set_input` query the function arity and then query the argument names one by one, which is the approach taken by the Relay VM (accordingly, the names for the functions used to do this, `get_function_arity` and `get_function_param_name`, are taken from the Relay VM). This PR also adds a unit test over RPC on localhost. 
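The RPC fix described in #183 (query the arity first, then fetch parameter names one at a time) can be sketched in plain Python. This is only an illustration: `FakeRemoteVM` and `collect_param_names` are hypothetical stand-ins invented here, and only the method names `get_function_arity` and `get_function_param_name` come from the commit message. It does not use the real Relax VM or RPC layer.

```python
from typing import Dict, List


class FakeRemoteVM:
    """Hypothetical stand-in for a VM module reached over RPC.

    Every method returns a single int or str, mimicking a transport
    that cannot send arrays back to the caller.
    """

    def __init__(self, signatures: Dict[str, List[str]]) -> None:
        self._signatures = signatures

    def get_function_arity(self, func_name: str) -> int:
        # One scalar crosses the (simulated) RPC boundary.
        return len(self._signatures[func_name])

    def get_function_param_name(self, func_name: str, index: int) -> str:
        # One string crosses the (simulated) RPC boundary.
        return self._signatures[func_name][index]


def collect_param_names(vm: FakeRemoteVM, func_name: str) -> List[str]:
    """Rebuild the parameter list on the caller side, one name per call."""
    arity = vm.get_function_arity(func_name)
    return [vm.get_function_param_name(func_name, i) for i in range(arity)]


vm = FakeRemoteVM({"main": ["x", "w", "b"]})
param_names = collect_param_names(vm, "main")
print(param_names)
```

The cost of this design is one round trip per parameter instead of one per function, which is usually acceptable because it happens once when inputs are set, not on every invocation.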
commit b0e57dbc0862499c3f2a7d91858354c41fcf5e95
Author: Yong Wu
Date: Fri Jul 15 11:50:29 2022 -0700

Fix after rebase

commit 3494b7a47bf0f7c3219538b2e9064b825cf3258c
Author: Sunghyun Park
Date: Mon Jul 18 00:38:41 2022 -0400

[Pass Infra] Tuning API serialization and database support (#168)

* refactor tuning API to support serialization of Choice, Knob, Trace
* Implement tuning api JSON database
* Add comments
* fix pylint
* fix cpplint
* reflect feedback
* add minor comment for the future work

commit 777549a6037cc97b698f53ed629cf65c33ae7eca
Author: Siyuan Feng
Date: Mon Jul 18 00:05:14 2022 +0800

[Fix] fix windows build issue (#182)

TVM_DEFINE_NOTNULLABLE_OBJECT_REF_METHODS is needed when we have a default-like constructor (e.g. (Span span = Span()))

commit b81e6a9838f92ba412a0bd4951a46cc61a43a22d
Author: Siyuan Feng
Date: Mon Jul 18 00:04:03 2022 +0800

fix print twice issue (#181)

commit d4cc79ed664bbe34a4d9dab2923cd5a7a7c5b52c
Author: Lesheng Jin
Date: Thu Jul 14 09:15:44 2022 -0700

[Pass] Python ExprMutatorBase/ExprMutator (#172)

- Rewrite ExprFunctor in Python. New ExprMutatorBase and ExprMutator in Python.
- Implement demo passes: RewriteFMA and FuseFMA with Python ExprMutator.
- Expose some functions to ffi in block_builder.py

commit 01cdc4d43258b1fb9dcc630f05f38f792e3bc513
Author: Prakalp Srivastava
Date: Tue Jul 12 19:25:51 2022 -0400

[VM] Deprecate API to save/load executable to file (#176)

Executable `save_to_file` and `load_exec_from_file` API was used to save/load just the executable to/from file. This was confusing as it did not export the TensorIR kernels in the Relax Module, thus leading to bugs such as https://github.com/tlc-pack/relax/issues/175. Moreover, the API was only used in some tests, and not useful for end user.

Deprecating this API to have a single uniform way of serializing/deserializing TVM IRModule using `export_library` and `tvm.runtime.load_module` API.

commit 74b3d67e8ae74aed3446a5ae5a05b8f5586e2c3b
Author: Yuchen Jin
Date: Fri Jul 1 09:31:30 2022 -0700

[Refactor] Generic dispatching for `IsBaseOf`; Simplify Type/Expr initializations; `relax` -> `R` in printer; Disallow local function in VMCodegen (#171)

- Generic dispatching for `IsBaseOf`: `IsBaseOf` uses a bunch of if-else to check if the subtype relation between the base type and derived type, now it's changed to use a generic TypeFunctor to dispatch on the base class to do the check.
- Simplify Type/Expr initializations: We had to write `RuntimeDepShape(Span())`, `ObjectType(Span())` to initialize several Types and Exprs, this is due to the `TVM_DEFINE_OBJECT_REF_METHODS` macro that sets the constructor with `= default`. By changing to use `TVM_DEFINE_NOTNULLABLE_OBJECT_REF_METHODS`, we can now just write `RuntimeDepShape()` without specifying an empty span.
- `relax` -> `R` in printer: Change to print `R` rather than `relax` in TVMScript as the default behavior. This is consistent with our test cases and TIR convention: using `T` as shorthand.
- Disallow generating code for local function in VMCodegen: these local functions should have been lifted in the lambda lifting pass before codegen.

commit 8fdc3ba3eae0d1ffc535e240be251aaae5546eb8
Author: Prakalp Srivastava
Date: Thu Jun 30 15:14:40 2022 -0700

[Parser] Enable R.parser.pretty_print to print TIR PrimFunc (#174)

This way we can have a uniform API to print IRModule, TensorIR function and Relax functions.

commit ed0414540c9fbc063aa727cfc71bdee51a4bafdd
Author: Prakalp Srivastava
Date: Wed Jun 29 08:20:17 2022 -0700

Update tests to use `set_input` for rpc calls. (#173)

Fix relax-hexagon tests to use set_input api, which is the correct way to invoke a function over RPC.

commit 1f962bda7a79d13fee1a4f9f4ad3ddde4f5467b2
Author: Sunghyun Park
Date: Tue Jun 28 20:49:33 2022 -0400

[BYOC][PASS] Prototype implementation of modular compilation w/ TensorRT (#164)

This PR delivers the prototype of the followings:

- Relax BYOC JSON codegen
- Relax BYOC TensorRT codegen
- Extension in Relax VM to support external modules
- `RunCodegen` pass: run codegen for the annotated relax functions
- Annotation (dispatch decision) will be done by earlier passes e.g., greedy heuristic, Collage
- The generated runtime module and Codegen itself should be tvm object
- Misc minor code improvement for other passes

commit f25fe0c80670272582db3aa791901c7fa49fc59e
Author: Prakalp Srivastava
Date: Tue Jun 28 12:47:07 2022 -0700

Run static/dynamic models over Hexagon using Relax VM RPC (#167)

* Move Relax VM builtins to src/runtime.
* This fixes a bug we encountered while loading the module for hexagon. Since it was building the minimal runtime it was missing definition of Relax VM builtins.
* Mark Hexagon module as DSO exportable.
* Load Relax VM Executable over RPC
* Support allocation for shape heap on device

Co-authored-by: Yuchen Jin

commit 25174be634b5e04f0468b48bd477f22b17e75f84
Author: Prakalp Srivastava
Date: Fri Jun 24 13:33:04 2022 -0700

[CI] Enable Hexagon CI in Jenkins. (#169)

Running all Hexagon tests in simulator is very slow. So we only run Relax related hexagon tests `test_relax_integration.py`. This test file is empty right now and it would be populated as we push relax-hexagon related changes.

commit 225aecdb5d7d33f2af048f3aef9c9a6ac758f4fd
Author: Yuchen Jin
Date: Thu Jun 23 09:47:30 2022 -0700

[VM] Add set_input interface; Fix e2e tuning script. (#166)

* Add set_input interface.
* Address comments.

commit 29a707cbd9be6e02dd8a3cd1961cfb53057eb51b
Author: Lesheng Jin
Date: Thu Jun 16 09:07:45 2022 -0700

WellFormed Instrument (#165)

* add conftest for test/python/relax
* [Wellformed Check]: allow TupleType as Function parameters
* move WellFromedInstrument to relax.ir.instrument
* add header

commit b4c3c4bb65b09db7c9b3ec114d6680d14f306d37
Author: Yong Wu
Date: Sat Jun 11 23:26:17 2022 -0700

Update after rebase

commit 3c0e3c0ee08c78b17cc1ba0429727c199737403a
Author: Yuchen Jin
Date: Sat Jun 11 18:42:29 2022 -0700

[Relay translator] Allow replacing default topi function with user-provided TIR PrimFunc. (#159)

* Add replace_op_with_tir to translator.
* came up with a better name
* better doc.

commit f250f93eed886dc2c3a1cb1f8a4ab2077c57080e
Author: Yong Wu
Date: Sat Jun 11 15:20:21 2022 -0700

[Pass] Lambda Lifting (#99)

commit b55fd31d4e11373b30a93f88412a3d6e2d21d3c1
Author: Siyuan Feng
Date: Tue Jun 7 10:07:17 2022 +0800

[E2E] End-to-End tuning e2e_script (#153)

Co-authored-by: Ruihang Lai
Co-authored-by: Hongyi Jin

commit d3f94e73ec7b9c9ac7b3675f962e9030e55fa603
Author: Prakalp Srivastava
Date: Thu Jun 2 08:19:18 2022 -0700

Fix shape lowering pass bug for non i64 dims. (#152)

Prior to this change, VM Shape Lowering pass did not cast integer values to shape heap dtype (i64) which resulted in incorrect values when read from heap later. This PR adds a cast to i64 for such values. This also adds well-formed check to ensure shape dimensions are of integer types.

commit 9cf777f48069d598eda276be0b9aabaf301acf0f
Author: Yong Wu
Date: Wed Jun 1 17:52:40 2022 -0700

[Parser] Add FuncType support (#154)

* [Parser] Add FuncType support
* Address comments

commit f99121d506df45870cd026e052f5b3c41d4bd982
Author: Sunghyun Park
Date: Wed Jun 1 09:01:40 2022 -0700

[PASS] Remove Unused Functions in IRModule (#151)

commit a718e9f9e073ca0ea1790562254c09aaa863eaa4
Author: Sunghyun Park
Date: Tue May 31 15:15:28 2022 -0700

[Pass Infra] Tuning Pass API (#144)

commit a485b7bdb45f8379daa45e8c923a47fd6871cbdf
Author: Tianqi Chen
Date: Sun May 29 12:51:07 2022 -0400

[REFACTOR] Move TIR op kind analysis to relax as it is relax oriented (#155)

This also keep TIR mostly independent from higher-level IR.

commit abd20bdc9b87aa53e0c27e8c5c3fc195be5e8c91
Author: Siyuan Feng
Date: Sun May 29 23:31:05 2022 +0800

add test cases for FuseTIR (#156)

commit de42ec3d5ae0f0304060460764619a5a16995a33
Author: Siyuan Feng
Date: Thu May 26 22:14:51 2022 +0800

[Pass] Relax Transform FuseTIR (#150)

* [Pass] Relax Transform FuseTIR

Co-authored-by: Hongyi Jin
Co-authored-by: Ruihang Lai

commit 153d0cc8f2d39b23e63fcd6feaf9755a0eaf8c28
Author: Yuchen Jin
Date: Wed May 25 15:44:59 2022 -0700

[Mutator] Separate unnormalized-form and normal-form mutators (#148)

commit dfa42c09a3087605e805526ab7db7b49d6752ca5
Author: Prakalp Srivastava
Date: Fri May 20 16:30:18 2022 -0700

Print/parse tir cast/max operations in Relax shape (#149)

tir.cast and tir.max are commonly used operators in shape expression in Relax. These two operators often show up when importing Relay module with `Any` dims to Relax module.

commit c7186fd44ad5865d84ac61fc2981a15c8af9be4c
Author: Prakalp Srivastava
Date: Thu May 19 18:29:12 2022 -0700

Add support to import relay models with Any dim. (#146)

Converts Relay Any dimension to symbolic dim in Relax.

commit ef9cf6baba1c2f7215746459ad5a9193df6572c9
Author: Yuchen Jin
Date: Tue May 17 07:55:56 2022 -0700

Refactor shape lowering pass and Blockbuilder. (#145)

commit 230def2284c21eaff520e58fa96a80313b6a7c8f
Author: Yong Wu
Date: Fri May 13 14:30:05 2022 -0700

Support Closure (#140)

commit 0e998988aabdeb8d913e2889eb5a9d72bee35ca2
Author: Lesheng Jin
Date: Thu May 12 17:13:15 2022 -0700

[Analysis] IRModule well-formed check (#142)

commit 1bd4e685ffcc0c4b677af47ecc8609dbfacdfd9d
Author: Yong Wu
Date: Wed May 11 09:31:13 2022 -0700

Change after rebase

commit d0ad35b375449c7e067a1edada7502557a03dd26
Author: Siyuan Feng
Date: Tue May 10 08:44:22 2022 +0800

FuseOps for relax (#141)

Co-authored-by: Ruihang Lai
Co-authored-by: Hongyi Jin

commit ae7b5b79c40498203842b6c9193e91bcc1937bea
Author: Prakalp Srivastava
Date: Wed May 4 20:52:16 2022 -0700

Add `relax.unique` operator in Relax. (#135)

* Add Unique operator in Relax. This adds the functionality to register a packed function implementation of any operator using `FCallPacked` attribute. The relax operator would be lowered to a call to the registered packed function during codegen. For example, in this change relax.unique is lowered to `relax.run.unique` packed function which uses torch.unique under the hood.
* Add support for integer constants in Relax VM. This adds serialization, deserialization, and print support for integer constants.

commit 1ca18611ae59ab4d1667066ed9921690d2a5611c
Author: Siyuan Feng
Date: Tue May 3 09:34:55 2022 +0800

Add ShapeType to ShapeExpr.checked_type during construction (#139)

commit 6481d533ed259a080dede704f7443c4a2221a842
Author: Sunghyun Park
Date: Mon May 2 16:26:08 2022 -0700

Introduce Relax function attribute and drop name field in Relax function (#136)

commit d735ebd719d89c804691b29ee0d881c785384fc6
Author: Yuchen Jin
Date: Sat Apr 30 18:45:14 2022 -0700

[BlockBuilder] Sub function call shape deduction: constant shape case. (#137)

commit 10f8e56cbcb27beb373075e3c6e3a9728ffb5eb2
Author: Yuchen Jin
Date: Thu Apr 28 16:59:38 2022 -0700

[AST][Type] Introduce ObjectType; Infer the type of call_packed by type_args; Refactor InferType/InferShape. (#132)

commit 7e2038a8b662659dd6ba2e2a86bedbc6c3891bfa
Author: Yuchen Jin
Date: Mon Apr 25 17:20:19 2022 -0700

[AST][BlockBuilder] Normalize relax.Function; Refactor BlockBuilder to take optional input IRModule. (#133)

commit f1eca6d74365c6b0665b64c86ececce86fd76df3
Author: Prakalp Srivastava
Date: Sun Apr 24 07:09:11 2022 -0700

[Printer][Parser] Modify Tensor annotation printing and parsing. (#128)

commit 296876eaf1246ea7948c69d2111cfea2ca51ca0c
Author: Lesheng Jin
Date: Fri Apr 22 08:05:13 2022 -0700

[Pass] Python pass decorator and ExprFunctor (#126)

* Relax ExprFunctor in Python
* fix the register bug
* Expr_functor in relax
* function/dataflowblock Pass in python
* testcases
* reformat
* fix Tensor annotation()
* add return type hint
* type hint
* new test
* fix typo
* remove memo

commit 5199a206cc86cee9e43b0c8ddddf704acdc4b513
Author: Ruihang Lai
Date: Thu Apr 21 22:20:33 2022 +0800

[Relax][MS] Task extraction with proper weights (#129)

* [Relax][MS] Task extraction with proper weights (hzfengsy#32)
* Add a unit test
* Update the deduplication mapping / Update the unit test
* Update test for DummyDB reusing
* Remove unnecessary args
* Remove unused import

commit badee2add6700f12…

The structure is similar to Relay's pattern matcher (apache/tvm#5231). The main difference is that the pattern types are adapted to be Relax-compatible. Relay pattern types, some less-used patterns (`IfPattern`), and df-topological patterns (`DominatorPattern`) are omitted for now (some of them will be added later). The implementation splits matching into two parts:

- **Match an Expression**: match an expression syntactically (`MatchExprPattern`, i.e., `DFPatternMatcher`);
- **Match a Graph**: match a graph (across multiple `VarBinding`s) topologically (`MatchGraphPattern`).
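To make the expression/graph distinction concrete, here is a minimal, self-contained Python sketch. It is a toy model and not the TVM or Relax API: `Var`, `Call`, `Wildcard`, `CallPattern`, `match_expr`, and `match_graph` are all hypothetical names invented for illustration, and the `bindings` dictionary only loosely stands in for a sequence of `VarBinding`s. It assumes the bindings are acyclic.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


# Toy IR nodes (hypothetical; not TVM classes).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    op: str
    args: Tuple[object, ...]


# Toy patterns (hypothetical; not TVM classes).
@dataclass(frozen=True)
class Wildcard:
    pass


@dataclass(frozen=True)
class CallPattern:
    op: str
    args: Tuple[object, ...]


def match_expr(pattern, expr) -> bool:
    """Syntactic match: only looks at the expression tree it is given."""
    if isinstance(pattern, Wildcard):
        return True
    if isinstance(pattern, CallPattern):
        return (
            isinstance(expr, Call)
            and expr.op == pattern.op
            and len(expr.args) == len(pattern.args)
            and all(match_expr(p, a) for p, a in zip(pattern.args, expr.args))
        )
    return False


def match_graph(pattern, expr, bindings: Dict[str, object]) -> bool:
    """Graph-style match: follows variables to the values they are bound to."""
    # Resolve a bound variable to its defining expression (assumes no cycles).
    while isinstance(expr, Var) and expr.name in bindings:
        expr = bindings[expr.name]
    if isinstance(pattern, Wildcard):
        return True
    if isinstance(pattern, CallPattern):
        return (
            isinstance(expr, Call)
            and expr.op == pattern.op
            and len(expr.args) == len(pattern.args)
            and all(
                match_graph(p, a, bindings)
                for p, a in zip(pattern.args, expr.args)
            )
        )
    return False


# Two bindings: lv0 = nn.conv2d(x, w); lv1 = nn.relu(lv0)
bindings = {
    "lv0": Call("nn.conv2d", (Var("x"), Var("w"))),
    "lv1": Call("nn.relu", (Var("lv0"),)),
}
relu_of_conv = CallPattern(
    "nn.relu", (CallPattern("nn.conv2d", (Wildcard(), Wildcard())),)
)

syntactic_hit = match_expr(relu_of_conv, bindings["lv1"])
graph_hit = match_graph(relu_of_conv, Var("lv1"), bindings)
```

In this sketch the syntactic matcher fails on `lv1` because the argument of `nn.relu` is the variable `lv0` rather than a `nn.conv2d` call, while the graph-style matcher succeeds by looking through the binding. The real `MatchGraphPattern` may work quite differently; the point is only why matching across bindings needs more than a tree walk.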

mbrookhart force-pushed the mbrookhart/pattern_matcher branch from 175edf0 to 76f1b85 April 3, 2020 15:33

mbrookhart changed the title ~~[WIP][POC] Pattern Language and Matcher~~ [WIP][POC] Pattern Language, Matcher, and Rewriter Apr 3, 2020

mbrookhart force-pushed the mbrookhart/pattern_matcher branch from 0414728 to 3af18d4 April 6, 2020 23:56

tqchen added the status: WIP label Apr 10, 2020

mbrookhart changed the title ~~[WIP][POC] Pattern Language, Matcher, and Rewriter~~ [POC] Pattern Language, Matcher, and Rewriter V0 Apr 14, 2020

masahi reviewed Apr 14, 2020

src/relay/ir/dataflow_matcher.cc

masahi reviewed Apr 14, 2020

src/relay/ir/dataflow_matcher.cc

masahi reviewed Apr 14, 2020

src/relay/ir/dataflow_matcher.cc (outdated)

masahi reviewed Apr 14, 2020

src/relay/ir/dataflow_matcher.cc (outdated)

masahi reviewed Apr 14, 2020

src/relay/ir/dataflow_matcher.cc