-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[5/6] Arm(R) Ethos(TM)-U NPU codegen integration #8849
Conversation
1e08789
to
d6d8bec
Compare
7522dda
to
cd1a9c8
Compare
@Mousius @grant-arm , #8274 fails test_networks.py . I cant immediately figure out why. See if you guys see something. cc: @mbaret |
From the looks of it, TVMBAW calls are not generated any more for static allocates for kDLCPU. I wonder why the AoT tests did not break. |
*/ | ||
void SaveToFile(const std::string& file_name, const std::string& format) final { | ||
std::string fmt = GetFileFormat(file_name, format); | ||
LOG(INFO) << "format=" << fmt << ";;\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like some debug logging that was left in by mistake.
It should probably be removed.
cd1a9c8
to
0523597
Compare
@@ -113,16 +113,6 @@ class BuiltinLower : public StmtExprMutator { | |||
op = stmt.as<AllocateNode>(); | |||
// Get constant allocation bound. | |||
int64_t nbytes = GetVectorBytes(op->dtype); | |||
if (device_type_.defined()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @areusch @mbrookhart this is the code we discussed yesterday, currently CI passes with this code? that seems to contradict the conversations we had, but also I think these should be stack allocations for performance on cloud cpus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we want the allocates to be placed on stack in CPU PrimFuncs, maybe we should make them have storage_scope = 'local' and we should generate TVMBAWs for 'global' allocates -- that could be made to work for both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #9022.
Just checking your use case: You want to see the allocas so you can later rewrite them, right? It's a bit confusing because I thought if the final c code generator sees these Allocates it rewrites them to use globals which I assumed was what you wanted.
|
||
ss << "#include <stdio.h>\n"; | ||
ss << "#include <stdlib.h>\n"; | ||
ss << "#include <dlpack/dlpack.h>\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dlpack.h is no longer required.
7824786
to
962a7ff
Compare
This commit integrates the codegen for Arm® Ethos™-U. * Adding Conv2D tests and a mobilenet_v1 conv2d offload test. Co-authored-by: Grant Watson <[email protected]> Co-authored-by: Leandro Nunes <[email protected]> Co-authored-by: Christopher Sidebottom <[email protected]> Co-authored-by: Matthew Barret <[email protected]>
962a7ff
to
4f67889
Compare
module=module, | ||
inputs=inputs, | ||
outputs=outputs, | ||
output_tolerance=10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the default value of output_tolerance should be 0 and build_source should have an optional parameter to set it to something else, if necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo some nits and comment requests.
I didn't dig into the tests much, esp the reference_system stuff, will trust your local reviewers.
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
#include <dmlc/filesystem.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: file level preamble comment
/*! | ||
* \brief The ethos runtime module. | ||
* | ||
* \param cmms A array of external symbol 1, serialized command stream 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comment seems garbled, might as well explain them all here or on the fields.
namespace tvm { | ||
namespace runtime { | ||
|
||
class EthosUModuleNode : public ModuleNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overview comment, especially since this is doing the final host codegen.
|
||
private: | ||
String c_source; | ||
Array<String> func_names_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth changing to a single main_func_name_ since a deep assumption throughout?
|
||
|
||
@tvm._ffi.register_func("relay.ext.ethosu.constant_updater") | ||
def constant_updater(expr, symbol): # pylint: disable=unused-argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move below main entry point and comment it is called back from utils.h UpdateConstants after lowering.
} | ||
|
||
TVM_REGISTER_GLOBAL("runtime.module.ethosu.create").set_body([](TVMArgs args, TVMRetValue* rv) { | ||
*rv = EthosUModuleNode::Create(args[0], args[1], args[2], args[3], args[4], args[5]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider TypedPackedFunc
|
||
|
||
@tvm._ffi.register_func("relay.ext.ethosu") | ||
def ethosu_compiler(ref): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update argument name/improve doc string here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nits, otherwise mostly Ethos-U specific and looks good to me
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
"""Test infrastructure for Arm(R) Ethos(TM)-U NPU related tests""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: grammar
Thanks all! |
This is a follow up commit to address the comments of apache#8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
* main: Fix flaky NMS test by making sure scores are unique (apache#9140) [Relay] Merge analysis/context_analysis.cc and transforms/device_annotation.cc (apache#9038) [LLVM] Make changes needed for opaque pointers (apache#9138) Arm(R) Ethos(TM)-U NPU codegen integration (apache#8849) [CI] Split Integration tests out of first phase of pipeline (apache#9128) [Meta Schedule][M3b] Runner (apache#9111) Fix Google Mock differences between Ubuntu 18.04 and 16.04 (apache#9141) [TIR] add loop partition hint pragma (apache#9121) fix things (apache#9146) [Meta Schedule][M3a] SearchStrategy (apache#9132) [Frontend][PyTorch] support for quantized conv_transpose2d op (apache#9133) [UnitTest] Parametrized test_conv2d_int8_intrinsics (apache#9143) [OpenCL] Remove redundant visit statement in CodeGen. (apache#9144) [BYOC] support arbitrary input dims for add/mul/relu of dnnl c_src codegen (apache#9127) [Relay][ConvertLayout] Support for qnn.conv2d_transpose (apache#9139) add nn.global_avgpool to fq2i (apache#9137) [UnitTests] Enable minimum testing on Vulkan target in CI (apache#9093) [Torch] Support returning quantized weights and bias for BYOC use cases (apache#9135) [Relay] Prepare for new plan_devices.cc (part II) (apache#9130) [microTVM][Zephyr] Add MIMXRT1050 board support (apache#9068)
This is a follow up commit to address the comments of apache#8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
This is a follow up commit to address the comments of apache#8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
This is a follow up commit to address the comments of apache#8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
* main: (80 commits) Introduce centralised name transformation functions (apache#9088) [OpenCL] Add vectorization to cuda conv2d_nhwc schedule (apache#8636) [6/6] Arm(R) Ethos(TM)-U NPU codegen integration with `tvmc` (apache#8854) [microTVM] Add wrapper for creating project using a MLF (apache#9090) Fix typo (apache#9156) [Hotfix][Testing] Wait for RPCServer to be established (apache#9150) Update find cublas so it search default path if needed. (apache#9149) [TIR][LowerMatchBuffer] Fix lowering strides when source region has higher dimension than the buffer (apache#9145) Fix flaky NMS test by making sure scores are unique (apache#9140) [Relay] Merge analysis/context_analysis.cc and transforms/device_annotation.cc (apache#9038) [LLVM] Make changes needed for opaque pointers (apache#9138) Arm(R) Ethos(TM)-U NPU codegen integration (apache#8849) [CI] Split Integration tests out of first phase of pipeline (apache#9128) [Meta Schedule][M3b] Runner (apache#9111) Fix Google Mock differences between Ubuntu 18.04 and 16.04 (apache#9141) [TIR] add loop partition hint pragma (apache#9121) fix things (apache#9146) [Meta Schedule][M3a] SearchStrategy (apache#9132) [Frontend][PyTorch] support for quantized conv_transpose2d op (apache#9133) [UnitTest] Parametrized test_conv2d_int8_intrinsics (apache#9143) ...
This is a follow up commit to address the comments of apache#8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
This is a follow up commit to address the comments of #8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
This commit integrates the codegen for Arm® Ethos™-U. * Adding Conv2D tests and a mobilenet_v1 conv2d offload test. Co-authored-by: Grant Watson <[email protected]> Co-authored-by: Leandro Nunes <[email protected]> Co-authored-by: Christopher Sidebottom <[email protected]> Co-authored-by: Matthew Barret <[email protected]> Co-authored-by: Grant Watson <[email protected]> Co-authored-by: Leandro Nunes <[email protected]> Co-authored-by: Christopher Sidebottom <[email protected]> Co-authored-by: Matthew Barret <[email protected]>
This is a follow up commit to address the comments of apache#8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
This commit integrates the codegen for Arm® Ethos™-U. * Adding Conv2D tests and a mobilenet_v1 conv2d offload test. Co-authored-by: Grant Watson <[email protected]> Co-authored-by: Leandro Nunes <[email protected]> Co-authored-by: Christopher Sidebottom <[email protected]> Co-authored-by: Matthew Barret <[email protected]> Co-authored-by: Grant Watson <[email protected]> Co-authored-by: Leandro Nunes <[email protected]> Co-authored-by: Christopher Sidebottom <[email protected]> Co-authored-by: Matthew Barret <[email protected]>
This is a follow up commit to address the comments of apache#8849 Change-Id: I02d8de64f3bce0e7b544d652eee8737ec1ecbb80
This commit integrates the codegen for Arm® Ethos™-U.
This PR is blocked on merging #8795 , #8806 and #8811. Since this is built on top of all those PR branches, this contains the accumulation of contents of the all the PRs.
Co-authored-by: Grant Watson [email protected]
Co-authored-by: Leandro Nunes [email protected]
Co-authored-by: Christopher Sidebottom [email protected]
Co-authored-by: Matthew Barrett [email protected]