
[2/10] CMSIS-NN code generator for softmax #8833

Merged: 1 commit into apache:main on Sep 7, 2021

Conversation

@ashutosh-arm (Contributor) commented on Aug 24, 2021:

This PR adds support for softmax (int8) via CMSIS-NN. It also includes changes for the CMSIS-NN graph partitioner, since that PR has not been merged yet: #8653

This covers the second and third steps of the plan described in the RFC: apache/tvm-rfcs#15

Co-authored-by: Chris Sidebottom [email protected]
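For orientation (not from the PR itself), here is a hedged sketch of the kind of quantized-softmax module the partitioner is meant to offload, assuming the dequantize → softmax → quantize pattern and the `partition_for_cmsisnn` entry point from #8653:

```python
import tvm
from tvm import relay
from tvm.relay.op.contrib import cmsisnn  # partitioner introduced alongside #8653

# Hedged sketch: a quantized softmax expressed as dequantize -> softmax
# -> quantize (an assumption about the matched pattern), handed to the
# CMSIS-NN partitioner. Scales and zero points are placeholder values.
x = relay.var("x", shape=(1, 16), dtype="int8")
deq = relay.qnn.op.dequantize(x, relay.const(0.02, "float32"), relay.const(-64, "int32"))
sm = relay.nn.softmax(deq)
q = relay.qnn.op.quantize(
    sm, relay.const(1.0 / 256.0, "float32"), relay.const(-128, "int32"), out_dtype="int8"
)
mod = tvm.IRModule.from_expr(relay.Function([x], q))
partitioned = cmsisnn.partition_for_cmsisnn(mod)
```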

@ashutosh-arm (Contributor, Author) commented:

Adding @manupa-arm @grant-arm @areusch for review.

@ashutosh-arm changed the title from "Softmax" to "CMSIS-NN code generator for softmax" on Aug 24, 2021
@ashutosh-arm changed the title from "CMSIS-NN code generator for softmax" to "[2/10] CMSIS-NN code generator for softmax" on Aug 24, 2021
@Mousius (Member) left a comment:

This is looking great @ashutosh-arm, really enjoying the code re-use here 😸 Just needs neatening up a bit from my POV.

Resolved review threads on:
- python/tvm/relay/backend/contrib/cmsisnn/codegen.py
- src/relay/backend/contrib/cmsisnn/codegen_cmsisnn.cc
- tests/python/contrib/test_cmsisnn/test_softmax.py
@Mousius (Member) left a comment:

One small suggestion and I think you need to run the linters against the code again @ashutoshparkhi 😸

Resolved review thread on: src/relay/backend/contrib/cmsisnn/codegen_cmsisnn.cc
@areusch (Contributor) left a comment:

@ashutosh-arm thanks for the PR. a few questions on the approach

Inline review on this snippet:

```python
tvmc_package = tvmc.compiler.compile_model(
    tvmc_model,
    target="cmsis-nn, c -runtime=c --system-lib --link-params -mcpu=cortex-m55 --executor=aot",
    output_format="mlf",
)
```
A Contributor commented:

one question: i'd like to include a list of libraries required in the MLF (separate PR). is that something you guys are okay with? cc @Mousius @leandron @mehrdadh @guberti

Inline review on this snippet:

```cpp
int32_t input_bits = 5;
// Scale beta by the input quantization scale into Q(31 - input_bits).
double beta_multiplier = (beta * quant_scale * (1 << (31 - input_bits)));
// Clamp to the largest value representable in int32.
beta_multiplier = std::min<double>(beta_multiplier, (1ll << 31) - 1.0);
// Decompose into a fixed-point multiplier and shift.
auto mult_shift_pair = tvm::relay::qnn::GetFixedPointMultiplierShift(beta_multiplier);
```
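For context on what that helper computes, a minimal standalone sketch of the usual frexp-based Q31 decomposition (the function name here is hypothetical; TVM's real implementation lives in its QNN utilities):

```python
import math

# Hypothetical sketch of the Q31 fixed-point decomposition performed by
# tvm::relay::qnn::GetFixedPointMultiplierShift: split a positive double
# into a Q31 significand and a shift, so multiplier ~= q31 * 2**(shift - 31).
def fixed_point_multiplier_shift(multiplier):
    # multiplier == significand * 2**shift, with 0.5 <= significand < 1
    significand, shift = math.frexp(multiplier)
    significand_q31 = round(significand * (1 << 31))
    if significand_q31 == (1 << 31):  # rounding overflowed; renormalize
        significand_q31 //= 2
        shift += 1
    return significand_q31, shift
```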
A Contributor commented:
@jroesch @electriclilies @tqchen what's our stance on adding this much complexity to the codegen? can you comment whether it's okay to invoke tvm::relay::qnn::GetFixedPointMultiplierShift from here?

in general, we should favor putting as much validation logic in the compiler as possible. it kind of seems like this should move upwards, though, so that the codegen is just emitting tir.call_extern with the same arguments as the CallNode. Doing that would make it possible to emit code using the llvm codegen.
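To make the suggestion concrete, a hedged sketch of emitting the CMSIS-NN softmax kernel as a plain extern call that the stock c or llvm codegen could lower (not the PR's code; the buffer variables, return dtype, and the mult/shift/diff_min values are placeholders):

```python
from tvm import tir

# Sketch of lowering to a plain extern call; arm_softmax_s8 is the real
# CMSIS-NN kernel symbol, but every argument value here is a placeholder.
in_buf = tir.Var("in_buf", "handle")
out_buf = tir.Var("out_buf", "handle")
softmax_call = tir.call_extern(
    "int32",           # return dtype
    "arm_softmax_s8",  # CMSIS-NN kernel
    in_buf,            # const int8_t *input
    1,                 # int32_t num_rows
    16,                # int32_t row_size
    1077952576,        # int32_t mult (fixed-point beta * scale)
    23,                # int32_t shift
    -248,              # int32_t diff_min
    out_buf,           # int8_t *output
)
```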

@ashutosh-arm (Contributor, Author) replied:

For the second point, we were planning to move TIR generation from the Python implementation to C++ when we add support for the next operator. I can move this call into TIR then, if others are okay with that.

A Contributor replied:

@ashutosh-arm, @Mousius, and I discussed this a bit. This is a smaller instance of a more general problem: other CMSIS-NN operators require user-defined data structures, and this is the only thing stopping them from being encoded as tir.call_extern and therefore reusing the c and llvm codegens.

@ZihengJiang was working on this, but I believe it's been de-prioritized. My preference is not to place call-specific transformation logic inside the codegen and instead to ensure the tir.CallNode contains the final parameters where possible. Might it be possible to work around this either:
a) by creating a custom TIR Node class and using it as a parameter, knowing that this parameter can only be consumed by the CMSIS-NN codegen, or
b) by creating a placeholder parameter and attaching the struct as an attribute to the CallNode?

cc @mbs-octoml

@tqchen (Member) commented:

On the data structure bits: note that most of the data structures are passed as pointers (handles), which means we likely only need intrinsics that construct those data structures (returning handles) and then pass them as arguments to the call.

We would prefer this way of reusing existing constructs (composing intrinsics) over adding semantics to TIR where possible.
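A hedged sketch of that composition (both extern symbols below are hypothetical, named only to illustrate the handle-passing pattern; the integer arguments are placeholder mult/shift/diff_min values):

```python
from tvm import tir

# Hypothetical composition of intrinsics: the first call constructs the
# kernel's parameter struct and returns a handle; the kernel call then
# consumes that handle as an ordinary argument.
in_buf = tir.Var("in_buf", "handle")
out_buf = tir.Var("out_buf", "handle")
params = tir.call_extern("handle", "cmsisnn_make_softmax_params", 1077952576, 23, -248)
kernel = tir.call_extern("int32", "arm_softmax_s8_with_params", in_buf, params, out_buf)
```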

A Contributor replied:

@tqchen the issue with this approach is that we want to stack-allocate these structs because we don't want to call out to the memory manager. So while they're passed by pointer, we still need to account for the memory usage of those structs.

We probably need to discuss this a bit more before merging calls with structs, but I think we understand enough to unblock this PR. I discussed this with @ashutosh-arm, and he'll update this PR to move this call out of the codegen. The next CMSIS-NN call will require us to fill a struct; that way, when we come to consider it, we'll start from a clean slate where the checked-in codegen doesn't need further cleanup to match the intended codegen pattern, taking that out of the discussion.

@ashutosh-arm (Contributor, Author) replied:

Since this PR has already been reviewed, I've created another one, #8951, which moves TIR code generation to the C++ side. That PR also moves the CMSIS-NN softmax parameter calculations out of the C codegen.

@areusch merged commit 1dc1707 into apache:main on Sep 7, 2021
@areusch (Contributor) commented on Sep 7, 2021:

thanks @ashutosh-arm for working through this with us! we will continue review in the next PR.

@ashutosh-arm deleted the softmax branch on December 7, 2021