
autotvm:Cannot find tuning records for: target=c -keys=cpu -model=esp32 (AIV-653) #139

Closed
ShawnHymel opened this issue Sep 29, 2023 · 7 comments

@ShawnHymel

I am following the guide here: https://docs.espressif.com/projects/esp-dl/en/latest/esp32s3/tutorials/deploying-models-through-tvm.html. I trained a basic 2-layer DNN on MNIST using torch and exported the model in ONNX format. I saved a representative dataset as train_set.npy and a single sample as sample.npy.
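For reference, the export step looked roughly like this (a minimal sketch only; the layer sizes are inferred from the shapes in the log below, and the data handling is a placeholder):

# Sketch of the training-side export (layer sizes inferred from the log below; details are assumptions)
import numpy as np
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# x_train stands in for the flattened MNIST images as an (N, 784) float32 array
x_train = np.random.rand(1000, 784).astype(np.float32)  # placeholder for the real training data
np.save("train_set.npy", x_train)                        # representative dataset for calibration
np.save("sample.npy", x_train[:1])                       # single (1, 784) sample for export_onnx_model.py

torch.onnx.export(model, torch.from_numpy(x_train[:1]), "model.onnx", opset_version=13)

I optimized and quantized the model with the following: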

python3.7 -m onnxruntime.quantization.preprocess --input model.onnx --output model_opt.onnx
python3.7 ./esp-dl/tools/tvm/esp_quantize_onnx.py --input_model model_opt.onnx --output_model model_quant.onnx --calibrate_dataset train_set.npy

I then try to convert the quantized model to ESP-DL format with the following:

python3.7 esp-dl/tools/tvm/export_onnx_model.py --target_chip esp32s3 --model_path model_quant.onnx --img_path sample.npy --template_path ./esp-dl/tools/tvm/template_project_for_model --out_path esp32-inference-project

When I enable logging in the export_onnx_model.py script, I get the following:

Model Information:
------------------
Input Name: onnx::Gemm_0
Input Shape: (1, 784)
Input DType: float
Output Name: 7
Output Shape: (1, 10)
Output DType: float
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for expand_dims based on highest priority (10)
INFO:te_compiler:Using injective.cpu for expand_dims based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for subtract based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for subtract based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
WARNING:autotvm:One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 784), 'int16'), ('TENSOR', (256, 784), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.dense based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 256), 'int16'), ('TENSOR', (10, 256), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.dense based on highest priority (10)
INFO:te_compiler:Using layout_transform.generic for layout_transform based on highest priority (10)
INFO:te_compiler:Using layout_transform.generic for layout_transform based on highest priority (10)
INFO:te_compiler:Using injective.cpu for divide based on highest priority (10)
INFO:te_compiler:Using injective.cpu for round based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 784), 'int16'), ('TENSOR', (32, 784, 8), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.contrib_dense_pack based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for zeros based on highest priority (10)
INFO:te_compiler:Using injective.cpu for greater_equal based on highest priority (10)
INFO:te_compiler:Using injective.cpu for where based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for right_shift based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for nn.relu based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 256), 'int16'), ('TENSOR', (2, 256, 5), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.contrib_dense_pack based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for zeros based on highest priority (10)
INFO:te_compiler:Using injective.cpu for greater_equal based on highest priority (10)
INFO:te_compiler:Using injective.cpu for where based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for right_shift based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
esp_dl_library_path: /content/esp-dl
generated project in: ./esp32-inference-project/new_project

It looks like the model was converted. However, autotvm complains about not being able to find tuning records:

WARNING:autotvm:One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 784), 'int16'), ('TENSOR', (256, 784), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.

This does not appear to be the intended behavior of the script. Any suggestions on how to fix this issue?

The github-actions bot changed the title from "autotvm:Cannot find tuning records for: target=c -keys=cpu -model=esp32" to "autotvm:Cannot find tuning records for: target=c -keys=cpu -model=esp32 (AIV-653)" on Sep 29, 2023
@ShawnHymel
Author

If it helps, here is my Colab script that trains a model in torch, converts it to ONNX, and attempts to convert it to ESP-DL with a template project: https://colab.research.google.com/gist/ShawnHymel/82d1a11278da45831f0d943d44ea2cc1/pytorch-mnist-onnx-quantization.ipynb

@ShawnHymel
Author

Not sure if it's related, but if I change the __main__ part in export_onnx_model.py to the following:

if __name__ == '__main__':
    import argparse
    
    import logging
    logging.basicConfig(level=logging.DEBUG)

    parser = argparse.ArgumentParser(description='Test Assistant')
    parser.add_argument('--target_chip', help='esp32s3')
    parser.add_argument('--model_path', help='path of onnx model')
    parser.add_argument('--img_path', help='path of npy file which stores an input image of this model')
    parser.add_argument('--template_path', help='path of template project')
    parser.add_argument('--out_path', help='path of generated project')
    args = parser.parse_args()
    
    dump_tvm_onnx_project(target_chip=args.target_chip, model_path=args.model_path, img_path=args.img_path, template_path=args.template_path, generated_project_path=args.out_path)
    debug_onnx_model(args.target_chip, args.model_path, args.img_path)

I get the following output:

Model Information:
------------------
Input Name: onnx::Gemm_0
Input Shape: (1, 784)
Input DType: float
Output Name: 7
Output Shape: (1, 10)
Output DType: float
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for expand_dims based on highest priority (10)
INFO:te_compiler:Using injective.cpu for expand_dims based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for subtract based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for subtract based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
WARNING:autotvm:One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 784), 'int16'), ('TENSOR', (256, 784), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.dense based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 256), 'int16'), ('TENSOR', (10, 256), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.dense based on highest priority (10)
INFO:te_compiler:Using layout_transform.generic for layout_transform based on highest priority (10)
INFO:te_compiler:Using layout_transform.generic for layout_transform based on highest priority (10)
INFO:te_compiler:Using injective.cpu for divide based on highest priority (10)
INFO:te_compiler:Using injective.cpu for round based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 784), 'int16'), ('TENSOR', (32, 784, 8), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.contrib_dense_pack based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for zeros based on highest priority (10)
INFO:te_compiler:Using injective.cpu for greater_equal based on highest priority (10)
INFO:te_compiler:Using injective.cpu for where based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for right_shift based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for nn.relu based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=c -keys=cpu -model=esp32
    key=('dense_pack.x86', ('TENSOR', (1, 256), 'int16'), ('TENSOR', (2, 256, 5), 'int16'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.contrib_dense_pack based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for zeros based on highest priority (10)
INFO:te_compiler:Using injective.cpu for greater_equal based on highest priority (10)
INFO:te_compiler:Using injective.cpu for where based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for right_shift based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
esp_dl_library_path: /content/esp-dl
generated project in: /content/esp32-inference-project/new_project
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for expand_dims based on highest priority (10)
INFO:te_compiler:Using injective.cpu for expand_dims based on highest priority (10)
INFO:autotvm:Download pre-tuned parameters package from https://raw.githubusercontent.com/tlc-pack/tophub/main/tophub/llvm_v0.04.log
INFO:download:Downloading from url https://raw.githubusercontent.com/tlc-pack/tophub/main/tophub/llvm_v0.04.log to /root/.tvm/tophub/llvm_v0.04.log
...100%, 0.02 MB, 175 KB/s, 0 seconds passedDEBUG:download:
DEBUG:autotvm:Finish loading 35 records
INFO:te_compiler:Using injective.cpu for divide based on highest priority (10)
INFO:te_compiler:Using injective.cpu for round based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=llvm -keys=cpu 
    key=('dense_pack.x86', ('TENSOR', (1, 784), 'int8'), ('TENSOR', (256, 784), 'int8'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.dense based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for zeros based on highest priority (10)
INFO:te_compiler:Using injective.cpu for greater_equal based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for where based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for right_shift based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for nn.relu based on highest priority (10)
DEBUG:autotvm:Cannot find tuning records for:
    target=llvm -keys=cpu 
    key=('dense_pack.x86', ('TENSOR', (1, 256), 'int8'), ('TENSOR', (10, 256), 'int8'), None, 'int32')
TVM will apply a default schedule which may negatively impact performance.
INFO:te_compiler:Using dense_pack.x86 for nn.dense based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
INFO:te_compiler:Using injective.cpu for zeros based on highest priority (10)
INFO:te_compiler:Using injective.cpu for greater_equal based on highest priority (10)
INFO:te_compiler:Using injective.cpu for full based on highest priority (10)
INFO:te_compiler:Using injective.cpu for where based on highest priority (10)
INFO:te_compiler:Using injective.cpu for add based on highest priority (10)
INFO:te_compiler:Using injective.cpu for right_shift based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for clip based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for cast based on highest priority (10)
INFO:te_compiler:Using injective.cpu for multiply based on highest priority (10)
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #0 tvmgen_default_fused_divide:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.50451 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #1 tvmgen_default_fused_round:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 7.58051 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #2 tvmgen_default_fused_cast:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.51303 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #3 tvmgen_default_fused_add:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.43396 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #4 tvmgen_default_fused_clip:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.31126 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #5 tvmgen_default_fused_cast_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.26805 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #6 tvmgen_default_fused_nn_dense:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 208.774 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #7 tvmgen_default_fused_add_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.35677 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #8 tvmgen_default_fused_cast_2:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.40485 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #9 tvmgen_default_fused_multiply:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.43153 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #10 tvmgen_default_fused_zeros:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.34377 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #11 tvmgen_default_fused_greater_equal:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.47746 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #12 tvmgen_default_fused_full:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 7.02311 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #13 tvmgen_default_fused_full_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.36255 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #14 tvmgen_default_fused_where:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.4464 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #15 tvmgen_default_fused_add_2:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.38288 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #16 tvmgen_default_fused_right_shift:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 6.8721 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #17 tvmgen_default_fused_cast_3:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.36248 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #18 tvmgen_default_fused_clip_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 6.75536 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #19 tvmgen_default_fused_cast_4:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.36654 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #20 tvmgen_default_fused_nn_relu:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.35928 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #21 tvmgen_default_fused_nn_dense_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 20.3317 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #22 tvmgen_default_fused_add_3:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.5146 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #23 tvmgen_default_fused_cast_5:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.28206 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #24 tvmgen_default_fused_multiply_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.27193 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #25 tvmgen_default_fused_zeros_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 8.462 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #26 tvmgen_default_fused_greater_equal_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 6.74615 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #27 tvmgen_default_fused_full_1_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.86906 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #28 tvmgen_default_fused_full_1_2:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.41259 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #29 tvmgen_default_fused_where_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.30519 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #30 tvmgen_default_fused_add_4:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.25219 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #31 tvmgen_default_fused_right_shift_1:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.27692 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #32 tvmgen_default_fused_cast_6:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 7.30225 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #33 tvmgen_default_fused_clip_2:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 6.70616 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #34 tvmgen_default_fused_cast_7:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.19553 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #35 tvmgen_default_fused_cast_8:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.38285 us/iter
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:65: Op #36 tvmgen_default_fused_multiply_2:
[23:22:47] /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:68: Iteration: 0: 5.35946 us/iter
INFO:root:running node=0, output_ind=0, with node_name: onnx::Gemm_0
INFO:root:running node=1, output_ind=0, with node_name: p0
INFO:root:running node=2, output_ind=0, with node_name: tvmgen_default_fused_divide
INFO:root:running node=3, output_ind=0, with node_name: tvmgen_default_fused_round
INFO:root:running node=4, output_ind=0, with node_name: p1
INFO:root:running node=5, output_ind=0, with node_name: tvmgen_default_fused_cast
INFO:root:running node=6, output_ind=0, with node_name: tvmgen_default_fused_add
INFO:root:running node=7, output_ind=0, with node_name: tvmgen_default_fused_clip
INFO:root:running node=8, output_ind=0, with node_name: tvmgen_default_fused_cast_1
INFO:root:running node=9, output_ind=0, with node_name: p2
INFO:root:running node=10, output_ind=0, with node_name: tvmgen_default_fused_nn_dense
INFO:root:running node=11, output_ind=0, with node_name: p3
INFO:root:running node=12, output_ind=0, with node_name: tvmgen_default_fused_add_1
INFO:root:running node=13, output_ind=0, with node_name: tvmgen_default_fused_cast_2
INFO:root:running node=14, output_ind=0, with node_name: p4
INFO:root:running node=15, output_ind=0, with node_name: tvmgen_default_fused_multiply
INFO:root:running node=16, output_ind=0, with node_name: tvmgen_default_fused_zeros
INFO:root:running node=17, output_ind=0, with node_name: tvmgen_default_fused_greater_equal
INFO:root:running node=18, output_ind=0, with node_name: p5
INFO:root:running node=19, output_ind=0, with node_name: tvmgen_default_fused_full
INFO:root:running node=20, output_ind=0, with node_name: p6
INFO:root:running node=21, output_ind=0, with node_name: tvmgen_default_fused_full_1
INFO:root:running node=22, output_ind=0, with node_name: tvmgen_default_fused_where
INFO:root:running node=23, output_ind=0, with node_name: tvmgen_default_fused_add_2
INFO:root:running node=24, output_ind=0, with node_name: p7
INFO:root:running node=25, output_ind=0, with node_name: tvmgen_default_fused_right_shift
INFO:root:running node=26, output_ind=0, with node_name: tvmgen_default_fused_cast_3
INFO:root:running node=27, output_ind=0, with node_name: tvmgen_default_fused_clip_1
INFO:root:running node=28, output_ind=0, with node_name: tvmgen_default_fused_cast_4
INFO:root:running node=29, output_ind=0, with node_name: tvmgen_default_fused_nn_relu
INFO:root:running node=30, output_ind=0, with node_name: p8
INFO:root:running node=31, output_ind=0, with node_name: tvmgen_default_fused_nn_dense_1
INFO:root:running node=32, output_ind=0, with node_name: p9
INFO:root:running node=33, output_ind=0, with node_name: tvmgen_default_fused_add_3
INFO:root:running node=34, output_ind=0, with node_name: tvmgen_default_fused_cast_5
INFO:root:running node=35, output_ind=0, with node_name: p10
INFO:root:running node=36, output_ind=0, with node_name: tvmgen_default_fused_multiply_1
INFO:root:running node=37, output_ind=0, with node_name: tvmgen_default_fused_zeros_1
INFO:root:running node=38, output_ind=0, with node_name: tvmgen_default_fused_greater_equal_1
INFO:root:running node=39, output_ind=0, with node_name: p11
INFO:root:running node=40, output_ind=0, with node_name: tvmgen_default_fused_full_1_1
INFO:root:running node=41, output_ind=0, with node_name: p12
INFO:root:running node=42, output_ind=0, with node_name: tvmgen_default_fused_full_1_2
INFO:root:running node=43, output_ind=0, with node_name: tvmgen_default_fused_where_1
INFO:root:running node=44, output_ind=0, with node_name: tvmgen_default_fused_add_4
INFO:root:running node=45, output_ind=0, with node_name: p13
INFO:root:running node=46, output_ind=0, with node_name: tvmgen_default_fused_right_shift_1
INFO:root:running node=47, output_ind=0, with node_name: tvmgen_default_fused_cast_6
INFO:root:running node=48, output_ind=0, with node_name: tvmgen_default_fused_clip_2
INFO:root:running node=49, output_ind=0, with node_name: tvmgen_default_fused_cast_7
INFO:root:running node=50, output_ind=0, with node_name: tvmgen_default_fused_cast_8
INFO:root:running node=51, output_ind=0, with node_name: p14
INFO:root:running node=52, output_ind=0, with node_name: tvmgen_default_fused_multiply_2
Node Name                             Ops                                   Time(us)  Time(%)  Shape     Inputs  Outputs  Measurements(us)  
---------                             ---                                   --------  -------  -----     ------  -------  ----------------  
tvmgen_default_fused_nn_dense         tvmgen_default_fused_nn_dense         208.774   48.327   (1, 256)  2       1        [208.774]         
tvmgen_default_fused_nn_dense_1       tvmgen_default_fused_nn_dense_1       20.332    4.706    (1, 10)   2       1        [20.332]          
tvmgen_default_fused_zeros_1          tvmgen_default_fused_zeros_1          8.462     1.959    (1, 10)   0       1        [8.462]           
tvmgen_default_fused_round            tvmgen_default_fused_round            7.581     1.755    (1, 784)  1       1        [7.581]           
tvmgen_default_fused_cast_6           tvmgen_default_fused_cast_6           7.302     1.69     (1, 10)   1       1        [7.302]           
tvmgen_default_fused_full             tvmgen_default_fused_full             7.023     1.626    (1, 256)  1       1        [7.023]           
tvmgen_default_fused_right_shift      tvmgen_default_fused_right_shift      6.872     1.591    (1, 256)  2       1        [6.872]           
tvmgen_default_fused_clip_1           tvmgen_default_fused_clip_1           6.755     1.564    (1, 256)  1       1        [6.755]           
tvmgen_default_fused_greater_equal_1  tvmgen_default_fused_greater_equal_1  6.746     1.562    (1, 10)   2       1        [6.746]           
tvmgen_default_fused_clip_2           tvmgen_default_fused_clip_2           6.706     1.552    (1, 10)   1       1        [6.706]           
tvmgen_default_fused_full_1_1         tvmgen_default_fused_full_1           5.869     1.359    (1, 10)   1       1        [5.869]           
tvmgen_default_fused_add_3            tvmgen_default_fused_add_3            5.515     1.277    (1, 10)   2       1        [5.515]           
tvmgen_default_fused_cast             tvmgen_default_fused_cast             5.513     1.276    ()        1       1        [5.513]           
tvmgen_default_fused_divide           tvmgen_default_fused_divide           5.505     1.274    (1, 784)  2       1        [5.505]           
tvmgen_default_fused_greater_equal    tvmgen_default_fused_greater_equal    5.477     1.268    (1, 256)  2       1        [5.477]           
tvmgen_default_fused_where            tvmgen_default_fused_where            5.446     1.261    (1, 256)  3       1        [5.446]           
tvmgen_default_fused_add              tvmgen_default_fused_add              5.434     1.258    (1, 784)  2       1        [5.434]           
tvmgen_default_fused_multiply         tvmgen_default_fused_multiply         5.432     1.257    (1, 256)  2       1        [5.432]           
tvmgen_default_fused_full_1_2         tvmgen_default_fused_full_1           5.413     1.253    (1, 10)   1       1        [5.413]           
tvmgen_default_fused_cast_2           tvmgen_default_fused_cast_2           5.405     1.251    (1, 256)  1       1        [5.405]           
tvmgen_default_fused_add_2            tvmgen_default_fused_add_2            5.383     1.246    (1, 256)  2       1        [5.383]           
tvmgen_default_fused_cast_8           tvmgen_default_fused_cast_8           5.383     1.246    (1, 10)   1       1        [5.383]           
tvmgen_default_fused_cast_4           tvmgen_default_fused_cast_4           5.367     1.242    (1, 256)  1       1        [5.367]           
tvmgen_default_fused_full_1           tvmgen_default_fused_full             5.363     1.241    (1, 256)  1       1        [5.363]           
tvmgen_default_fused_cast_3           tvmgen_default_fused_cast_3           5.362     1.241    (1, 256)  1       1        [5.362]           
tvmgen_default_fused_nn_relu          tvmgen_default_fused_nn_relu          5.359     1.241    (1, 256)  1       1        [5.359]           
tvmgen_default_fused_multiply_2       tvmgen_default_fused_multiply_2       5.359     1.241    (1, 10)   2       1        [5.359]           
tvmgen_default_fused_add_1            tvmgen_default_fused_add_1            5.357     1.24     (1, 256)  2       1        [5.357]           
tvmgen_default_fused_zeros            tvmgen_default_fused_zeros            5.344     1.237    (1, 256)  0       1        [5.344]           
tvmgen_default_fused_clip             tvmgen_default_fused_clip             5.311     1.229    (1, 784)  1       1        [5.311]           
tvmgen_default_fused_where_1          tvmgen_default_fused_where_1          5.305     1.228    (1, 10)   3       1        [5.305]           
tvmgen_default_fused_cast_5           tvmgen_default_fused_cast_5           5.282     1.223    (1, 10)   1       1        [5.282]           
tvmgen_default_fused_right_shift_1    tvmgen_default_fused_right_shift_1    5.277     1.222    (1, 10)   2       1        [5.277]           
tvmgen_default_fused_multiply_1       tvmgen_default_fused_multiply_1       5.272     1.22     (1, 10)   2       1        [5.272]           
tvmgen_default_fused_cast_1           tvmgen_default_fused_cast_1           5.268     1.219    (1, 784)  1       1        [5.268]           
tvmgen_default_fused_add_4            tvmgen_default_fused_add_4            5.252     1.216    (1, 10)   2       1        [5.252]           
tvmgen_default_fused_cast_7           tvmgen_default_fused_cast_7           5.196     1.203    (1, 10)   1       1        [5.196]           
Total_time                            -                                     432.001   -        -         -       -        -                 
[[ -6.5  -6.5   0.5   8.5 -19.5  12.5 -12.   -3.   -6.   -3.5]]
Traceback (most recent call last):
  File "esp-dl/tools/tvm/export_onnx_model.py", line 241, in <module>
    debug_onnx_model(args.target_chip, args.model_path, args.img_path)
  File "esp-dl/tools/tvm/export_onnx_model.py", line 221, in debug_onnx_model
    m.debug_get_output("tvmgen_default_fused_nn_relu", tvm_out)
  File "/content/esp-dl/tools/tvm/python/tvm/contrib/debugger/debug_executor.py", line 276, in debug_get_output
    self._debug_get_output(node_index, out)
  File "/content/esp-dl/tools/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 238, in __call__
    raise get_last_ffi_error()
tvm.error.InternalError: Traceback (most recent call last):
  35: 0xffffffffffffffff
  34: _start
  33: __libc_start_main
  32: 0x00007978897a4d8f
  31: _Py_UnixMain
  30: 0x000000000057b93a
  29: PyRun_SimpleFileExFlags
  28: PyRun_FileExFlags
  27: 0x00000000005a4726
  26: PyEval_EvalCode
  25: _PyEval_EvalCodeWithName
  24: _PyEval_EvalFrameDefault
  23: 0x00000000004de313
  22: _PyFunction_FastCallKeywords
  21: _PyEval_EvalFrameDefault
  20: 0x00000000004de313
  19: _PyFunction_FastCallKeywords
  18: _PyEval_EvalCodeWithName
  17: _PyEval_EvalFrameDefault
  16: 0x00000000004de44c
  15: _PyObject_FastCallKeywords
  14: 0x00000000005abdaa
  13: _PyObject_Call_Prepend
  12: _PyFunction_FastCallDict
  11: _PyEval_EvalCodeWithName
  10: _PyEval_EvalFrameDefault
  9: 0x00000000004de44c
  8: _PyObject_FastCallKeywords
  7: 0x00007978893430d6
  6: 0x0000797889339eae
  5: 0x0000797888977492
  4: 0x000079788897ae2d
  3: operator()
        at /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:203
  2: tvm::runtime::GraphExecutorDebug::DebugGetNodeOutput(int, DLTensor*)
        at /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/graph_executor/debug/graph_executor_debug.cc:325
  1: tvm::runtime::NDArray::CopyTo(DLTensor*) const
        at /home/gansichen/Workspace/projects/local/framework/tvm/include/tvm/runtime/ndarray.h:392
  0: tvm::runtime::NDArray::CopyFromTo(DLTensor const*, DLTensor*, void*)
        at /home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/ndarray.cc:293
  File "/home/gansichen/Workspace/projects/local/framework/tvm/src/runtime/ndarray.cc", line 293
InternalError: Check failed: from_size == to_size (256 vs. 16384) : TVMArrayCopyFromTo: The size must exactly match

It looks like it's failing the debug step, and I've been unable to track down exactly where.

@noorhaq

noorhaq commented Oct 6, 2023

Not sure why it was failing for you, but here's my notebook, which works: https://colab.research.google.com/gist/noorhaq/926bbbacecf0473f598cdd386668f763/pytorch-mnist-onnx-quantization.ipynb#scrollTo=q5NfoEMMqx0R
If you need any help using the generated project, let me know and I will share it with you.

@ShawnHymel
Author

Hi @noorhaq,

Thanks for checking it out! The notebook does indeed run, but the model fails the debugging check step. You have to uncomment the following line in esp-dl/tools/tvm/export_onnx_model.py to see the failure:

debug_onnx_model(args.target_chip, args.model_path, args.img_path)

@Auroragan

Hi,
'generated project in: ./esp32-inference-project/new_project' shows that your conversion succeeded.
autotvm is not used in this script and is not yet supported for ESP boards. I think 'Cannot find tuning records' is just internal debug info indicating that the default schedule will be used rather than one found by autotvm.

@ShawnHymel
Author

Hi @Auroragan,

Good to know, thank you! I'll close out the issue.

@MichelBerg

I am currently working with ESP-DL TVM and ran into the same issue after uncommenting the debug function.

The “error” message

InternalError: Check failed: from_size == to_size (256 vs. 16384) : TVMArrayCopyFromTo: The size must exactly match

comes from the debug function itself.

It looks like the debug example is based on the following model from the ESP-DL tutorial: https://github.com/espressif/esp-dl/blob/master/tutorial/tvm_example/model.onnx
I was curious about it, so I looked up the model architecture using Netron:

[Netron screenshot of the example model: debug_example]

That is why you got that error with the debug output, @ShawnHymel: the layer name and tvm_out shape in the debug function match that example model, not yours. If you use the example model, you won't get the error. You can also read the output of any layer of your own model; you just have to adjust the shape of tvm_out as well as the name of the layer.
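Reading a different layer from your own model would look roughly like this (a minimal sketch; here m is the debug executor created inside debug_onnx_model, and the (1, 256) int8 shape/dtype of the fused relu output is an assumption based on the log above):

# Replace the hard-coded call inside debug_onnx_model with something like this (sketch, assumptions as noted above)
import tvm

# Allocate the output buffer with the shape/dtype of the layer you actually want to read;
# otherwise CopyFromTo fails with "The size must exactly match"
tvm_out = tvm.nd.empty((1, 256), dtype="int8")
m.debug_get_output("tvmgen_default_fused_nn_relu", tvm_out)
print(tvm_out.numpy())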

I suggest removing the hard-coded debug call or moving it into a separate example; in my opinion it only leads to confusion.
https://github.com/espressif/esp-dl/blob/19d377eb47230bc101fc159086b0cabbdf6f17b0/tools/tvm/export_onnx_model.py#L219C4-L222C43
