Numerical issue with chatglm2.vmfb model #15661
Comments
@jinchen62 I can't assign you because you're not in the org, but this is what we were discussing you engaging on. |
Related issue #15665 |
To reproduce this error with the binary:
|
Further debug steps:
The original 6.4G chatglm-6b-int4.mlir
After running each dispatch, the NaN issue first comes out with module_forward_dispatch_9.mlir.
To run the chatglm.vmfb for module_forward_dispatch_9.mlir, change chatglm.py line 170 with:
The output is:
In https://gist.github.com/manishghop/529225d5e7e609b679f53fc4272be05c
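For reference, a minimal sketch of loading and running the generated vmfb with the IREE Python runtime and checking the outputs for NaN, which is roughly what the modified chatglm.py is doing here; the function name, driver, and dummy input below are assumptions, not taken from the script:

    import numpy as np
    import iree.runtime as ireert

    # Load the compiled module on the CPU driver (driver name is an assumption).
    module = ireert.load_vm_flatbuffer_file("chatglm.vmfb", driver="local-task")

    # Hypothetical input: the real token ids/shape come from the tokenizer in chatglm.py.
    input_ids = np.zeros((1, 32), dtype=np.int64)

    result = module.forward(input_ids)
    outs = result if isinstance(result, (list, tuple)) else [result]
    for i, r in enumerate(outs):
        arr = np.asarray(r)
        print(i, "NaN:", np.isnan(arr).any(), "INF:", np.isinf(arr).any())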
@hanhanW Could you provide some guidance on what's going on in dispatch_9 and where to fix it in IREE? |
Can you help untangle the issue from SHARK? I think we need a simpler repro. The first step could be uploading the MLIR file somewhere and attaching a link to the issue. The next step is that you can pass the flag. The input seems to be critical in this issue, so the next step is to generate inputs for the smaller repro. You can follow the tips to get the smaller reproducer. Note that it will print many values to stderr during execution if we pass the flag. Feel free to ping me if you run into any issues. |
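A hedged sketch of the flow suggested above, assuming the elided flag is `--iree-flow-trace-dispatch-tensors` (the usual way to print every dispatch's inputs/outputs to stderr at run time); file names and the llvm-cpu backend are placeholders:

    # Compile with dispatch tracing enabled (flag choice is an assumption).
    iree-compile chatglm-6b-int4.mlir \
      --iree-hal-target-backends=llvm-cpu \
      --iree-flow-trace-dispatch-tensors \
      -o chatglm_traced.vmfb

    # Run and capture the per-dispatch tensor dumps from stderr.
    iree-run-module --module=chatglm_traced.vmfb --function=forward \
      --input=@inputs.npy 2> dispatch_trace.txt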
@hanhanW Weird, I ran the prebuilt binary successfully this morning.
iree-run-module stops here:
|
I am seeing the error at 6a60b64:
It looks like we need to regenerate the MLIR file? |
Based on the log, I think we const-eval a NaN and it becomes an input, so the issue could be in another dispatch.
Are you able to get the dispatch? I think it will show up if you pass the flag. |
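If the goal is to see the dispatch sources themselves, one option (an assumption about the elided flag) is to dump them at compile time:

    # Writes one .mlir file per dispatch executable into the given directory.
    iree-compile chatglm-6b-int4.mlir \
      --iree-hal-target-backends=llvm-cpu \
      --iree-hal-dump-executable-sources-to=dispatch_sources/ \
      -o /dev/null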
I tried to rerun chatglm.py with nothing changed. It shows the same issue we came across yesterday. Could you download and run it? It should generate the MLIR quickly, compared to the round trip of: I run it >> download to my local system >> upload to a Google bucket >> you download/upload it again to your VM.
|
I just looked at the chatglm.py code; the MLIR is directly generated and saved by torch_mlir.compile. It shouldn't change between runs. |
Here you go: chatglm_dispatch.mlir. I also give the cmd I ran:
Debug steps with this info:
|
Thank you! I can reproduce the issue starting with the dispatch:

    #map = affine_map<(d0) -> (d0)>
    func.func @main(%0: tensor<32xf16>) -> tensor<32xf16> {
      %cst = arith.constant 1.000000e+04 : f16
      %cst_0 = arith.constant 0.000000e+00 : f16
      %cst_1 = arith.constant 1.000000e+00 : f16
      %1 = tensor.empty() : tensor<32xf16>
      %2 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel"]} ins(%0 : tensor<32xf16>) outs(%1 : tensor<32xf16>) {
      ^bb0(%in: f16, %out: f16):
        %3 = math.powf %cst, %in : f16
        %4 = arith.cmpf one, %3, %cst_0 : f16
        cf.assert %4, "unimplemented: tensor with zero element"
        %5 = arith.divf %cst_1, %3 : f16
        linalg.yield %5 : f16
      } -> tensor<32xf16>
      return %2 : tensor<32xf16>
    }

Compile to vmfb:
Run the module:
Then I got the output:

    EXEC @main
    result[0]: hal.buffer_view
    32xf16=-NAN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

I am taking a look at the dispatch. |
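The compile and run commands are elided in the comment above; a hedged reconstruction of them (the splat input below is a placeholder, not the input actually used in the thread):

    # Compile the standalone dispatch for CPU (backend choice is an assumption).
    iree-compile repro.mlir --iree-hal-target-backends=llvm-cpu -o repro.vmfb

    # Run it; "32xf16=1" splats 1.0 across the 32-element input (placeholder values).
    iree-run-module --module=repro.vmfb --function=main --input="32xf16=1"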
I think there is a bug in the PolynomialApproximation pass; we have a wrong approximation for powf. I stripped the dispatch so it only has a single powf op, e.g.,

    #map = affine_map<(d0) -> (d0)>
    module {
      func.func @main(%arg0: tensor<32xf16>) -> tensor<32xf16> {
        %cst = arith.constant 1.000000e+04 : f16
        %cst_0 = arith.constant 0.000000e+00 : f16
        %cst_1 = arith.constant 1.000000e+00 : f16
        %0 = tensor.empty() : tensor<32xf16>
        %1 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel"]} ins(%arg0 : tensor<32xf16>) outs(%0 : tensor<32xf16>) {
        ^bb0(%in: f16, %out: f16):
          %2 = math.powf %cst, %in : f16
          linalg.yield %2 : f16
        } -> tensor<32xf16>
        return %1 : tensor<32xf16>
      }
    }

Running it with the input returns:
If I comment out the pass, we can get reasonable outputs:
The implementation is at https://github.com/llvm/llvm-project/blob/2a9d8caf29ca2b2cf4758db31c64fd20cb5eb3bf/mlir/lib/Dialect/Math/Transforms/ExpandPatterns.cpp#L165-L192. @bviyer @rsuderman can you help review whether the approximation is correct? |
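For comparison, a reference for what the stripped repro should compute can be sketched in numpy by evaluating 10000**x in f32 and rounding to f16. The input vector here is hypothetical (the one used above is not shown), but for exponents in [0, 1] the results stay finite, so a correct f16 lowering of math.powf should not produce NaN/INF on it:

    import numpy as np

    # Hypothetical exponents; rotary-embedding style inputs are typically in [0, 1).
    x = np.linspace(0.0, 1.0, 32, dtype=np.float32)

    # Reference: compute in f32, then round to f16. All values remain finite.
    ref = np.float16(np.float32(10000.0) ** x)
    print(ref)
    print("finite:", np.isfinite(ref).all())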
I have a workaround for the issue: #15927. We can remove the workaround after fixing the polynomial approximation issue. |
There is a bug in polynomial approximation. It generates `NAN` and `INF` for fp16 types. This is a workaround to get it functional. See #15661 for more details. Also reworks the maximumf test: the generic op is not a common input because it uses `outs` while there are no reduction loops.
Python test FAIL. Details are here: chatglm_fail_1214.txt |
I think you are running into new issues. The MLIR file was regenerated, and we cannot compile it using the IREE main branch. It crashes in: |
Downloading the model now. I'll try to repro once it's downloaded. Is there a specific |
chatglm.py should be enough; it would be better to use chatglm.py to repeat the error locally. It will download the model from Hugging Face and use torch_mlir.compile to generate and save the MLIR model as chatglm-6b-int4.mlir, then use shark_module.save_module to run iree-compile. If you look at chatglm_fail_log_1214.txt line 611, there is an equivalent iree-compile cmd you can use:
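The actual command from the log is elided here; it presumably looks roughly like the following sketch, where every flag (including the input type) is a placeholder that should be checked against chatglm_fail_log_1214.txt line 611:

    iree-compile chatglm-6b-int4.mlir \
      --iree-input-type=tm_tensor \
      --iree-hal-target-backends=llvm-cpu \
      -o chatglm-6b-int4.vmfb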
@AmosLewis I am getting this same error even when generating with chatglm.py:
Is there something I need to do other than running the script with ToM SHARK? |
@AmosLewis Can you try generating with a fresh venv on ToM shark if you haven't already? We aren't able to reproduce the error you're hitting, and I want to make sure we have the same environment and versions for everything. |
I have seen this error. You can |
I have listed the venv and iree version info in the comments of chatglm_fail_1214.txt |
@AmosLewis Thanks for pointing me to that info! I was able to reproduce and fix the issue on my side. The quantized matmul reassociation wasn't meant to support f16, but was not failing gracefully. I went ahead and added f16 support with #15964, and I was able to compile the model. Let me know if you still have any issues after picking this. |
Thanks. I will try your patch on my side. Could you also run the vmfb with this run_chatglm.py on your side? It tries to run the
With all the previous fixes (#15927 and #15964), the compile error is fixed but the NaN issue still exists.
|
Can you triage the issue the way we've done above, and attach a reproducer like #15661 (comment)? |
Here is what I got: chatglm_fail_log_dispatch9_1218_with_max_15964.txt. It still breaks at dispatch9, but got stuck here for about 40 mins at INF this time. I appended the repro steps in the comments as well. |
It looks like other dispatches generate NaN/INF first.
This should navigate you to the first place that generates NaN/INF. |
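A rough sketch of scanning a dispatch trace dump for the first NaN/INF; the trace file name and the exact line format are assumptions (the script just looks for "nan"/"inf" tokens and reports the most recent dispatch name seen):

    import re

    current_dispatch = "<unknown>"
    with open("dispatch_trace.txt") as f:          # stderr captured from iree-run-module
        for lineno, line in enumerate(f, 1):
            # Heuristic: trace lines naming a dispatch contain e.g. "forward_dispatch_9".
            m = re.search(r"\w*dispatch_\d+\w*", line)
            if m:
                current_dispatch = m.group(0)
            if re.search(r"\b(nan|inf)\b", line, re.IGNORECASE):
                print(f"first NaN/INF at line {lineno} (dispatch: {current_dispatch})")
                break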
I didn't find any dispatch outputting INF up to dispatch 8. I also tried to print the annotation here: 1218_chatglm_forward9-dispatch-tensors-annotation.mlir, then searched for jit_eval_8_dispatch_0_generic_4x4_f32xf16xf16. |
I know what's happening... This is happening in the const-eval stage, so all the inputs for these dispatches are constant data. It means that either the frontend generates invalid constants or IREE reads the weights incorrectly. There are two things on my mind:
|
If the weight is in f64 type and we can't represent it using f32 type, it could become INF.
If the original weight is invalid, the bug is in the model itself. |
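A quick numpy illustration of the first case, where an f64 weight that is out of f32 range turns into INF on conversion:

    import numpy as np

    w64 = np.float64(1.0e40)        # representable as f64
    w32 = np.float32(w64)           # out of f32 range -> inf
    print(w64, w32, np.isinf(w32))  # 1e+40 inf True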
I just elided the input:
|
I have to go. One other thing we can try is adding the flag (we should also rename the flag -- I will take a look tomorrow).
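If the flag being referred to is the one that disables constant evaluation (an assumption on my part), the experiment would be something like:

    # Skip the const-eval pipeline so the suspect dispatches run at inference time
    # instead of being folded into constants (flag choice is an assumption).
    iree-compile chatglm-6b-int4.mlir \
      --iree-hal-target-backends=llvm-cpu \
      --iree-opt-const-eval=false \
      -o chatglm_no_consteval.vmfb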
Hello, @hanhanW |
Update: we can run the model without NaN on Cascade Lake in a clean build. Perhaps it can only be reproduced on a Haswell CPU. I'm setting up an env on @AmosLewis's VM to see if I can reproduce the issue. |
I am able to produce reasonable output even on the same VM if I don't use the flag. My experiments show that it is the root cause of the NaN: it produces NaN only if I add the flag. I don't know why it is added, but can we exclude the flag for now? |
It looks like we are adding it here in SHARK: https://github.com/nod-ai/SHARK/blob/788cc9157c942a4c6f73e3a85f16b14c9ce4d4d5/shark/iree_utils/compile_utils.py#L46. @dan-garvey @monorimet can you help disable it in SHARK? |
Yeah, we don't want to be adding this flag for anything other than llama2 on CPU. It is needed for llama2 performance, but it is still experimental. |
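For reference, a hedged guess at what "excluding the flag" looks like when compiling manually, assuming the flag in question is the quantized-matmul-reassociation option mentioned earlier (the exact spelling below is an assumption and should be checked against compile_utils.py):

    # Compile without the experimental reassociation flag; everything else as before.
    iree-compile chatglm-6b-int4.mlir \
      --iree-hal-target-backends=llvm-cpu \
      -o chatglm-6b-int4.vmfb
    # (i.e. simply drop --iree-global-opt-enable-quantized-matmul-reassociation
    #  from the flag list built in shark/iree_utils/compile_utils.py)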
Using SHARK with this commit, nod-ai/SHARK-Studio#2047, should fix the NaN issue. Could you try it, @manishghop?
|
Related issue: |
What happened?
I'm able to compile the PyTorch model into MLIR and then convert the MLIR model into a vmfb file:
I used this code for compilation : https://gist.github.com/manishghop/55c741b5734b6f3fb041111a4b9be695
But while running inference I get a NaN error:
I used this code to run the inference : https://gist.github.com/manishghop/529225d5e7e609b679f53fc4272be05c
Steps to reproduce your issue
3.1. set-executionpolicy remotesigned
3.2. Run the setup_venv.ps1 from: https://github.com/nod-ai/SHARK
What component(s) does this issue relate to?
Runtime
Version information
No response
Additional context
No response