Installation fails on NVIDIA Jetson Orin #20

Rooholla-KhorramBakht · 2024-07-18T18:41:51Z

Describe the bug
Compilation fails on NVIDIA Jetson Orin.

To Reproduce
Running the following, either in Docker or locally:

pip install -v .

fails with this error:

.
.
.
FAILED: CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o
25.84   /usr/bin/aarch64-linux-gnu-g++  -pthread -D_core_ext_EXPORTS -I/usr/include/python3.8 -I/tmp/pip-req-build-g07z5nn4/build/cp38-cp38-linux_aarch64/_deps/nanobind-src/include -isystem /tmp/pip-req-build-g07z5nn4/build/cp38-cp38-linux_aarch64/_deps/nigh-src/src -isystem /tmp/pip-req-build-g07z5nn4/build/cp38-cp38-linux_aarch64/_deps/pdqsort-src -isystem /tmp/pip-req-build-g07z5nn4/src/impl -isystem /usr/include/eigen3 -mcpu=native -mtune=native -Wall -Wextra -O3 -DNDEBUG -O3 -fno-math-errno -fno-signed-zeros -fno-trapping-math -fno-rounding-math -ffp-contract=fast -flto=auto -std=c++17 -fPIC -fvisibility=hidden -fno-stack-protector -ffunction-sections -fdata-sections -MD -MT CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o -MF CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o.d -o CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o -c /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc
25.84   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.84                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.84                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.84   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh: In static member function ‘static constexpr vamp::SIMDVector<__vector(4) float>::VectorT vamp::SIMDVector<__vector(4) float>::lshift_dispatch(vamp::SIMDVector<__vector(4) float>::VectorT)’:
25.85   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:219:53: error: cannot convert ‘uint32x4_t’ {aka ‘__vector(4) unsigned int’} to ‘float32x4_t’ {aka ‘__vector(4) float’}
25.85     219 |             return vreinterpretq_u32_f32(vshlq_n_u32(vreinterpretq_u32_f32(v), i));
25.85         |                                          ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25.85         |                                                     |
25.85         |                                                     uint32x4_t {aka __vector(4) unsigned int}
25.85   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:12,
25.85                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.85                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.85                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.85   /usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:6004:36: note:   initializing argument 1 of ‘uint32x4_t vreinterpretq_u32_f32(float32x4_t)’
25.85    6004 | vreinterpretq_u32_f32 (float32x4_t __a)
25.85         |                        ~~~~~~~~~~~~^~~
25.85   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.86   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh: In static member function ‘static constexpr vamp::SIMDVector<__vector(4) float>::VectorT vamp::SIMDVector<__vector(4) float>::rshift_dispatch(vamp::SIMDVector<__vector(4) float>::VectorT)’:
25.86   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:243:53: error: cannot convert ‘uint32x4_t’ {aka ‘__vector(4) unsigned int’} to ‘float32x4_t’ {aka ‘__vector(4) float’}
25.86     243 |             return vreinterpretq_u32_f32(vshrq_n_u32(vreinterpretq_u32_f32(v), i));
25.86         |                                          ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25.86         |                                                     |
25.86         |                                                     uint32x4_t {aka __vector(4) unsigned int}
25.86   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:12,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.86   /usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:6004:36: note:   initializing argument 1 of ‘uint32x4_t vreinterpretq_u32_f32(float32x4_t)’
25.86    6004 | vreinterpretq_u32_f32 (float32x4_t __a)
25.86         |                        ~~~~~~~~~~~~^~~
.
.
.

Expected behavior
Installation to complete with no errors.

Environment:

OS: Ubuntu 22.04 (JetPack 6), Jetson Orin
Python version: 3.10.12
GCC/Clang version: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0


wbthomason · 2024-07-18T19:05:08Z

Thanks for your report! We unfortunately don't have a Jetson to test on, so we might need some further help from you to debug. As a first step, can you share the exact specs of your Jetson (i.e. which model, the output of lscpu) so that we know what version of NEON it supports and can figure out why the reinterpret call in question is a problem?

wbthomason · 2024-07-18T19:09:53Z

Ah, never mind. I think I see the issue.
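
For readers following along, the compiler output in the report points at a likely cause: lines 219 and 243 of neon.hh call vreinterpretq_u32_f32 twice, so the outer call receives the uint32x4_t produced by the shift where it expects a float32x4_t. The outer call presumably should be vreinterpretq_f32_u32, which converts the shifted bits back to float lanes. The sketch below illustrates that pattern only. It is not the patch from #21; Float4 and lshift_bits are hypothetical names, and the scalar fallback is an addition so the example also builds on machines without NEON.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical container for four float lanes (not a VAMP type).
using Float4 = std::array<float, 4>;

// Hypothetical helper: shift the raw 32-bit pattern of each float lane left by N.
template <int N>
Float4 lshift_bits(const Float4 &in)
{
    static_assert(N >= 0 && N < 32, "left shift immediate must be in [0, 31]");
    Float4 out{};
#if defined(__ARM_NEON)
    const float32x4_t v = vld1q_f32(in.data());
    // float lanes -> raw unsigned bits
    const uint32x4_t bits = vreinterpretq_u32_f32(v);
    const uint32x4_t shifted = vshlq_n_u32(bits, N);
    // raw unsigned bits -> float lanes. Calling vreinterpretq_u32_f32 here
    // instead reproduces the "cannot convert uint32x4_t to float32x4_t" error.
    vst1q_f32(out.data(), vreinterpretq_f32_u32(shifted));
#else
    // Scalar fallback with the same bit-level behavior, one lane at a time.
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        std::uint32_t bits;
        std::memcpy(&bits, &in[i], sizeof(bits));
        bits <<= N;
        std::memcpy(&out[i], &bits, sizeof(bits));
    }
#endif
    return out;
}
```

A right shift would follow the same shape with vshrq_n_u32, whose immediate must lie in the range 1 to 32.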

Rooholla-KhorramBakht · 2024-07-18T19:12:02Z

Thanks a lot for your quick response. The CPU info is:

Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               ARM
  Model name:            Cortex-A78AE
    Model:               1
    Thread(s) per core:  1
    Core(s) per cluster: 4
    Socket(s):           -
    Cluster(s):          3
    Stepping:            r0p1
    CPU max MHz:         2201.6001
    CPU min MHz:         115.2000
    BogoMIPS:            62.50
    Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc flagm paca pacg
Caches (sum of all):     
  L1d:                   768 KiB (12 instances)
  L1i:                   768 KiB (12 instances)
  L2:                    3 MiB (12 instances)
  L3:                    6 MiB (3 instances)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-11
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; __user pointer sanitization
  Spectre v2:            Mitigation; CSV2, but not BHB
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Rooholla-KhorramBakht added the "bug" (Something isn't working) label Jul 18, 2024

wbthomason mentioned this issue Jul 18, 2024

Fix typo bug with vreinterpretq + CI improvements #21

Merged

Rooholla-KhorramBakht mentioned this issue Jul 19, 2024

fix vshrq_n_u32 type in neon.hh #22

Closed

zkingston closed this as completed in #21 Jul 19, 2024
