Installation fails on NVIDIA Jetson Orin #20

Closed
Rooholla-KhorramBakht opened this issue Jul 18, 2024 · 3 comments · Fixed by #21
Labels
bug Something isn't working

Comments

@Rooholla-KhorramBakht

Describe the bug
Compilation fails on NVIDIA Jetson Orin.

To Reproduce
Either in Docker or locally, running:

pip install -v .

returns error:

.
.
.
FAILED: CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o
25.84   /usr/bin/aarch64-linux-gnu-g++  -pthread -D_core_ext_EXPORTS -I/usr/include/python3.8 -I/tmp/pip-req-build-g07z5nn4/build/cp38-cp38-linux_aarch64/_deps/nanobind-src/include -isystem /tmp/pip-req-build-g07z5nn4/build/cp38-cp38-linux_aarch64/_deps/nigh-src/src -isystem /tmp/pip-req-build-g07z5nn4/build/cp38-cp38-linux_aarch64/_deps/pdqsort-src -isystem /tmp/pip-req-build-g07z5nn4/src/impl -isystem /usr/include/eigen3 -mcpu=native -mtune=native -Wall -Wextra -O3 -DNDEBUG -O3 -fno-math-errno -fno-signed-zeros -fno-trapping-math -fno-rounding-math -ffp-contract=fast -flto=auto -std=c++17 -fPIC -fvisibility=hidden -fno-stack-protector -ffunction-sections -fdata-sections -MD -MT CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o -MF CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o.d -o CMakeFiles/_core_ext.dir/src/impl/vamp/bindings/settings.cc.o -c /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc
25.84   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.84                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.84                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.84   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh: In static member function ‘static constexpr vamp::SIMDVector<__vector(4) float>::VectorT vamp::SIMDVector<__vector(4) float>::lshift_dispatch(vamp::SIMDVector<__vector(4) float>::VectorT)’:
25.85   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:219:53: error: cannot convert ‘uint32x4_t’ {aka ‘__vector(4) unsigned int’} to ‘float32x4_t’ {aka ‘__vector(4) float’}
25.85     219 |             return vreinterpretq_u32_f32(vshlq_n_u32(vreinterpretq_u32_f32(v), i));
25.85         |                                          ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25.85         |                                                     |
25.85         |                                                     uint32x4_t {aka __vector(4) unsigned int}
25.85   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:12,
25.85                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.85                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.85                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.85   /usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:6004:36: note:   initializing argument 1 of ‘uint32x4_t vreinterpretq_u32_f32(float32x4_t)’
25.85    6004 | vreinterpretq_u32_f32 (float32x4_t __a)
25.85         |                        ~~~~~~~~~~~~^~~
25.85   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.86   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh: In static member function ‘static constexpr vamp::SIMDVector<__vector(4) float>::VectorT vamp::SIMDVector<__vector(4) float>::rshift_dispatch(vamp::SIMDVector<__vector(4) float>::VectorT)’:
25.86   /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:243:53: error: cannot convert ‘uint32x4_t’ {aka ‘__vector(4) unsigned int’} to ‘float32x4_t’ {aka ‘__vector(4) float’}
25.86     243 |             return vreinterpretq_u32_f32(vshrq_n_u32(vreinterpretq_u32_f32(v), i));
25.86         |                                          ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25.86         |                                                     |
25.86         |                                                     uint32x4_t {aka __vector(4) unsigned int}
25.86   In file included from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector/neon.hh:12,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/vector.hh:9,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/planning/roadmap.hh:11,
25.86                    from /tmp/pip-req-build-g07z5nn4/src/impl/vamp/bindings/settings.cc:1:
25.86   /usr/lib/gcc/aarch64-linux-gnu/9/include/arm_neon.h:6004:36: note:   initializing argument 1 of ‘uint32x4_t vreinterpretq_u32_f32(float32x4_t)’
25.86    6004 | vreinterpretq_u32_f32 (float32x4_t __a)
25.86         |                        ~~~~~~~~~~~~^~~
.
.
.    

Expected behavior
Installation to complete with no errors.

Environment:

  • OS: Ubuntu 22.04 (Jetpack 6) Jetson Orin
  • Python version: 3.10.12
  • GCC/Clang version: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

Rooholla-KhorramBakht added the bug label on Jul 18, 2024

@wbthomason
Contributor

Thanks for your report! We unfortunately don't have a Jetson to test on, so we might need some further help from you to debug. As a first step, can you share the exact specs of your Jetson (i.e. which model, the output of lscpu) so that we know what version of NEON it supports and can figure out why the reinterpret call in question is a problem?

@wbthomason
Contributor

Ah, never mind, I think I see the issue.
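
For reference, both errors in the log have the same cause, visible in the compiler output: `vreinterpretq_u32_f32` takes a `float32x4_t`, but at lines 219 and 243 of neon.hh it is applied to the `uint32x4_t` result of the shift. Converting the shifted bits back to float needs the opposite intrinsic, `vreinterpretq_f32_u32`. Below is a minimal sketch of the corrected round trip for the left shift (the right shift with `vshrq_n_u32` is analogous). The names `bits_of`, `lshift_bits` and `lshift_example` are hypothetical and not the project's code, the scalar branch exists only so the sketch also builds on non-ARM machines, and the actual change in #21 may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: raw IEEE-754 bit pattern of a float.
inline std::uint32_t bits_of(float x)
{
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// Hypothetical helper: shift the raw bits of four floats left by N.
// N is a template parameter because vshlq_n_u32 needs a compile-time constant.
template <int N>
inline std::array<float, 4> lshift_bits(const std::array<float, 4> &in)
{
    static_assert(N >= 0 && N < 32, "shift must be in [0, 31]");
    std::array<float, 4> out{};
#if defined(__ARM_NEON)
    const float32x4_t v = vld1q_f32(in.data());
    // float -> uint32 view, shift as integers ...
    const uint32x4_t shifted = vshlq_n_u32(vreinterpretq_u32_f32(v), N);
    // ... then uint32 -> float view (f32_u32, not u32_f32 as in the failing line).
    vst1q_f32(out.data(), vreinterpretq_f32_u32(shifted));
#else
    // Portable fallback with the same bit-level behaviour, lane by lane.
    for (std::size_t k = 0; k < in.size(); ++k) {
        const std::uint32_t shifted = bits_of(in[k]) << N;
        std::memcpy(&out[k], &shifted, sizeof shifted);
    }
#endif
    return out;
}

// Usage: 1.0f is 0x3F800000, so shifting its bits left by one gives 0x7F000000.
inline void lshift_example()
{
    const std::array<float, 4> ones{1.0f, 1.0f, 1.0f, 1.0f};
    const auto out = lshift_bits<1>(ones);
    assert(bits_of(out[0]) == 0x7F000000u);
}
```

The reinterpret intrinsics are named `vreinterpretq_<to>_<from>`, so the inner call (float to unsigned) and the outer call (unsigned to float) have to use different names.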

@Rooholla-KhorramBakht
Author

Thanks a lot for your quick response. The CPU info is:

Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               ARM
  Model name:            Cortex-A78AE
    Model:               1
    Thread(s) per core:  1
    Core(s) per cluster: 4
    Socket(s):           -
    Cluster(s):          3
    Stepping:            r0p1
    CPU max MHz:         2201.6001
    CPU min MHz:         115.2000
    BogoMIPS:            62.50
    Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc flagm paca pacg
Caches (sum of all):     
  L1d:                   768 KiB (12 instances)
  L1i:                   768 KiB (12 instances)
  L2:                    3 MiB (12 instances)
  L3:                    6 MiB (3 instances)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-11
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; __user pointer sanitization
  Spectre v2:            Mitigation; CSV2, but not BHB
  Srbds:                 Not affected
  Tsx async abort:       Not affected
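
On AArch64 Linux, NEON (Advanced SIMD) support is reported as the `asimd` entry in the Flags field, and the list above includes it. A small sketch of checking a flag list for a whole-word entry, so that `asimdhp` or `asimddp` is not mistaken for `asimd`; `has_cpu_flag` is a hypothetical helper and not part of the project:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: true if `flag` appears as a whole word in a
// whitespace-separated flag list such as the Flags field printed by lscpu.
inline bool has_cpu_flag(const std::string &flags, const std::string &flag)
{
    std::istringstream stream(flags);
    std::string token;
    while (stream >> token) {
        if (token == flag) {
            return true;
        }
    }
    return false;
}
```

Comparing whole tokens rather than searching for a substring keeps a flag that is a prefix of another (`asimd` versus `asimdhp`) from producing a false positive.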
