bitvector_next() returns incorrect results on Fedora armv7hf with system LLVM 3.7.0 #13752

Closed
nalimilan opened this issue Oct 24, 2015 · 17 comments
Labels
system:arm ARMv7 and AArch64

Comments

@nalimilan
Member

I initially observed this in #10602. When building Julia with LLVM 3.7.0 and USE_SYSTEM_LLVM=1, I get a failure in inference which I traced back to bugs in IntSet. The error happens in first(s::IntSet), due to next(IntSet([1]), 0)[1] returning apparently random values like 4690168797640785920 == 0x4116d3ac00000000, all with the 4 lower bytes equal to zero.

This comes from these buggy results:

ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64), [0x00000002], 0, 10)
0x0000000000000000

ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64), [0x00000002], 1, 10)
0x0000000100000000

ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64), [0x00000002], 2, 10)
0x0000000200000000

ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64), [0x00000003], 0, 10)
0x0000000000000000

What I don't understand is how/where the conversion of these values to Int64 gives the very large numbers that next returns.
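A small illustration, not part of the original report: splitting the values quoted above into 32-bit halves makes the "lower 4 bytes equal to zero" pattern explicit. The helper name `split_words` is hypothetical; the numbers are copied from this comment. A zero low word in every case is consistent with (though not proof of) a 64-bit value being assembled from the wrong pair of 32-bit words.

```python
# Values copied from the issue text; the helper below is purely illustrative.
observed = {
    "next(IntSet([1]), 0)[1]": 4690168797640785920,
    "bitvector_next([0x2], 0, 10)": 0x0000000000000000,
    "bitvector_next([0x2], 1, 10)": 0x0000000100000000,
    "bitvector_next([0x2], 2, 10)": 0x0000000200000000,
}

def split_words(value):
    """Return the (high, low) 32-bit halves of a 64-bit unsigned value."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF

for label, value in observed.items():
    high, low = split_words(value)
    print(f"{label}: high=0x{high:08x} low=0x{low:08x}")
```

Every entry prints a low word of `0x00000000`, and the first one has the high word `0x4116d3ac` quoted in the description.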

The problem does not happen with USE_SYSTEM_LLVM=0 LLVM_VER=3.7.0.

Could it have something to do with an incorrect ccall ABI, as indicated by the "ccall is defaulting to llvm ABI, since no platform ABI has been defined for this CPU/OS combination" warning?

@nalimilan nalimilan added the system:arm ARMv7 and AArch64 label Oct 24, 2015
@ViralBShah
Member

@nalimilan I can give you access to a scaleway arm machine if it helps to have a second machine to help work through this. If so, please send me your public key and I will set it up.

@nalimilan
Member Author

If I put the same line twice in first(s::IntSet) in base/intset.jl:

ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64), [0x00000002], 0, 10)
ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64), [0x00000002], 0, 10)

I get a segfault:

Starting program: /home/fedora/nalimilan/julia2/usr/bin/julia -C cortex-a8 --output-ji /home/fedora/nalimilan/julia2/usr/lib/julia/inference0.ji -f coreimg.jl
[...]
inference.jl

Program received signal SIGSEGV, Segmentation fault.
bitvector_next (b=0x9540d4f0, n0=<optimized out>, n=13761737689319604224)
    at bitvector.c:112
112         w = b[i]>>nb;
(gdb) ba
#0  bitvector_next (b=0x9540d4f0, n0=<optimized out>, n=13761737689319604224)
    at bitvector.c:112
#1  0x94de52ac in julia_first_1100 ()
#2  0xb625ee20 in jl_apply (nargs=1, args=0xbefb8710, f=<optimized out>)
    at /home/fedora/nalimilan/julia2/src/julia.h:1328
#3  jl_apply_unspecialized (meth=<optimized out>, meth=<optimized out>, 
    nargs=1, args=0xbefb8710) at /home/fedora/nalimilan/julia2/src/gf.c:32
#4  jl_apply_generic (F=<optimized out>, args=0xbefb8710, nargs=1)
    at /home/fedora/nalimilan/julia2/src/gf.c:1683
#5  0x94f07368 in julia_typeinf_uncached_778 ()
#6  0xb625ee20 in jl_apply (nargs=7, args=0xbefb8848, f=<optimized out>)
    at /home/fedora/nalimilan/julia2/src/julia.h:1328
#7  jl_apply_unspecialized (meth=<optimized out>, meth=<optimized out>, 
    nargs=7, args=0xbefb8848) at /home/fedora/nalimilan/julia2/src/gf.c:32
#8  jl_apply_generic (F=<optimized out>, args=0xbefb8848, nargs=7)
    at /home/fedora/nalimilan/julia2/src/gf.c:1683
#9  0x94f15908 in julia_typeinf_744 ()
#10 0xb625ee20 in jl_apply (nargs=6, args=0xbefb88d8, f=<optimized out>)
    at /home/fedora/nalimilan/julia2/src/julia.h:1328
#11 jl_apply_unspecialized (meth=<optimized out>, meth=<optimized out>, 
    nargs=6, args=0xbefb88d8) at /home/fedora/nalimilan/julia2/src/gf.c:32
#12 jl_apply_generic (F=<optimized out>, args=0xbefb88d8, nargs=6)

Adding only one of the two ccalls doesn't crash (as in the original issue description).
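As a rough aid for reading the backtrace (an illustration added for readers, with hypothetical variable names): the `n` that gdb reports in frame #0 can be decoded the same way. Its low word is zero and its high word falls close to the `args=0xbefb8710` pointer shown in frame #2, which suggests, without proving, that a stack word ended up in the upper half of `n`.

```python
# n as printed by gdb in frame #0, and the args pointer from frame #2.
n_seen = 13761737689319604224
args_ptr = 0xBEFB8710

n_high = n_seen >> 32        # upper 32 bits of the bogus n
n_low = n_seen & 0xFFFFFFFF  # lower 32 bits of the bogus n

print(f"high=0x{n_high:08x} low=0x{n_low:08x}")
print(f"distance from args pointer: {args_ptr - n_high} bytes")
```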

@nalimilan
Member Author

@ViralBShah Thanks, but for now I'm not sure that would really help me. I've found more than my share of things to debug on a single machine! :-) I'm rather in need of guidance.

@ViralBShah
Member

FWIW, I don't see these segfaults on scaleway running both the ccalls.

@nalimilan
Member Author

Note I only see them when setting USE_SYSTEM_LLVM=1 on Fedora. Not sure whether it's CPU-specific.

@vtjnash
Sponsor Member

vtjnash commented Oct 24, 2015

calling convention looks fine for this call (ccall tests will segfault due to the warning however):

~/julia/src/support$ clang --target=arm-pc-linux-elf -mfloat-abi=hard -emit-llvm -x c -S -o - -
typedef int uint32_t; typedef long long uint64_t;
uint64_t bitvector_next(uint32_t *b, uint64_t n0, uint64_t n)
{ return 0; }
^D
define arm_aapcs_vfpcc i64 @bitvector_next(i32* %b, i64 %n0, i64 %n) #0 {

@nalimilan
Member Author

@vtjnash I get the same result here:

$ clang --target=arm-pc-linux-elf -mfloat-abi=hard -emit-llvm -x c -S -o - -
[...]
define arm_aapcs_vfpcc i64 @bitvector_next(i32* %b, i64 %n0, i64 %n) #0 {

$ clang --target=armv7hl-redhat-linux-gnu -mfloat-abi=hard -emit-llvm -x c -S -o - -
[...]
define arm_aapcs_vfpcc i64 @bitvector_next(i32* %b, i64 %n0, i64 %n) #0 {

What do you mean by "ccall tests will segfault due to the warning however"? Due to the "ccall is defaulting to llvm ABI" warning? Does it explain why the ccalls fail here?

I've tried building the in-tree LLVM with the same options as the system package, but I'm unable to reproduce the same failure. What can be so specific about the system LLVM, given that it's built the same way on the same hardware?

For the record, the build command I used is:
make MARCH=armv7-a JULIA_CPU_TARGET=cortex-a8 LLVM_VER=3.7.0 USE_SYSTEM_LIBM=1 LIBBLAS=-lblas LIBBLASNAME=libblas.so.3 LIBLAPACK=-llapack LIBLAPACKNAME=liblapack.so.3 USE_SYSTEM_LIBUNWIND=1 USE_SYSTEM_READLINE=1 USE_SYSTEM_PCRE=0 USE_SYSTEM_OPENSPECFUN=1 USE_SYSTEM_BLAS=1 USE_SYSTEM_LAPACK=1 USE_SYSTEM_FFTW=1 USE_SYSTEM_GMP=1 USE_SYSTEM_MPFR=1 USE_SYSTEM_ARPACK=1 USE_SYSTEM_SUITESPARSE=1 USE_SYSTEM_ZLIB=1 USE_SYSTEM_GRISU=1 USE_SYSTEM_DSFMT=1 USE_SYSTEM_LIBUV=0 USE_SYSTEM_RMATH=0 USE_SYSTEM_UTF8PROC=0 USE_SYSTEM_LIBGIT2=1 USE_SYSTEM_PATCHELF=1 VERBOSE=1 USE_BLAS64=0 LLVM_FLAGS="--prefix=/home/fedora/nalimilan/julia4/usr libdir=/home/fedora/nalimilan/julia4/usr/lib --with-extra-ld-options=-Wl,-Bsymbolic --disable-polly --disable-libcpp --enable-cxx11 --enable-clang-arcmt --enable-clang-static-analyzer --enable-clang-rewriter --enable-optimized --disable-profiling --disable-assertions --disable-werror --disable-expensive-checks --enable-debug-runtime --enable-keep-symbols --enable-jit --enable-docs --enable-doxygen --disable-doxygen --enable-threads --enable-pthreads --enable-zlib --enable-pic --enable-shared --disable-embed-stdcxx --enable-timestamps --enable-backtraces --enable-targets=x86,powerpc,arm,aarch64,cpp,nvptx,systemz,r600 --enable-bindings=ocaml --enable-bindings=none --enable-libffi --enable-ltdl-install --with-cpu=cortex-a8 --with-tune=cortex-a8 --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-abi=aapcs-vfp" OPTIMIZE_OPTION="-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches"

@yuyichao
Contributor

yuyichao commented Jan 3, 2016

I don't see this issue with ArchLinux system LLVM 3.7 either. Could you attach the disassembly of the ccall (@code_native of a function wrapping it) and of the function bitvector_next so that we can check ABI compatibility directly?

@nalimilan
Member Author

@yuyichao Sorry for the delay. Unfortunately, when I run this code from intset.jl:

function azerty()
    ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64), [0x00000002], 0, 10)
end

res=_dump_function(azerty, (),  true, false, false, false)
ccall(:jl_, Void, (Any,), res)

I get

WARNING: Unable to find function pointer
""

@code_native is not available at this stage of bootstrap, but the call above should be equivalent AFAIK.

This is with (system) LLVM 3.7.1 plus Julia patches.

@yuyichao
Contributor

.... I'm not sure if _dump_function works during sysimg generation .....

Can you make the following change and run it with gdb?

diff --git a/base/intset.jl b/base/intset.jl
index 5b610ea..ec38735 100644
--- a/base/intset.jl
+++ b/base/intset.jl
@@ -1,5 +1,12 @@
 # This file is a part of Julia. License is MIT: http://julialang.org/license

+function azerty()
+    ccall(:bitvector_next, UInt64, (Ptr{UInt32}, UInt64, UInt64),
+          reinterpret(Ptr{Int32}, 0x12345678),
+          0x2221111111, 0x4444441111)
+end
+azerty()
+
 abstract AbstractSet{T}

 type IntSet <: AbstractSet{Int}

gdb --args ../julia -Ccortex-a9 -f --output-ji /dev/null coreimg.jl
(gdb) start
(gdb) br bitvector_next
(gdb) c
# <when you hit the breakpoint>
(gdb) disassemble $pc
# <paste output>
(gdb) up
(gdb) info registers
# <paste output>
(gdb) x/32b $sp - 16
# <paste output>
(gdb) x/40i $pc - (25 * 4)
# <paste output; decrease the 25 if you get a memory not accessible error>

@yuyichao
Contributor

(Replace -Ccortex-a9 with the arch you use...)

@nalimilan
Member Author

Thanks for the instructions. Here's what I get:

Breakpoint 2, bitvector_next (b=0x12345678, n0=4919075457707540514, 
    n=1311768464867721284) at bitvector.c:103
103 {
(gdb) disassemble $pc
Dump of assembler code for function bitvector_next:
=> 0xb6eb4994 <+0>: push    {r4, r5, r6, r7, r8, r9, lr}
   0xb6eb4998 <+4>: ldrd    r4, [sp, #28]
   0xb6eb499c <+8>: cmp r3, r5
   0xb6eb49a0 <+12>:    cmpeq   r2, r4
   0xb6eb49a4 <+16>:    bcs 0xb6eb4b20 <bitvector_next+396>
   0xb6eb49a8 <+20>:    adds    r8, r4, #31
   0xb6eb49ac <+24>:    mov r7, r3
   0xb6eb49b0 <+28>:    adc r9, r5, #0
   0xb6eb49b4 <+32>:    lsr r3, r2, #5
   0xb6eb49b8 <+36>:    lsr r1, r8, #5
   0xb6eb49bc <+40>:    orr r3, r3, r7, lsl #27
   0xb6eb49c0 <+44>:    orr r1, r1, r9, lsl #27
   0xb6eb49c4 <+48>:    mov r6, r2
   0xb6eb49c8 <+52>:    sub r1, r1, #1
   0xb6eb49cc <+56>:    and r2, r2, #31
   0xb6eb49d0 <+60>:    cmp r3, r1
   0xb6eb49d4 <+64>:    bcs 0xb6eb4a90 <bitvector_next+252>
   0xb6eb49d8 <+68>:    ldr r12, [r0, r3, lsl #2]
   0xb6eb49dc <+72>:    lsr r12, r12, r2
   0xb6eb49e0 <+76>:    cmp r12, #0
   0xb6eb49e4 <+80>:    bne 0xb6eb4abc <bitvector_next+296>
   0xb6eb49e8 <+84>:    cmp r3, r1
   0xb6eb49ec <+88>:    bne 0xb6eb4a00 <bitvector_next+108>
   0xb6eb49f0 <+92>:    b   0xb6eb4b20 <bitvector_next+396>
   0xb6eb49f4 <+96>:    ldr r2, [r0, r3, lsl #2]
   0xb6eb49f8 <+100>:   cmp r2, #0
   0xb6eb49fc <+104>:   bne 0xb6eb4b2c <bitvector_next+408>
   0xb6eb4a00 <+108>:   add r3, r3, #1
   0xb6eb4a04 <+112>:   cmp r1, r3
   0xb6eb4a08 <+116>:   bhi 0xb6eb49f4 <bitvector_next+96>
   0xb6eb4a0c <+120>:   ldr r3, [r0, r3, lsl #2]
   0xb6eb4a10 <+124>:   and r2, r4, #31
   0xb6eb4a14 <+128>:   cmp r3, #0
   0xb6eb4a18 <+132>:   beq 0xb6eb4b10 <bitvector_next+380>
   0xb6eb4a1c <+136>:   uxth    r1, r3
   0xb6eb4a20 <+140>:   cmp r1, #0
   0xb6eb4a24 <+144>:   lsreq   r3, r3, #16
   0xb6eb4a28 <+148>:   moveq   r0, #17
   0xb6eb4a2c <+152>:   movne   r0, #1
   0xb6eb4a30 <+156>:   moveq   r1, #25
   0xb6eb4a34 <+160>:   movne   r1, #9
   0xb6eb4a38 <+164>:   tst r3, #255    ; 0xff
   0xb6eb4a3c <+168>:   lsreq   r3, r3, #8
   0xb6eb4a40 <+172>:   moveq   r0, r1
   0xb6eb4a44 <+176>:   tst r3, #15
   0xb6eb4a48 <+180>:   lsreq   r3, r3, #4
   0xb6eb4a4c <+184>:   addeq   r0, r0, #4
   0xb6eb4a50 <+188>:   tst r3, #3
   0xb6eb4a54 <+192>:   lsreq   r3, r3, #2
   0xb6eb4a58 <+196>:   addeq   r0, r0, #2
   0xb6eb4a5c <+200>:   and r3, r3, #1
   0xb6eb4a60 <+204>:   cmp r2, #0
   0xb6eb4a64 <+208>:   rsb r3, r3, r0
   0xb6eb4a68 <+212>:   moveq   r1, #0
   0xb6eb4a6c <+216>:   moveq   r0, r3
   0xb6eb4a70 <+220>:   beq 0xb6eb4b88 <bitvector_next+500>
   0xb6eb4a74 <+224>:   cmp r2, r3
   0xb6eb4a78 <+228>:   bls 0xb6eb4b20 <bitvector_next+396>
   0xb6eb4a7c <+232>:   subs    r4, r4, r2
   0xb6eb4a80 <+236>:   sbc r5, r5, #0
   0xb6eb4a84 <+240>:   adds    r0, r4, r3
   0xb6eb4a88 <+244>:   adc r1, r5, #0
   0xb6eb4a8c <+248>:   pop {r4, r5, r6, r7, r8, r9, pc}
   0xb6eb4a90 <+252>:   and r8, r4, #31
   0xb6eb4a94 <+256>:   mov r9, #0
   0xb6eb4a98 <+260>:   orrs    r12, r8, r9
   0xb6eb4a9c <+264>:   beq 0xb6eb49d8 <bitvector_next+68>
   0xb6eb4aa0 <+268>:   ldr lr, [r0, r3, lsl #2]
   0xb6eb4aa4 <+272>:   and r12, r4, #31
   0xb6eb4aa8 <+276>:   mvn r8, #0
   0xb6eb4aac <+280>:   bic r12, lr, r8, lsl r12
   0xb6eb4ab0 <+284>:   lsr r12, r12, r2
   0xb6eb4ab4 <+288>:   cmp r12, #0
   0xb6eb4ab8 <+292>:   beq 0xb6eb49e8 <bitvector_next+84>
   0xb6eb4abc <+296>:   uxth    r3, r12
   0xb6eb4ac0 <+300>:   cmp r3, #0
   0xb6eb4ac4 <+304>:   lsreq   r12, r12, #16
   0xb6eb4ac8 <+308>:   moveq   r3, #17
   0xb6eb4acc <+312>:   movne   r3, #1
   0xb6eb4ad0 <+316>:   moveq   r2, #25
   0xb6eb4ad4 <+320>:   movne   r2, #9
   0xb6eb4ad8 <+324>:   tst r12, #255   ; 0xff
   0xb6eb4adc <+328>:   lsreq   r12, r12, #8
   0xb6eb4ae0 <+332>:   moveq   r3, r2
   0xb6eb4ae4 <+336>:   tst r12, #15
   0xb6eb4ae8 <+340>:   lsreq   r12, r12, #4
   0xb6eb4aec <+344>:   addeq   r3, r3, #4
   0xb6eb4af0 <+348>:   tst r12, #3
   0xb6eb4af4 <+352>:   lsreq   r12, r12, #2
   0xb6eb4af8 <+356>:   addeq   r3, r3, #2
   0xb6eb4afc <+360>:   and r12, r12, #1
   0xb6eb4b00 <+364>:   rsb r12, r12, r3
   0xb6eb4b04 <+368>:   adds    r0, r6, r12
   0xb6eb4b08 <+372>:   adc r1, r7, #0
   0xb6eb4b0c <+376>:   pop {r4, r5, r6, r7, r8, r9, pc}
   0xb6eb4b10 <+380>:   cmp r2, #0
   0xb6eb4b14 <+384>:   moveq   r0, #32
   0xb6eb4b18 <+388>:   moveq   r1, #0
   0xb6eb4b1c <+392>:   beq 0xb6eb4b88 <bitvector_next+500>
   0xb6eb4b20 <+396>:   mov r0, r4
   0xb6eb4b24 <+400>:   mov r1, r5
   0xb6eb4b28 <+404>:   pop {r4, r5, r6, r7, r8, r9, pc}
   0xb6eb4b2c <+408>:   uxth    r1, r2
   0xb6eb4b30 <+412>:   lsl r4, r3, #5
   0xb6eb4b34 <+416>:   cmp r1, #0
   0xb6eb4b38 <+420>:   lsr r5, r3, #27
   0xb6eb4b3c <+424>:   lsreq   r2, r2, #16
   0xb6eb4b40 <+428>:   moveq   r1, #17
   0xb6eb4b44 <+432>:   movne   r1, #1
   0xb6eb4b48 <+436>:   moveq   r0, #25
   0xb6eb4b4c <+440>:   movne   r0, #9
   0xb6eb4b50 <+444>:   tst r2, #255    ; 0xff
   0xb6eb4b54 <+448>:   lsreq   r2, r2, #8
   0xb6eb4b58 <+452>:   moveq   r1, r0
   0xb6eb4b5c <+456>:   tst r2, #15
   0xb6eb4b60 <+460>:   lsreq   r2, r2, #4
   0xb6eb4b64 <+464>:   addeq   r1, r1, #4
   0xb6eb4b68 <+468>:   tst r2, #3
   0xb6eb4b6c <+472>:   lsreq   r2, r2, #2
   0xb6eb4b70 <+476>:   addeq   r1, r1, #2
   0xb6eb4b74 <+480>:   and r2, r2, #1
   0xb6eb4b78 <+484>:   rsb r2, r2, r1
   0xb6eb4b7c <+488>:   adds    r0, r4, r2
   0xb6eb4b80 <+492>:   adc r1, r5, #0
   0xb6eb4b84 <+496>:   pop {r4, r5, r6, r7, r8, r9, pc}
   0xb6eb4b88 <+500>:   subs    r4, r4, #32
   0xb6eb4b8c <+504>:   sbc r5, r5, #0
   0xb6eb4b90 <+508>:   adds    r0, r0, r4
   0xb6eb4b94 <+512>:   adc r1, r1, r5
   0xb6eb4b98 <+516>:   pop {r4, r5, r6, r7, r8, r9, pc}
End of assembler dump.
(gdb) up
#1  0xb3fc0228 in ?? ()
(gdb) info register
r0             0x12345678   305419896
r1             0x21111111   554766609
r2             0x22 34
r3             0x44441111   1145311505
r4             0xbeffd830   3204438064
r5             0xb6eb4994   3068873108
r6             0x44441111   1145311505
r7             0x22 34
r8             0x21111111   554766609
r9             0xbeffd850   3204438096
r10            0x2ad3d0 2806736
r11            0xbeffd8a8   3204438184
r12            0xef4    3828
sp             0xbeffd828   0xbeffd828
lr             0xb3fc0228   -1275330008
pc             0xb3fc0228   0xb3fc0228
cpsr           0x60070010   1611071504
(gdb) x/32b $sp - 16
0xbeffd818: 80  -40 -1  -66 -48 -45 42  0
0xbeffd820: -88 -40 -1  -66 -28 1   -4  -77
0xbeffd828: 68  0   0   0   120 86  52  18
0xbeffd830: 42  0   0   0   -68 -40 -1  -66
(gdb) x/40i $pc - (25 * 4)
   0xb3fc01c4:  mov r0, r9
   0xb3fc01c8:  mov r1, #3
   0xb3fc01cc:  ldr r8, [r3]
   0xb3fc01d0:  ldr r7, [r3, #4]
   0xb3fc01d4:  str r10, [sp, #40]  ; 0x28
   0xb3fc01d8:  str r2, [sp, #44]   ; 0x2c
   0xb3fc01dc:  str r6, [sp, #48]   ; 0x30
   0xb3fc01e0:  bl  0xb3fc0294
   0xb3fc01e4:  mov r3, r0
   0xb3fc01e8:  ldr r2, [r5]
   0xb3fc01ec:  str r3, [sp, #36]   ; 0x24
   0xb3fc01f0:  ldr r0, [r3, #-4]
   0xb3fc01f4:  bfc r0, #0, #4
   0xb3fc01f8:  cmp r0, r2
   0xb3fc01fc:  bne 0xb3fc027c
   0xb3fc0200:  ldr r6, [r3]
   0xb3fc0204:  movw    r5, #18836  ; 0x4994
   0xb3fc0208:  ldr r1, [r3, #4]
   0xb3fc020c:  movt    r5, #46827  ; 0xb6eb
   0xb3fc0210:  ldr r0, [sp, #4]
   0xb3fc0214:  mov r2, r7
   0xb3fc0218:  str r1, [sp]
   0xb3fc021c:  mov r1, r8
   0xb3fc0220:  mov r3, r6
   0xb3fc0224:  blx r5
=> 0xb3fc0228:  bl  0xb3fc029c
   0xb3fc022c:  movw    r2, #61696  ; 0xf100
   0xb3fc0230:  ldr r1, [sp, #12]
   0xb3fc0234:  movt    r2, #46839  ; 0xb6f7
   0xb3fc0238:  str r1, [r2]
   0xb3fc023c:  sub sp, r11, #28
   0xb3fc0240:  pop {r4, r5, r6, r7, r8, r9, r10, r11, pc}
   0xb3fc0244:  movw    r0, #4120   ; 0x1018
   0xb3fc0248:  movw    r1, #61504  ; 0xf040
   0xb3fc024c:  movw    r2, #7072   ; 0x1ba0
   0xb3fc0250:  movt    r0, #46076  ; 0xb3fc
   0xb3fc0254:  movt    r1, #46006  ; 0xb3b6
   0xb3fc0258:  movt    r2, #108    ; 0x6c
   0xb3fc025c:  mov lr, pc
   0xb3fc0260:  b   0xb3fc02a4

@yuyichao
Contributor

Hmm, so there IS a problem with the ABI, the arguments should be passed in

  • b: r0
  • n0: r2, r3 (since the argument is 8-byte aligned)
  • n: [sp, sp + 4]

but are rather passed in

  • b: r0
  • n0: r1, r2
  • n: r3, [sp]

So it seems that the alignment of int64_t is not respected. Not sure whose fault it is yet....
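The two layouts above can be sketched with a toy model. This is a simplified illustration and not how LLVM or the AAPCS is actually specified: it only tracks 32-bit integer slots (r0 to r3, then stack words) and ignores floating-point registers and every other rule. The names `place_args` and `callee_view` are hypothetical. The argument values come from the azerty() test call, and the stale word at [sp + 4] is read off the `x/32b` dump.

```python
MASK32 = 0xFFFFFFFF

def place_args(args, align64):
    """Lay out (value, bit_width) arguments into 32-bit slots.

    Slots 0-3 stand for r0-r3; slots 4, 5, ... stand for the stack words
    at [sp], [sp + 4], ... as seen on entry to the callee.
    """
    slots = []
    for value, bits in args:
        if bits == 64:
            if align64 and len(slots) % 2 == 1:
                slots.append(None)  # skipped slot so the pair starts on an even index
            slots.extend([value & MASK32, (value >> 32) & MASK32])
        else:
            slots.append(value & MASK32)
    return slots

def callee_view(slots):
    """Read (b, n0, n) the way the disassembled bitvector_next does:
    b from r0, n0 from r2:r3, n from [sp]:[sp + 4]."""
    b = slots[0]
    n0 = slots[2] | (slots[3] << 32)
    n = slots[4] | (slots[5] << 32)
    return b, n0, n

# Arguments from the azerty() test call in the thread.
call = [(0x12345678, 32), (0x2221111111, 64), (0x4444441111, 64)]

aligned = place_args(call, align64=True)   # layout the callee expects
packed = place_args(call, align64=False)   # layout the caller actually used
# The packed layout ends at [sp]; [sp + 4] holds whatever was there already.
# The x/32b dump shows 0x12345678 at that address.
packed.append(0x12345678)

print([hex(v) for v in callee_view(aligned)])
print([hex(v) for v in callee_view(packed)])
```

With the aligned layout the callee recovers the original arguments. With the packed layout it sees n0 = 4919075457707540514 and n = 1311768464867721284, the same values gdb printed at the breakpoint.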

Is cortex-a8 the arch you use? I can't check it now but I can have a look later.

Also c.c. @maleadt

@nalimilan
Member Author

I used cortex-a8 because that's what Fedora uses, but I also tried cortex-a9 in the past and it didn't make a difference. Should it matter? The cpuinfo is the same as before (#10602 (comment)).

@yuyichao
Contributor

yuyichao commented Oct 7, 2016

Just a random guess: the mismatch in calling convention could be because LLVM is somehow configured to use the legacy ABI instead of EABI.

@nalimilan
Member Author

The build flags used by Fedora are listed here: http://pkgs.fedoraproject.org/cgit/rpms/llvm3.7.git/tree/llvm3.7.spec#n194 This includes --with-abi=aapcs-vfp. gcc -dumpmachine says armv7hl-redhat-linux-gnueabi. Does that sound OK?

Though the recent packages have moved to CMake and no longer pass these flags, so I could try again.

@KristofferC
Sponsor Member

Please reopen if this still happens.
