unit-conversions SIGSEGV on armv7hl #303

Closed
dkopecek opened this issue Sep 4, 2016 · 16 comments
Labels
kind: bug · platform: arm (related to ARM architecture) · solution: wontfix (the issue will not be fixed; either it is impossible or deemed out of scope)

Comments

@dkopecek
Contributor

dkopecek commented Sep 4, 2016

Hello,
when trying to build and run the test suite on armv7hl architecture, I get the following results:

make[1]: Leaving directory '/builddir/build/BUILD/json-2.0.3/test'
test/json_unit "*"
json_unit is a Catch v1.5.6 host application.

Run with -? for options

value conversion
  get an object (explicit)

src/unit-conversions.cpp:41
src/unit-conversions.cpp:43: FAILED:
due to a fatal error condition:
  SIGSEGV - Segmentation violation signal

test cases:   14 |   13 passed | 1 failed
assertions: 4004 | 4003 passed | 1 failed
Makefile:28: recipe for target 'check' failed
RPM build errors:
make: *** [check] Error 245
@nlohmann
Owner

nlohmann commented Sep 4, 2016

Thanks for reporting. Unfortunately, I do not have access to an armv7hl machine, so I cannot debug the problem myself. Can you please give me details on the compiler you are using?

@dkopecek
Contributor Author

dkopecek commented Sep 4, 2016

The compiler is gcc-c++ 6.2.1. Here is the full build.log: https://kojipkgs.fedoraproject.org//work/tasks/7056/15497056/build.log

@dkopecek
Contributor Author

I've managed to run the conversions unit test under valgrind on the armv7hl machine. This is the output:

+ valgrind ./test/json_unit '*conversion*'
==25420== Memcheck, a memory error detector
==25420== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==25420== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==25420== Command: ./test/json_unit *conversion*
==25420== 
==25420== Invalid read of size 4
==25420==    at 0x4975034: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /usr/lib/libstdc++.so.6.0.22)
==25420==    by 0x92BEF: std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long long, unsigned long long, double, std::allocator> >::pair(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long long, unsigned long long, double, std::allocator> > const&) (in /builddir/build/BUILD/json-2.0.5/test/json_unit)
==25420==  Address 0x6 is not stack'd, malloc'd or (recently) free'd
==25420== 

@nlohmann
Owner

Meanwhile, the build log at https://kojipkgs.fedoraproject.org//work/tasks/7056/15497056/build.log is no longer available. Do you have a copy?

@nlohmann
Owner

nlohmann commented Oct 10, 2016

It may not be 100% comparable, but the tests (current develop version) succeed on a Raspberry Pi:

Valgrind reports:

==27237== Memcheck, a memory error detector
==27237== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==27237== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==27237== Command: test/json_unit
==27237== 
disInstr(arm): unhandled instruction: 0xF1010200
                 cond=15(0xF) 27:20=16(0x10) 4:4=0 3:0=0(0x0)
==27237== valgrind: Unrecognised instruction at address 0x4842588.
==27237==    at 0x4842588: ??? (in /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so)
==27237== Your program just tried to execute an instruction that Valgrind
==27237== did not recognise.  There are two possible reasons for this.
==27237== 1. Your program has a bug and erroneously jumped to a non-code
==27237==    location.  If you are running Memcheck and you just saw a
==27237==    warning about a bad jump, it's probably your program's fault.
==27237== 2. The instruction is legitimate but Valgrind doesn't handle it,
==27237==    i.e. it's Valgrind's fault.  If you think this is the case or
==27237==    you are not sure, please let us know and we'll try to fix it.
==27237== Either way, Valgrind will now raise a SIGILL signal which will
==27237== probably kill your program.
==27237== 
==27237== Process terminating with default action of signal 4 (SIGILL)
==27237==  Illegal opcode at address 0x4842588
==27237==    at 0x4842588: ??? (in /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so)
==27237== 
==27237== HEAP SUMMARY:
==27237==     in use at exit: 194 bytes in 7 blocks
==27237==   total heap usage: 7 allocs, 0 frees, 194 bytes allocated
==27237== 
==27237== 50 bytes in 3 blocks are possibly lost in loss record 4 of 5
==27237==    at 0x4833F2C: operator new(unsigned int) (vg_replace_malloc.c:282)
==27237==    by 0x49089E7: std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) (in /usr/lib/arm-linux-gnueabihf/libstdc++.so.6.0.20)
==27237== 
==27237== LEAK SUMMARY:
==27237==    definitely lost: 0 bytes in 0 blocks
==27237==    indirectly lost: 0 bytes in 0 blocks
==27237==      possibly lost: 50 bytes in 3 blocks
==27237==    still reachable: 144 bytes in 4 blocks
==27237==         suppressed: 0 bytes in 0 blocks
==27237== Reachable blocks (those to which a pointer was found) are not shown.
==27237== To see them, rerun with: --leak-check=full --show-reachable=yes
==27237== 
==27237== For counts of detected and suppressed errors, rerun with: -v
==27237== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Illegal instruction

That said, I cannot reproduce the error on a Raspberry Pi. I still lack access to an armv7hl system.

@nlohmann
Owner

Thanks @navybk for helping with debugging!

@dkopecek
Contributor Author

@nlohmann Here's the build log as an attachment.

json-2.0.5-armv7-build.txt

@nlohmann
Owner

nlohmann commented Nov 2, 2016

I fear I have to mark this issue as "won't fix"...

@dkopecek
Contributor Author

dkopecek commented Nov 3, 2016

@nlohmann OK, I'll have to exclude armv7 builds in Fedora then.

@nlohmann
Owner

nlohmann commented Nov 3, 2016

Thanks for the quick response. Sorry I could not fix this, but I lack the hardware to debug it.

nlohmann added the "solution: wontfix" label Nov 3, 2016
nlohmann closed this as completed Nov 3, 2016
@mwittgen

I was running into the same problem on armv7l. Using 32-bit data types instead of the default basic_json<> template arguments fixed the problem. There seems to be a problem with memory alignment of structures on this platform.

@nlohmann
Owner

@mwittgen Thanks for the hint! Can you provide the concrete instantiation of the templates so I can update the README?

@mwittgen

mwittgen commented Feb 15, 2017

basic_json<std::map, std::vector, std::string, bool, std::int32_t, std::uint32_t, float>

I have only run our application code, and switching to 32-bit values fixed the crash there. A detailed investigation would be needed to confirm what is actually going wrong. I might get to it later.

@nlohmann
Copy link
Owner

Thanks a lot! This is really strange...

@mwittgen

I'll see if I can provide a simple test case with QEMU.

@mwittgen

Forgot to mention: this was gcc 5.2 on a ZYNQ (armv7l) running Arch Linux. I built both natively and cross-compiled with crosstool-ng. Both executables crashed on invalid memory pointers, in different places. Switching off optimization made the crash go away.

nlohmann added the "platform: arm" label Oct 24, 2017