Segfault in binary created by zig cc related to pthread_cond_wait (musl x86_64) #7095

Closed
mil opened this issue Nov 13, 2020 · 7 comments
Labels
bug Observed behavior contradicts documented or intended behavior os-linux
@mil
Contributor

mil commented Nov 13, 2020

I'm trying to use libsoundio from Zig and am running into a crash when calling libsoundio's soundio_flush_events function (with the ALSA backend). The test code I'm using is sio_sine.c from libsoundio's examples:
https://github.com/andrewrk/libsoundio/blob/master/example/sio_sine.c

For example, when compiling with gcc, things work normally:

gcc -o sio_sine sio_sine.c -lsoundio
./sio_sine

However, when compiling with zig cc, I get a segmentation fault:

zig cc -lc -lsoundio sio_sine.c
./sio_sine
  zsh: segmentation fault  ./sio_sine

The call that triggers the crash is soundio_flush_events; under the hood, the failing line appears to involve pthread_cond_wait: https://github.com/andrewrk/libsoundio/blob/master/src/os.c#L548

I'm on Alpine Linux / musl and have successfully used the same code with glibc, so I think this is a musl-specific issue. Zig code equivalent to the C example behaves the same way: it works when built against glibc, but not when built against musl.

@mil mil changed the title Segfault with libsoundio sio_sine.c binary produced with zig cc while gcc binary works (musl x86_64) Segfault related pthread_cond_wait via zig cc (musl x86_64) Nov 13, 2020
@mil mil changed the title Segfault related pthread_cond_wait via zig cc (musl x86_64) Segfault in binary created by zig cc related to pthread_cond_wait (musl x86_64) Nov 13, 2020
@LemonBoy
Contributor

The most common problem with musl is that the default thread stack size is too small.
I don't have a musl toolchain to try compiling it myself. If you have a statically linked binary (with debug info), upload it somewhere and I'll have a look.

@mil
Contributor Author

mil commented Nov 14, 2020

I've been struggling to produce a static binary; the problem is that ALSA and libsoundio are dynamically linked by default on Alpine.

One more piece of debugging info: a binary produced with clang also works fine:

clang -lsoundio sio_sine.c -o sio_sine

So really I think this comes down to some flag that's being passed to clang with zig cc.

Also, based on your suggestion about stack size, @LemonBoy, I tried going into the libsoundio code and raising the stack size via pthread_attr_setstacksize, but it made no difference: the segfault still happened. I need to figure out a better way to debug..
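
For readers following along, the stack-size experiment described above can be sketched roughly as follows. This is a minimal illustration, not libsoundio's actual code: worker and run_with_stack_size are hypothetical names, and the buffer size is arbitrary. It only shows how pthread_attr_setstacksize is typically applied before pthread_create.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker; stands in for a library's internal thread entry point. */
static void *worker(void *arg)
{
    /* Touch both ends of a moderately large stack buffer. */
    volatile char buf[64 * 1024];
    buf[0] = 1;
    buf[sizeof buf - 1] = 1;
    *(int *)arg = buf[0] + buf[sizeof buf - 1];
    return NULL;
}

/* Run worker on a thread with an explicit stack size.
 * Returns 0 on success, otherwise a pthread error code. */
int run_with_stack_size(size_t stack_bytes, int *result)
{
    pthread_attr_t attr;
    pthread_t tid;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Request the stack size before the thread is created. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, result);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

Calling run_with_stack_size(1 << 20, &result) requests a 1 MiB stack. If a crash persists with a generous stack, as it did here, the stack size is probably not the cause.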

@LemonBoy
Contributor

I need to figure out a better way to debug..

Try running the program under gdb (gdb ./sio_sine) and get a backtrace with bt; that should shed some light on why it's crashing.

@mil
Contributor Author

mil commented Nov 14, 2020

Thanks for the suggestion. I've installed gdb and debug symbols, and I now think this has something to do with the way zig is interacting with ALSA's shared library. I would have thought it was a bug in ALSA, but the fact that things work with gcc/clang using the same code and the same shared library convinces me that this is a zig bug.

GDB trace for sio_sine.c from libsoundio:

~/libsoundio/example> zig cc sio_sine.c -lsoundio -g -O0
~/libsoundio/example> gdb ./sio_sine
GNU gdb (GDB) 9.2
Reading symbols from ./sio_sine...
(gdb) run
Starting program: /home/m/libsoundio/example/sio_sine
[New LWP 13644]
Backend: ALSA

Thread 2 "sio_sine" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 13644]
0x00007ffff7eafe6d in snd_lib_error_set_local (func=func@entry=0x7ffff7eb165f <zero_handler>) at error.c:80
80      error.c: No such file or directory.
(gdb) bt
#0  0x00007ffff7eafe6d in snd_lib_error_set_local (func=func@entry=0x7ffff7eb165f <zero_handler>) at error.c:80
#1  0x00007ffff7eb1b45 in try_config (config=config@entry=0x23af40, list=list@entry=0x7ffff7e70ca8, base=<optimized out>,
    name=<optimized out>) at namehint.c:243
#2  0x00007ffff7eb2908 in add_software_devices (list=0x7ffff7e70ca8, rw_config=0x23af40, config=<optimized out>)
    at namehint.c:522
#3  snd_device_name_hint (card=<optimized out>, iface=<optimized out>, hints=0x7ffff7e70da8) at namehint.c:604
#4  0x00007ffff7f5cdbb in ?? () from /usr/lib/libsoundio.so.2
#5  0x00007ffff7f5dac5 in ?? () from /usr/lib/libsoundio.so.2
#6  0x00007ffff7f59061 in ?? () from /usr/lib/libsoundio.so.2
#7  0x000000000020b5a7 in start (p=0x7ffff7e71ee8) at /usr/lib/zig/libc/musl/src/thread/pthread_create.c:192
#8  0x000000000020cf6b in __clone () at /usr/lib/zig/libc/musl/src/thread/x86_64/clone.s:22
#9  0x0000000000000000 in ?? ()
(gdb)

I have also discovered a segfault with ALSA's PCM test program (https://www.alsa-project.org/alsa-doc/alsa-lib/_2test_2pcm_8c-example.html). Same results: it works with gcc and clang, but not with zig cc. Here's the backtrace:

~/foo> zig cc pcm.c -lasound -g -O0
~/foo> gdb ./pcm
GNU gdb (GDB) 9.2
Reading symbols from ./pcm...
(gdb) run
Starting program: /home/m/foo/pcm
Playback device is plughw:0,0
Stream parameters are 44100Hz, S16_LE, 1 channels
Sine wave rate is 440.0000Hz
Using transfer method: write

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7ec2c76 in snd_lib_error_default (file=0x7ffff7f25016 "conf.c", line=3683,
    function=0x7ffff7f25ad0 <__func__.10270> "snd_config_hooks_call", err=0,
    fmt=0x7ffff7f2567b "Cannot open shared library %s (%s)") at error.c:102
102     error.c: No such file or directory.
(gdb) bt
#0  0x00007ffff7ec2c76 in snd_lib_error_default (file=0x7ffff7f25016 "conf.c", line=3683,
    function=0x7ffff7f25ad0 <__func__.10270> "snd_config_hooks_call", err=0,
    fmt=0x7ffff7f2567b "Cannot open shared library %s (%s)") at error.c:102
#1  0x00007ffff7ebf238 in snd_config_hooks_call (root=root@entry=0x22e120, config=config@entry=0x22e820, private_data=0x0)
    at conf.c:3683
#2  0x00007ffff7ebf35a in snd_config_hooks (config=0x22e120, private_data=0x0) at conf.c:3731
#3  0x00007ffff7ebf80b in snd_config_update_r (_top=_top@entry=0x7ffff7f64160 <snd_config>,
    _update=_update@entry=0x7ffff7f64170 <snd_config_global_update>, cfgs=cfgs@entry=0x0) at conf.c:4149
#4  0x00007ffff7ebf9b4 in snd_config_update_ref (top=top@entry=0x7fffffffe450) at conf.c:4205
#5  0x00007ffff7ed7685 in snd_pcm_open (pcmp=0x7fffffffe918, name=0x20391b "plughw:0,0", stream=SND_PCM_STREAM_PLAYBACK,
    mode=0) at pcm.c:2671
#6  0x000000000020d5a1 in main (argc=1, argv=0x7fffffffeb28) at pcm.c:836
(gdb) 

@mil
Contributor Author

mil commented Nov 14, 2020

Commenting out snd_lib_error_default within alsa-lib and rebuilding the shared library fixes the problem. E.g. I just commented out the entire body of this function:
https://github.com/alsa-project/alsa-lib/blob/master/src/error.c#L100

And then the binary produced with zig cc runs fine. So there's something going on with how errors are being handled in musl zig that's off in this case.

@LemonBoy
Contributor

Ok, I've managed to reproduce the problem in an Alpine chroot.
The process crashes because the TLS DTV is empty. I believe it has to do with Zig's obsession with statically linking everything, while on Alpine the libc is dynamically linked.
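
To make the TLS connection concrete, here is a small sketch of the kind of access that depends on per-thread storage being set up correctly. It assumes, as the backtraces suggest, that the crashing ALSA functions read or write a thread-local handler; all names below (local_handler, set_local_handler, report_error, demo_tls) are hypothetical and not taken from alsa-lib. Compiled as a single program this runs fine and does not reproduce the crash; it only illustrates the pattern.

```c
#include <pthread.h>
#include <stddef.h>

typedef void (*handler_fn)(const char *msg);

/* One slot per thread; hypothetical stand-in for a library's
 * thread-local error handler (__thread is a GCC/Clang extension). */
static __thread handler_fn local_handler = NULL;

/* Counts handler invocations; only one thread touches it at a time here. */
static int calls;

static void counting_handler(const char *msg)
{
    (void)msg;
    calls++;
}

/* Writes the calling thread's slot. */
void set_local_handler(handler_fn fn)
{
    local_handler = fn;
}

/* Reads the calling thread's slot.
 * Returns 1 if a thread-local handler was invoked, 0 otherwise. */
int report_error(const char *msg)
{
    if (local_handler) {
        local_handler(msg);
        return 1;
    }
    return 0;
}

static void *thread_main(void *arg)
{
    set_local_handler(counting_handler);
    *(int *)arg = report_error("from worker");
    return NULL;
}

/* Returns 0 on success and records whether each thread saw a handler. */
int demo_tls(int *worker_saw, int *main_saw)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, thread_main, worker_saw);
    if (rc != 0)
        return rc;
    rc = pthread_join(tid, NULL);
    if (rc != 0)
        return rc;

    /* The calling thread never set a handler, so its slot is still NULL. */
    *main_saw = report_error("from caller");
    return 0;
}
```

When such a variable lives in a shared library, each access typically goes through the thread's DTV to locate that library's TLS block, which is why an empty DTV would fault at exactly this kind of read or write.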

@andrewrk
Member

#5364

@andrewrk andrewrk added the bug Observed behavior contradicts documented or intended behavior label Nov 16, 2020
@andrewrk andrewrk added this to the 0.8.0 milestone Nov 16, 2020
@andrewrk andrewrk modified the milestones: 0.8.0, 0.8.1 Jun 4, 2021
@andrewrk andrewrk modified the milestones: 0.8.1, 0.9.1 Aug 31, 2021
@andrewrk andrewrk modified the milestones: 0.9.1, 0.9.0, 0.10.0 Nov 20, 2021
@andrewrk andrewrk modified the milestones: 0.10.0, 0.11.0 Sep 14, 2022
@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Jun 19, 2023
3 participants