runtime: musl libc setxid/setgroups signals clobber stacks / do not use SA_ONSTACK #39857
Comments
I didn't run any of the tests, but I read a bunch of comments, since this seems like an interesting problem... As far as I understand: Test9400 checks whether the setxid class of system calls (i.e. setuid, setgid, ...) can smash the Go stack, since setxid() is implemented by sending signals to all threads to fulfill POSIX requirements (see the Linux man page for setgid). To prevent signals from overrunning a stack, one usually creates an alternate signal stack and passes SA_ONSTACK when installing signal handlers, so that they run on the alternate stack. This way the original stack cannot be overrun. glibc doesn't set the flag, but it installs a signal handler for SIGSETXID at startup in nptl-init.c. In PR#9400 it is therefore possible to enumerate all signal handlers and add the missing SA_ONSTACK flag, which fixes the issue on glibc. musl doesn't use an alternate stack or SA_ONSTACK for its internal signal-based implementation of setxid either. This is actually confirmed by @richfelker in a somewhat related issue #19938 (comment). Unfortunately the fix from #9400 doesn't apply to musl, since musl installs the signal handler dynamically, when setxid is called, in the __synccall() function in src/thread/synccall.c. I would vote for adding SA_ONSTACK to musl's __synccall implementation. |
There's a thread from 2019 on this topic, "sigaltstack for implementation-internal signals?", that never reached a conclusion. Basically I'm unclear whether it's arguably conforming for the implementation to use the alternate signal stack for implementation-internal signals, since it may have observable side effects on the application in the absence of any signals/signal handlers set up to run on the alternate stack. |
If musl doesn't use I don't see any way to fix this in Go. I don't see what change we could make that would make things work better. |
@ianlancetaylor I don't follow how it "doesn't work with any program that uses |
Note that I've reopened the topic on the musl list: https://www.openwall.com/lists/musl/2020/08/09/1 |
Hi @richfelker , I tested: It is sufficient to have a patch like this on musl If there is an alternate stack, it will use it. Go does create an alternate stack. |
@richfelker I'm assuming that a program that calls |
Nobody said anything about wanting to receive signals on the normal stack. From the relevant perspective these aren't signals. They are asynchronous use of the alt signal stack by the implementation in a way the application isn't and can't be aware of. |
That is a valid perspective. But it is also a valid perspective for a program to say "I am in control of my stack, and do not use my stack for any unexpected purpose. In particular, don't use it to catch signals." In any case I'm not sure there is anything we can do here in the Go standard library. If musl decides not to change, then as far as I can see code like this can't work on musl. So perhaps we should close this issue. |
@ianlancetaylor: I have been wanting to change this for a while (see the 2019 thread), but I'm making sure we actually consider the consequences of such a change and whether they break anything that someone can reasonably expect to work. (My leaning is that they don't, but I like to explore this kind of thing thoroughly since making hasty decisions has bitten us in the past.) The point of my bringing these things up is not to argue against the change, but to make sure it's well-supported when (technically if, but most likely when) it's made. |
Understood. (I suppose musl could also change to act as glibc does. Is there an advantage to only installing the signal handler when a relevant libc function is called?) |
Yes, it avoids syscall spam (strace) and wasted time in processes (the vast, vast majority) that don't need the handler. And I don't see how the glibc behavior makes it any easier unless you're poking at implementation internals which are not a stable interface. The signal numbers used for these internal signals are not a public interface, and they're not even pokable via public interfaces (as far as the public interfaces are concerned, the reserved signal numbers simply are not existent signals). The only way you can poke at them is via directly making syscalls, and this will break if signal handling is ever wrapped (which has been considered at times, but it turned out we could always get by without it). |
From: golang/go#39857 This test always fails on musl, because certain things behave differently to glibc. So... sod it, ignore the test
Change https://go.dev/cl/419995 mentions this issue: |
These changes are enough to pass all.bash using the disabled linux-amd64-alpine builder via debugnewvm. For #19938. For #39857. Change-Id: I7d160612259c77764b70d429ad94f0864689cdce Reviewed-on: https://go-review.googlesource.com/c/go/+/419995 TryBot-Result: Gopher Robot <[email protected]> Run-TryBot: Russ Cox <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]>
I rediscovered this issue and posted a summary at #54306 (comment) before discovering this issue, where I came to more-or-less the same conclusions as the discussion in this thread. @richfelker regarding #39857 (comment), did you ever make any progress on the 2019 thread? It is rather unfortunate to have to tell users that using setxid functions with musl+Go is broken and will cause random memory corruption. |
Change https://go.dev/cl/425001 mentions this issue: |
Thanks for the ping. I don't think we ever came up with a good reason we can't use the alt stack for this, so I'm inclined to go ahead and switch to using it. |
Alternately: @nmeum, is it feasible for the test to detect the Alpine or |
You can check the |
Change https://go.dev/cl/521975 mentions this issue: |
Updates golang/go#39857 Change-Id: Ibb2f27947d9f5f5856862dc78465215964850e95 Reviewed-on: https://go-review.googlesource.com/c/build/+/521975 Auto-Submit: Carlos Amedee <[email protected]> Reviewed-by: Heschi Kreinick <[email protected]> TryBot-Result: Gopher Robot <[email protected]>
The builder was updated to Alpine 3.18 in go.dev/cl/521975. |
I don't think this is quite completed — there are still skips in the codebase referring to this issue: Those skips should be removed before the issue is closed to ensure that this does not regress. |
Apart from Test9400 there are additional tests which were disabled on Alpine and should now just pass with recent musl libc. See: golang/go#39857 (comment)
Thanks for pointing that out, I wasn't aware that additional tests were disabled because of this. Just FYI: I just enabled these tests again for our Alpine Linux downstream package and they pass on all of our architectures. |
[ commit 68b45368dd53d570477e8c14cef83a33a2b2d886 ] Apart from Test9400 there are additional tests which were disabled on Alpine and should now just pass with recent musl libc. See: golang/go#39857 (comment)
Ping. Any chances these tests could be enabled upstream again as well? |
Change https://go.dev/cl/563096 mentions this issue: |
This is a follow-up to #39343, where I already briefly mentioned this problem. This issue is probably related to musl libc; I can reliably reproduce it on Alpine Linux, which uses musl.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
Started the TestCrossPackageTests from misc/cgo/test/pkg_test.go:
What did you expect to see?
A successful test run.
What did you see instead?
An error message: