Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

race: not working with Alpine based image #14481

Closed
dlsniper opened this issue Feb 23, 2016 · 75 comments
Closed

race: not working with Alpine based image #14481

dlsniper opened this issue Feb 23, 2016 · 75 comments
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@dlsniper
Copy link
Contributor

Hi,

With the following docker image (save it as demo.docker)

FROM golang:1.6.0-alpine
MAINTAINER "[email protected]"

RUN apk add --update alpine-sdk \
    && rm -rf /var/cache/apk/*

run

docker build -f demo.docker -t race/demo .

Then you can finally run the command:

PROJECT_DIR="${PWD}" #assume we are in $GOPATH/src/github.com/dlsniper/demo on the computer
CONTAINER_PROJECT_DIR="/go/src/github.com/dlsniper/demo"
CONTAINER_PROJECT_GOPATH="${CONTAINER_PROJECT_DIR}/vendor:/go"

docker run --rm \
        --net="host" \
        -v ${PROJECT_DIR}:${CONTAINER_PROJECT_DIR} \
        -e CI=true \
        -e GODEBUG=netdns=go \
        -e CGO_ENABLED=1 \
        -e GOPATH=${CONTAINER_PROJECT_GOPATH} \
        -w "${CONTAINER_PROJECT_DIR}" \
        race/demo \
        go test -v -race ./...

This will fail with:

# runtime/race
race_linux_amd64.syso: In function `__sanitizer::InternalAlloc(unsigned long, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<0ul, 140737488355328ull, 0ul, __sanitizer::SizeClassMap<17ul, 64ul, 14ul>, 20ul, __sanitizer::TwoLevelByteMap<32768ull, 4096ull, __sanitizer::NoOpMapUnmapCallback>, __sanitizer::NoOpMapUnmapCallback> >*)':
gotsan.cc:(.text+0x1681): undefined reference to `__libc_malloc'
race_linux_amd64.syso: In function `__sanitizer::ReExec()':
gotsan.cc:(.text+0xd937): undefined reference to `__libc_stack_end'
race_linux_amd64.syso: In function `__sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<0ul, 140737488355328ull, 0ul, __sanitizer::SizeClassMap<17ul, 64ul, 14ul>, 20ul, __sanitizer::TwoLevelByteMap<32768ull, 4096ull, __sanitizer::NoOpMapUnmapCallback>, __sanitizer::NoOpMapUnmapCallback> >*)':
gotsan.cc:(.text+0x5ec8): undefined reference to `__libc_free'
collect2: error: ld returned 1 exit status

If you then disable CGO and run again:

PROJECT_DIR="${PWD}" #assume we are in $GOPATH/src/github.com/dlsniper/demo on the computer
CONTAINER_PROJECT_DIR="/go/src/github.com/dlsniper/demo"
CONTAINER_PROJECT_GOPATH="${CONTAINER_PROJECT_DIR}/vendor:/go"

docker run --rm \
        --net="host" \
        -v ${PROJECT_DIR}:${CONTAINER_PROJECT_DIR} \
        -e CI=true \
        -e GODEBUG=netdns=go \
        -e CGO_ENABLED=0 \
        -e GOPATH=${CONTAINER_PROJECT_GOPATH} \
        -w "${CONTAINER_PROJECT_DIR}" \
        race/demo \
        go test -v -race ./...

It will result in the following output:

go test: -race requires cgo; enable cgo by setting CGO_ENABLED=1

Previously, in go 1.5.3, when running with CGO disabled, this used to fail with:

# testmain
runtime/race(.text): __libc_malloc: not defined
runtime/race(.text): getuid: not defined
runtime/race(.text): pthread_self: not defined
runtime/race(.text): madvise: not defined
runtime/race(.text): madvise: not defined
runtime/race(.text): madvise: not defined
runtime/race(.text): sleep: not defined
runtime/race(.text): usleep: not defined
runtime/race(.text): abort: not defined
runtime/race(.text): isatty: not defined
runtime/race(.text): __libc_free: not defined
runtime/race(.text): getrlimit: not defined
runtime/race(.text): pipe: not defined
runtime/race(.text): __libc_stack_end: not defined
runtime/race(.text): getrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): exit: not defined
runtime/race(.text.unlikely): __errno_location: not defined
runtime/race(.text): undefined: __libc_malloc
/usr/local/go/pkg/tool/linux_amd64/link: too many errors

To test this just change the base image for the container.

Please let me know if there are any additional details I can share.

Thank you.

@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Feb 23, 2016
@ianlancetaylor
Copy link
Contributor

CC @dvyukov

As far as I know this is because race.syso assumes a glibc based system. I don't know that there is a simple fix here.

@dvyukov
Copy link
Member

dvyukov commented Feb 24, 2016

I can't think of any simple fix. A complex fix would be to remove all dependencies on libc from race runtime (there is an open issue for that).
alpinelinux wiki suggests that there are some ways to run glibc-based programs on alpine:
http://wiki.alpinelinux.org/wiki/Running_glibc_programs
Don't know whether it will help for race detector or not.

@mbana
Copy link

mbana commented Feb 5, 2017

hi, all,

just wondering. does anyone have a complete working example of a golang program working with -race ideally not that much different from the official Alpine-based build.

out of interest, are the flags not tested? i would imagine that this sort of issue would be picked up at compile time, unless the Alpine-based is configured to to include it... no idea, just guessing.

anywho, working sample would be much appreciated.

@neelance
Copy link
Member

I have managed to build a race_linux_amd64.syso that works on Alpine. Here are the necessary changes: neelance/compiler-rt@32aa655 I haven't verified that it still works on other Linux platforms.

To build on Alpine:

  1. Clone https://github.com/neelance/compiler-rt/
  2. Go into /compiler-rt/lib/tsan/go/
  3. ./buildgo.sh (The test step crashes, but the library works anyways. My guess is that the test does not use a proper Go environment.)
  4. Copy race_linux_amd64.syso to $GOPATH/src/runtime/race/

I'm not sure how to continue from here. Any suggestions on getting that upstream?

@dvyukov
Copy link
Member

dvyukov commented Feb 24, 2017

Here are instructions on how to contribute to sanitizers:
https://github.com/google/sanitizers/wiki/AddressSanitizerHowToContribute

I skimmed through your patch and I think we can upstream something like this. But we need some testing story, otherwise it will break in future.

One good test might be to whitelist all libc symbol dependencies in buildgo.sh. This way we can ensure that we won't silently add a new dependency (ideally in future this list becomes empty).

@neelance
Copy link
Member

@dvyukov I see that you did most of the updates of the syso files. I have some more questions:

  1. How can we test the patch properly on the Go end across all platforms so we don't upstream something stupid?
  2. Can we somehow add Alpine to the platforms being tested?
  3. We need to fix that crash on the test phase of buildgo.sh. I've investigated it a bit, seems like mmap is failing for some shadow memory. My wild guess right now is that it expects a Go memory layout, but the test is a C program. That was as far as I could get without spending a massive amount of time to understand how that thread sanitizer works.

@dvyukov
Copy link
Member

dvyukov commented Feb 24, 2017

How can we test the patch properly on the Go end across all platforms so we don't upstream something stupid?

I now use golang.org/x/build/cmd/racebuild which builds and tests race runtime for all platforms. But it requires gomote access (I think you need to be a committer for Go repo). If you don't have that access, I will do testing. But test at least on Alpine and on glibc-based Linux.

Can we somehow add Alpine to the platforms being tested?

You need to either run your own Go builder and connect it to dashboard, or we need to setup Alpine linux builder on GCE. @bradfitz, do we want to setup/maintain it?

seems like mmap is failing for some shadow memory. My wild guess right now is that it expects a Go memory layout

It's possible. Does removing -fPIC -fpie from buildgo.sh help? We still need to build the runtime with -pie, but the test itself does not need to be pie.

@bradfitz
Copy link
Contributor

#17891 tracks adding an Alpine builder. @jessfraz has a (stalled :)) CL in progress.

@bradfitz
Copy link
Contributor

We now have Alpine builders, so in theory golang.org/x/build/cmd/racebuild could build an Alpine race binary now.

For now I'm just going to be skipping race tests on Alpine so we don't regress on other things.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/41678 mentions this issue.

gopherbot pushed a commit that referenced this issue Apr 25, 2017
…s on Alpine

In an effort to at least understand the complete set of things not
working on Alpine Linux, I've been trying to get the build passing
again, even with tests disabled.

The race detector is broken on Alpine. That is #14481 (and #9918).
So disable those tests for now.

Also, internal linking with PIE doesn't work on Alpine yet.
That is #18243. So disable that test for now.

With this CL, all.bash almost passes. There's some cgo test failing
still, but there's no bug yet, so that can be a separate CL.

Change-Id: I3ffbb0e787ed54cb82f298b6bd5bf3ccfbc82622
Reviewed-on: https://go-review.googlesource.com/41678
Run-TryBot: Brad Fitzpatrick <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
benesch added a commit to benesch/cockroach that referenced this issue Apr 26, 2017
The race detector doesn't work without glibc due to golang/go#14481.

Fixes cockroachdb#15378.
AkihiroSuda pushed a commit to AkihiroSuda/linuxkit that referenced this issue Jun 6, 2017
mattbostock added a commit to mattbostock/timbala that referenced this issue Jul 17, 2017
Enable the Go race detector to help surface race conditions when
running the tests.

The race detector currently depends on libc, so does not work with
Alpine Linux (which uses musl):

golang/go#9918
golang/go#14481

Instead, use the default Go Docker image, which uses the libc-based
Debian Jessie and update the Dockerfiles accordingly.
@dvyukov
Copy link
Member

dvyukov commented Apr 6, 2020

Something changed since last fall. It wasn't that code itself, maybe someone turned ASLR on for ppc64le in some config somewhere?

Humm... yes, it seems this bit of code may actually be needed:

#elif SANITIZER_PPC64V2
  // Disable ASLR for Linux PPC64LE.
  int old_personality = personality(0xffffffff);
  if (old_personality != -1 && (old_personality & ADDR_NO_RANDOMIZE) == 0) {
    VReport(1, "WARNING: Program is being run with address space layout "
               "randomization (ASLR) enabled which prevents the thread and "
               "memory sanitizers from working on powerpc64le.\n"
               "ASLR will be disabled and the program re-executed.\n");
    CHECK_NE(personality(old_personality | ADDR_NO_RANDOMIZE), -1);
    ReExec();
  }

Pulling in ReExec may transitively pull in lots of other other things (as you noted already).

Maybe we could do what freebsd does above if SANITIZER_GO to at least warn users:

  if (UNLIKELY(paxflags & CTL_PROC_PAXFLAGS_ASLR)) {
    Printf("This sanitizer is not compatible with enabled ASLR\n");
    Die();
  }

But I don't know if users can actually turn ASLR off if needed.

Potentially ASLR does not affect static Go binaries at all (?). But maybe it's a problem with cgo.

Hard to say what's best without testing...

@dvyukov
Copy link
Member

dvyukov commented Apr 6, 2020

It would be nice. Pulling new tsan always seems to be a headache. But rewriting tsan in Go is probably a lot of work, many times the aforementioned headache. Probably not worth it.
Unless you were locked in a room with nothing else to do for a month or so...

Yes, that's what I thought as well.
I am actually locked in a room at the moment, but not with nothing else to do :)

@zdjones
Copy link
Contributor

zdjones commented Apr 6, 2020

What is the next-gen algorithm?

@dvyukov
Copy link
Member

dvyukov commented Apr 7, 2020

What is the next-gen algorithm?

It's not described anywhere.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/227867 mentions this issue: runtime/race: rebuild netbsd .syso

@qdm12
Copy link
Contributor

qdm12 commented Apr 15, 2020

Hello all, a big thank you first of all for the work fixing this issue.

Do you have an approximate time when the Go Docker image will contain the fix? I just tried go test -race using the latest golang:1.14-alpine3.11 and it still fails. Thanks!

Or, if it will take quite some time, is there a way for us to manually build and bundle something in the docker image to have it fixed using your latest changes?

@mvdan
Copy link
Member

mvdan commented Apr 15, 2020

@qdm12 this was merged in master, so it will be part of the upcoming Go 1.15 release due in July, unless it gets reverted.

Only important bug fixes are generally backported to stable Go releases. See https://github.com/golang/go/wiki/MinorReleases.

In particular, while this change is awesome and I'd also like to use it today, the change is probably too invasive and risky to backport to a release like 1.14.3. The first 1.15 beta should be out in just six weeks, so you could try the docker images that come out with that release, too.

@graywolf-at-work
Copy link

@qdm12 We are using

#!/bin/sh
set -eu
set -x

cd /

apk --no-cache add build-base git

# Clone the repo and reset to the right commit
git clone https://git.llvm.org/git/compiler-rt.git /x
cd /x

#https://github.com/golang/go/blob/release-branch.go1.14/src/runtime/race/README
git checkout 810ae8ddac890a6613d814c0b5415c7fcb7f5cca

# Apply the patch and do the build
patch -p1 <<'EOF'
commit 51e2fa48690be675363fbd2b62d775715749f58d
Author: Tomas Volf <[email protected]>
Date:   Thu Mar 5 23:43:38 2020 +0100

    Ported musl compatibility patch

diff --git a/lib/sanitizer_common/sanitizer_allocator.cpp b/lib/sanitizer_common/sanitizer_allocator.cpp
index 8d07906cc..e4681f77e 100644
--- a/lib/sanitizer_common/sanitizer_allocator.cpp
+++ b/lib/sanitizer_common/sanitizer_allocator.cpp
@@ -25,7 +25,7 @@ const char *PrimaryAllocatorName = "SizeClassAllocator";
 const char *SecondaryAllocatorName = "LargeMmapAllocator";
 
 // ThreadSanitizer for Go uses libc malloc/free.
-#if SANITIZER_GO || defined(SANITIZER_USE_MALLOC)
+#if defined(SANITIZER_USE_MALLOC)
 # if SANITIZER_LINUX && !SANITIZER_ANDROID
 extern "C" void *__libc_malloc(uptr size);
 #  if !SANITIZER_GO
diff --git a/lib/sanitizer_common/sanitizer_common.cpp b/lib/sanitizer_common/sanitizer_common.cpp
index f5f9f49d8..87efda5bd 100644
--- a/lib/sanitizer_common/sanitizer_common.cpp
+++ b/lib/sanitizer_common/sanitizer_common.cpp
@@ -274,6 +274,7 @@ uptr ReadBinaryNameCached(/*out*/char *buf, uptr buf_len) {
   return name_len;
 }
 
+#if !SANITIZER_GO
 void PrintCmdline() {
   char **argv = GetArgv();
   if (!argv) return;
@@ -282,6 +283,7 @@ void PrintCmdline() {
     Printf("%s ", argv[i]);
   Printf("\n\n");
 }
+#endif
 
 // Malloc hooks.
 static const int kMaxMallocFreeHooks = 5;
diff --git a/lib/sanitizer_common/sanitizer_linux.cpp b/lib/sanitizer_common/sanitizer_linux.cpp
index 0b53da6c3..ac3152eb8 100644
--- a/lib/sanitizer_common/sanitizer_linux.cpp
+++ b/lib/sanitizer_common/sanitizer_linux.cpp
@@ -26,7 +26,7 @@
 #include "sanitizer_placement_new.h"
 #include "sanitizer_procmaps.h"
 
-#if SANITIZER_LINUX
+#if SANITIZER_LINUX && !SANITIZER_GO
 #include <asm/param.h>
 #endif
 
@@ -549,7 +549,7 @@ const char *GetEnv(const char *name) {
 #endif
 }
 
-#if !SANITIZER_FREEBSD && !SANITIZER_NETBSD && !SANITIZER_OPENBSD
+#if !SANITIZER_FREEBSD && !SANITIZER_NETBSD && !SANITIZER_OPENBSD && !SANITIZER_GO
 extern "C" {
 SANITIZER_WEAK_ATTRIBUTE extern void *__libc_stack_end;
 }
@@ -581,7 +581,7 @@ static void ReadNullSepFileToArray(const char *path, char ***arr,
 }
 #endif
 
-#if !SANITIZER_OPENBSD
+#if !SANITIZER_OPENBSD && !SANITIZER_GO
 static void GetArgsAndEnv(char ***argv, char ***envp) {
 #if SANITIZER_FREEBSD
   // On FreeBSD, retrieving the argument and environment arrays is done via the
@@ -1060,7 +1060,7 @@ uptr GetMaxUserVirtualAddress() {
 
 #if !SANITIZER_ANDROID
 uptr GetPageSize() {
-#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__i386__))
+#if SANITIZER_LINUX && defined(EXEC_PAGESIZE)
   return EXEC_PAGESIZE;
 #elif SANITIZER_FREEBSD || SANITIZER_NETBSD
 // Use sysctl as sysconf can trigger interceptors internally.
diff --git a/lib/sanitizer_common/sanitizer_linux_libcdep.cpp b/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
index cd5037182..4e5c04818 100644
--- a/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
+++ b/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
@@ -808,7 +808,7 @@ u64 MonotonicNanoTime() {
 }
 #endif  // SANITIZER_LINUX && !SANITIZER_GO
 
-#if !SANITIZER_OPENBSD
+#if !SANITIZER_OPENBSD && !SANITIZER_GO
 void ReExec() {
   const char *pathname = "/proc/self/exe";
 
diff --git a/lib/tsan/rtl/tsan_platform_linux.cpp b/lib/tsan/rtl/tsan_platform_linux.cpp
index 33fa586ca..4c871ec58 100644
--- a/lib/tsan/rtl/tsan_platform_linux.cpp
+++ b/lib/tsan/rtl/tsan_platform_linux.cpp
@@ -62,7 +62,7 @@
 # undef sa_sigaction
 #endif
 
-#if SANITIZER_FREEBSD
+#if SANITIZER_FREEBSD && !SANITIZER_GO
 extern "C" void *__libc_stack_end;
 void *__libc_stack_end = 0;
 #endif
EOF

cd lib/tsan/go
./buildgo.sh

install -Dt /out/usr/local/go/src/runtime/race/ race_linux_amd64.syso

and then copy it over the original one

COPY --from=compiler-rt       /out /

Or you can just wait for the 1.15 :)

@qdm12
Copy link
Contributor

qdm12 commented Apr 16, 2020

Okay thanks for letting us know 👍 I noticed there is a golang:latest, is this based on the master branch? If so, is there one based on the master branch and alpine perhaps? Anyway, stability/production wise, I'd stick with the release in July 😉

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/231222 mentions this issue: [release-branch.go1.14] runtime/race: update some .syso files

gopherbot pushed a commit that referenced this issue May 4, 2020
Update race detector syso files for some platforms.

There's still 2 more to do, but they might take a while so I'm
mailing the ones I have now.

Note: some arm64 tests did not complete successfully due to out
of memory errors, but I suspect the .syso is correct.

For #14481
For #37485 (I think?)
For #37355
Fixes #38321

Change-Id: I7e7e707a1fd7574855a538ba89dc11acc999c760
Reviewed-on: https://go-review.googlesource.com/c/go/+/226981
Run-TryBot: Ian Lance Taylor <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
(cherry picked from commit 041bcb3)
Reviewed-on: https://go-review.googlesource.com/c/go/+/231222
Reviewed-by: Keith Randall <[email protected]>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/232417 mentions this issue: [release-branch.go1.14] runtime/race: rebuild netbsd .syso

gopherbot pushed a commit that referenced this issue May 6, 2020
For #14481
For #37355
For #38321

Change-Id: Idfceaf0e64d340b7304ce9562549a82ebfc27e3c
Reviewed-on: https://go-review.googlesource.com/c/go/+/227867
Run-TryBot: Keith Randall <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
(cherry picked from commit 3afa741)
Reviewed-on: https://go-review.googlesource.com/c/go/+/232417
Run-TryBot: Ian Lance Taylor <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
martinlindhe added a commit to cloudradar/docker-go-build that referenced this issue Oct 19, 2020
martinlindhe added a commit to cloudradar/docker-go-build that referenced this issue Oct 19, 2020
@bobtista
Copy link

@qdm12 this was merged in master, so it will be part of the upcoming Go 1.15 release due in July, unless it gets reverted.

Only important bug fixes are generally backported to stable Go releases. See https://github.com/golang/go/wiki/MinorReleases.

In particular, while this change is awesome and I'd also like to use it today, the change is probably too invasive and risky to backport to a release like 1.14.3. The first 1.15 beta should be out in just six weeks, so you could try the docker images that come out with that release, too.

Is this confirmed working in golang 1.15? I don't see anything in the release notes, and my google search didn't turn up anything helpful. Thanks!

@bconway
Copy link

bconway commented Oct 19, 2020

Yes, we are using it successfully with the (rolling) 1.15-alpine tag.

@qdm12
Copy link
Contributor

qdm12 commented Oct 19, 2020

I can confirm it works on golang:1.15-alpine, although not on golang:1.15-alpine3.12 for example.

Good job!

@axu2
Copy link

axu2 commented Feb 1, 2021

I can confirm it works on golang:1.15-alpine, although not on golang:1.15-alpine3.12 for example.

Doesn't work for me.

But I added

RUN apk add --update --no-cache build-base

to make it work. Seems similar to this gist: https://gist.github.com/miry/fece267c7faba904c360

If you also need git you can use instead:

RUN apk add --update --no-cache alpine-sdk

@golang golang locked and limited conversation to collaborators Feb 2, 2022
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
tsan while used by golang's race detector was not working on alpine
linux, since it is using musl-c instead of glibc. Since alpine is very
popular distribution for container deployments, having working race
detector would be nice. This commits adds some ifdefs to get it working.

It fixes golang/go#14481 on golang's issue tracker.

Reviewed-in: https://reviews.llvm.org/D75849
Author: graywolf-at-work (Tomas Volf)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests