Some dynamically linked Golang executables crash on OSv #1141

Closed
wkozaczuk opened this issue May 21, 2021 · 3 comments

@wkozaczuk
Collaborator

wkozaczuk commented May 21, 2021

While researching issue #1137, I found that some dynamically linked Golang executables crash. It seems that, depending on the parameters passed to go build, a slightly different ELF is generated. The article found by @nyh - https://www.arp242.net/static-go.html - provides some interesting insight into why some of these differences exist. But the devil is in the details.

I have run fairly extensive experiments, building golang-pie-example and golang-pie-httpserver with various go build parameters using two versions of the Go toolchain - the older 1.12.6 and the newer 1.15.8. Here is what I have found:

The golang-pie-example works fine on OSv when built with the following build commands:

  • with 1.12.6 and 1.15.8
    • go build -buildmode=pie -ldflags "-linkmode external" -o hello-external-pie hello.go
    • go build -buildmode=exe -ldflags "-linkmode external" -o hello-external-exe hello.go
  • with 1.12.6
    • go build -buildmode=pie -o hello-pie hello.go

With both 1.12.6 and 1.15.8, the go build -buildmode=exe -o hello-exe hello.go command (exe is the default build mode) produces a statically linked position-dependent ELF, which does NOT work on OSv (hopefully it will once we implement #1137, #1138, #1139 and #1140). Interestingly, unlike 1.12.6, the newer Go 1.15.8 toolchain produces what looks like a statically linked position-independent executable (PIE) (or dependent?) when built with go build -buildmode=pie -o hello-pie hello.go:

file hello-pie
hello-pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=KLddGH5JkygDOUUjd9N8/Bw6msSNbQDLVNmZHbfg9/N-nQvH6Ne9pl5aExWhga/Ncb0ercFgAfBxICpTZba, not stripped
readelf -l hello-pie

Elf file type is DYN (Shared object file)
Entry point 0x465080
There are 12 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000002a0 0x00000000000002a0  R      0x1000
  INTERP         0x0000000000000fe4 0x0000000000400fe4 0x0000000000400fe4
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  NOTE           0x0000000000000f80 0x0000000000400f80 0x0000000000400f80
                 0x0000000000000064 0x0000000000000064  R      0x4
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000000099ed0 0x0000000000099ed0  R E    0x1000
  LOAD           0x000000000009a000 0x000000000049a000 0x000000000049a000
                 0x0000000000072ce5 0x0000000000072ce5  R      0x1000
  LOAD           0x000000000010d000 0x000000000050d000 0x000000000050d000
                 0x0000000000086467 0x0000000000086467  RW     0x1000
  GNU_RELRO      0x000000000010d000 0x000000000050d000 0x000000000050d000
                 0x0000000000086467 0x0000000000086467         0x1000
  LOAD           0x0000000000194000 0x0000000000594000 0x0000000000594000
                 0x0000000000015a00 0x00000000000481a8  RW     0x1000
  DYNAMIC        0x0000000000194040 0x0000000000594040 0x0000000000594040
                 0x00000000000000b0 0x00000000000000b0  RW     0x8
  TLS            0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000008  R      0x8
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x8
  LOOS+0x5041580 0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0x8

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .note.go.buildid 
   03     .text .plt .interp .note.go.buildid 
   04     .rodata .gnu.version_r .rela .rela.plt .dynstr .gnu.version .hash .dynsym 
   05     .data.rel.ro .data.rel.ro.typelink .data.rel.ro.itablink .data.rel.ro.gosymtab .data.rel.ro.gopclntab 
   06     .data.rel.ro .data.rel.ro.typelink .data.rel.ro.itablink .data.rel.ro.gosymtab .data.rel.ro.gopclntab 
   07     .go.buildinfo .got.plt .dynamic .got .noptrdata .data .bss .noptrbss 
   08     .dynamic 
   09     .tbss 
   10     
   11     

readelf -Ws hello-pie

Symbol table '.dynsym' contains 1 entry:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND

and OSv crashes like so:

OSv v0.55.0-247-g1433c08b
eth0: 192.168.122.15
Booted up in 166.35 ms
Cmdline: /hello-pie
syscall(): unimplemented system call 158
Aborted

[backtrace]
0x00000000402178fc <???+1075935484>
0x0000000040342a1a <mmu::vm_fault(unsigned long, exception_frame*)+298>
0x0000000040390d5f <page_fault+143>
0x000000004038fc16 <???+1077476374>
0x0000100000066c9f <???+421023>
0x000000004045f335 <???+1078326069>
0x00000000403fbe19 <thread_main_c+41>
0x0000000040390b92 <???+1077480338>

The fact that OSv reports the missing syscall 158 (arch_prctl) is further evidence that this is effectively a statically linked executable.

The golang-pie-httpserver works fine on OSv (except for what is described in the issue #1047) when built with the following build commands:

  • with 1.12.6 and 1.15.8
    • go build -buildmode=pie -ldflags "-linkmode external" -o httpserver-external-pie httpserver.go
    • go build -buildmode=exe -ldflags "-linkmode external" -o httpserver-external-exe httpserver.go
  • with 1.12.6
    • go build -buildmode=pie -o httpserver-pie httpserver.go

Now, unlike with golang-pie-example, the binaries produced by the other build commands differ in type (the reason is that httpserver uses networking; see the article):

  • with 1.12.6
file apps/golang-pie-httpserver/httpserver-exe
apps/golang-pie-httpserver/httpserver-exe: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=Z1oGsWACunxdAZjzSLhS/OgjMDZgxohowftiK_OCy/F-JSxJcruWIPYv3ZzzUa/BhZYOoeKhrGpbQNjYniq, not stripped

ldd apps/golang-pie-httpserver/httpserver-exe
	linux-vdso.so.1 (0x00007fff063af000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1e623c1000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1e621d7000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f1e62407000)
  • with 1.15.8
file httpserver-pie
httpserver-pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=wqi_NGgJWOCq4XAYucS5/qChWe9MuPatB9UbNFkPT/Z5civWEUSz6VkriDcQN1/qEjogYs3wMagxBovBpyr, not stripped
ldd httpserver-pie 
	linux-vdso.so.1 (0x00007ffe161fe000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fe764ffd000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fe764e32000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fe76503a000)

file httpserver-exe
httpserver-exe: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=lkKfAsZ0NrGY_WQ4CfiJ/NxRPYAKHqp6HQjpjRWBu/mbaz6-34Sh65hOLAl6Ca/Z5uIpWb3cnH_kLVC-v9L, not stripped
ldd httpserver-exe
	linux-vdso.so.1 (0x00007fff5c565000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f07d386f000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f07d36a4000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f07d38ac000)

The crash of httpserver-exe built with 1.12.6 looks like so:

(gdb) bt
#0  0x000000004039f232 in processor::cli_hlt () at arch/x64/processor.hh:247
#1  arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2  osv::halt () at arch/x64/power.cc:26
#3  0x0000000040239ba4 in abort (fmt=fmt@entry=0x4064c93f "Aborted\n") at runtime.cc:133
#4  0x0000000040202ece in abort () at runtime.cc:99
#5  0x000000004021807e in osv::generate_signal (siginfo=..., ef=0xffff8000015fd068) at libc/signal.cc:125
#6  0x000000004046f3d3 in osv::handle_mmap_fault (addr=addr@entry=35193004748800, sig=sig@entry=11, ef=ef@entry=0xffff8000015fd068) at libc/signal.cc:140
#7  0x000000004034745b in mmu::vm_sigsegv (ef=0xffff8000015fd068, addr=35193004748800) at core/mmu.cc:1330
#8  mmu::vm_fault (addr=35193004748800, addr@entry=35193004751248, ef=ef@entry=0xffff8000015fd068) at core/mmu.cc:1350
#9  0x0000000040398b24 in page_fault (ef=0xffff8000015fd068) at arch/x64/mmu.cc:42
#10 <signal handler called>
#11 0x000000000042b2ab in runtime.argv_index (argv=0x200000700fa0, i=1078164798) at /home/wkozaczuk/tools/go/src/runtime/runtime1.go:57
#12 runtime.sysargs (argc=1078164797, argv=0x200000700fa0) at /home/wkozaczuk/tools/go/src/runtime/os_linux.go:206
#13 0x000000000043b3d9 in runtime.args (c=1078164797, v=0x200000700fa0) at /home/wkozaczuk/tools/go/src/runtime/runtime1.go:63
#14 0x00000000004569c0 in runtime.rt0_go () at /home/wkozaczuk/tools/go/src/runtime/asm_amd64.s:193
#15 0x7265737040437d3d in ?? ()
#16 0x0000200000700fa0 in ?? ()
#17 0x0000000040437d3d in operator() (app=<optimized out>, __closure=0x0) at core/app.cc:233
#18 _FUN () at core/app.cc:235

The crash of httpserver-exe built with 1.15.8 looks like so:

#0  0x0000000040397122 in processor::cli_hlt () at arch/x64/processor.hh:247
#1  arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2  osv::halt () at arch/x64/power.cc:26
#3  0x0000000040238280 in abort (fmt=fmt@entry=0x4063f373 "Aborted\n") at runtime.cc:133
#4  0x0000000040202e79 in abort () at runtime.cc:99
#5  0x00000000402178fd in osv::generate_signal (ef=0xffff80000156a068, siginfo=...) at libc/signal.cc:126
#6  osv::handle_mmap_fault (addr=addr@entry=35193004408832, sig=sig@entry=11, ef=ef@entry=0xffff80000156a068) at libc/signal.cc:141
#7  0x0000000040342a1b in mmu::vm_sigsegv (ef=0xffff80000156a068, addr=35193004408832) at core/mmu.cc:1330
#8  mmu::vm_fault (addr=35193004408832, addr@entry=35193004411120, ef=ef@entry=0xffff80000156a068) at core/mmu.cc:1350
#9  0x0000000040390d60 in page_fault (ef=0xffff80000156a068) at arch/x64/mmu.cc:42
#10 <signal handler called>
#11 0x00000000004338ca in runtime.argv_index (argv=0x200000700fa0, i=1078122282) at /usr/lib/golang/src/runtime/runtime1.go:57
#12 runtime.sysargs (argc=1078122281, argv=0x200000700fa0) at /usr/lib/golang/src/runtime/os_linux.go:201
#13 0x0000000000447149 in runtime.args (c=1078122281, v=0x200000700fa0) at /usr/lib/golang/src/runtime/runtime1.go:63
#14 0x000000000046925b in runtime.rt0_go () at /usr/lib/golang/src/runtime/asm_amd64.s:212
#15 0x726573704042d729 in ?? ()
#16 0x0000200000700fa0 in ?? ()
#17 0x000000004042d729 in operator() (app=<optimized out>, __closure=0x0) at core/app.cc:233
#18 _FUN () at core/app.cc:235

The crash of httpserver-pie built with 1.15.8 looks like so:

(gdb) bt
#0  0x0000000040397122 in processor::cli_hlt () at arch/x64/processor.hh:247
#1  arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2  osv::halt () at arch/x64/power.cc:26
#3  0x0000000040238280 in abort (fmt=fmt@entry=0x4063f373 "Aborted\n") at runtime.cc:133
#4  0x0000000040202e79 in abort () at runtime.cc:99
#5  0x00000000402178fd in osv::generate_signal (ef=0xffff800001adb068, siginfo=...) at libc/signal.cc:126
#6  osv::handle_mmap_fault (addr=addr@entry=35193004408832, sig=sig@entry=11, ef=ef@entry=0xffff800001adb068) at libc/signal.cc:141
#7  0x0000000040342a1b in mmu::vm_sigsegv (ef=0xffff800001adb068, addr=35193004408832) at core/mmu.cc:1330
#8  mmu::vm_fault (addr=35193004408832, addr@entry=35193004411120, ef=ef@entry=0xffff800001adb068) at core/mmu.cc:1350
#9  0x0000000040390d60 in page_fault (ef=0xffff800001adb068) at arch/x64/mmu.cc:42
#10 <signal handler called>
#11 0x0000100000038c2c in runtime.argv_index (argv=0x200000700fa0, i=1078122282) at /usr/lib/golang/src/runtime/runtime1.go:57
#12 runtime.sysargs (argc=1078122281, argv=0x200000700fa0) at /usr/lib/golang/src/runtime/os_linux.go:201
#13 0x000010000004c70b in runtime.args (c=1078122281, v=0x200000700fa0) at /usr/lib/golang/src/runtime/runtime1.go:63
#14 0x000010000006eb7f in runtime.rt0_go () at /usr/lib/golang/src/runtime/asm_amd64.s:212
#15 0x726573704042d729 in ?? ()
#16 0x0000200000700fa0 in ?? ()
#17 0x000000004042d729 in operator() (app=<optimized out>, __closure=0x0) at core/app.cc:233
#18 _FUN () at core/app.cc:235

As one can see, all 3 crashes look very similar and seem to be related to how argc/argv are passed to the binary versus what it expects.

Here is the relevant Go runtime code for the 2nd and 3rd crash:

  • /usr/lib/golang/src/runtime/asm_amd64.s:212
 208         MOVL    16(SP), AX              // copy argc
 209         MOVL    AX, 0(SP)
 210         MOVQ    24(SP), AX              // copy argv
 211         MOVQ    AX, 8(SP)
 212         CALL    runtime·args(SB)
 213         CALL    runtime·osinit(SB)
 214         CALL    runtime·schedinit(SB)
  • /usr/lib/golang/src/runtime/runtime1.go:63
 60 func args(c int32, v **byte) {
 61         argc = c
 62         argv = v
 63         sysargs(c, v)
 64 }
  • /usr/lib/golang/src/runtime/os_linux.go:201
197 func sysargs(argc int32, argv **byte) {
198         n := argc + 1
199 
200         // skip over argv, envp to get to auxv
201         for argv_index(argv, n) != nil {
202                 n++
203         }
204 
205         // skip NULL separator
206         n++
  • /usr/lib/golang/src/runtime/runtime1.go:57
 54 // nosplit for use in linux startup sysargs
 55 //go:nosplit
 56 func argv_index(argv **byte, i int32) *byte {
 57         return *(**byte)(add(unsafe.Pointer(argv), uintptr(i)*sys.PtrSize))
 58 }

It looks like both argv and argc are passed on the stack?

There is also this relevant code fragment with comments in /usr/lib/golang/src/runtime/asm_amd64.s:

  10 // _rt0_amd64 is common startup code for most amd64 systems when using
  11 // internal linking. This is the entry point for the program from the
  12 // kernel for an ordinary -buildmode=exe program. The stack holds the
  13 // number of arguments and the C-style argv.
  14 TEXT _rt0_amd64(SB),NOSPLIT,$-8
  15         MOVQ    0(SP), DI       // argc
  16         LEAQ    8(SP), SI       // argv
  17         JMP     runtime·rt0_go(SB)
  18 
  19 // main is common startup code for most amd64 systems when using
  20 // external linking. The C startup code will call the symbol "main"
  21 // passing argc and argv in the usual C ABI registers DI and SI.
  22 TEXT main(SB),NOSPLIT,$-8
  23         JMP     runtime·rt0_go(SB)
@nyh
Contributor

nyh commented May 24, 2021

The paragraph

  11 // This is the entry point for the program from the
  12 // kernel for an ordinary -buildmode=exe program. The stack holds the
  13 // number of arguments and the C-style argv.

suggests that Linux has a special way to run a static executable. It doesn't just call a "main()" function like a normal function, but rather runs some sort of entry point specified in the ELF in a special way (I have to say, I don't know or don't remember any of the details). I am guessing that in this case Linux passes argv and argc on the stack (the above code suggests offsets 16 and 24 on the stack, so probably there is something else before them?) instead of using the normal C ABI (e.g., using registers). To check whether this is true, you can look at the Linux source where it does this, or at the C compiler's crt0.s (or whatever it is called nowadays), which is used to bootstrap a static executable. Or maybe you can find some documentation on this online.

If my guess above is correct (I don't know if it is...), you will need to modify OSv to jump to the static executable's entry point after setting the stack as it expects.

@wkozaczuk
Collaborator Author

I wonder if this commit from the static elf branch from 7 years ago is related.

Interestingly, all httpserver executables built with options that make them crash on OSv appear to be dynamically linked according to ldd, readelf, and the way the OSv dynamic linker recognizes them. Currently, OSv rejects statically linked executables, but maybe we have a bug in that logic.

@wkozaczuk
Collaborator Author

All of these types of Golang executables, both statically and dynamically linked, should work fine now that support for statically linked executables has recently been added.
