Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime mlock of signal stack failed on on long-running API #1809

Closed
RobertLucian opened this issue Jan 20, 2021 · 1 comment
Closed

Runtime mlock of signal stack failed on on long-running API #1809

RobertLucian opened this issue Jan 20, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@RobertLucian
Copy link
Member

RobertLucian commented Jan 20, 2021

Description

Seems to have been reported on golang/go#37436.
This still appears to happen on 1.14.7 of Go.

Possibly an OOM error which may have been fixed by #1806.
Or a problem with mlock's limit, which may be too low and has to be increased.
Maybe we can mitigate this by using GODEBUG=asyncpreemptoff=1, but sounds like a bad idea.

More on this bug:
https://github.com/golang/go/wiki/LinuxKernelSignalVectorBug

Stack traces

runtime: mlock of signal stack failed: 12
runtime: increase the mlock limit (ulimit -l) or
runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
fatal error: mlock failed

runtime stack:
runtime.throw(0x7f6ae1a05eb6, 0xc)
    /root/.go/src/runtime/panic.go:1112 +0x74
runtime.mlockGsignal(0xc000582a80)
    /root/.go/src/runtime/os_linux_x86.go:72 +0x109
runtime.mpreinit(0xc00003d180)
    /root/.go/src/runtime/os_linux.go:341 +0x7a
runtime.mcommoninit(0xc00003d180)
    /root/.go/src/runtime/proc.go:630 +0x10c
runtime.allocm(0x0, 0x0, 0x7f6a46ffb020)
    /root/.go/src/runtime/proc.go:1390 +0x152
runtime.oneNewExtraM()
    /root/.go/src/runtime/proc.go:1529 +0x2f
runtime.newextram()
    /root/.go/src/runtime/proc.go:1517 +0x83
runtime.systemstack(0x7f6a46ffb070)
    /root/.go/src/runtime/asm_amd64.s:370 +0x63
runtime.mstart()
    /root/.go/src/runtime/proc.go:1041

goroutine 52 [running, locked to thread]:
runtime.systemstack_switch()
    /root/.go/src/runtime/asm_amd64.s:330 fp=0xc0003e0ec0 sp=0xc0003e0eb8 pc=0x7f6ae19ca940
runtime.cgocallbackg1(0x0)
    /root/.go/src/runtime/cgocall.go:226 +0x292 fp=0xc0003e0f58 sp=0xc0003e0ec0 pc=0x7f6ae1973ef2
runtime.cgocallbackg(0x0)
    /root/.go/src/runtime/cgocall.go:207 +0xc7 fp=0xc0003e0fc0 sp=0xc0003e0f58 pc=0x7f6ae1973bc7
runtime.cgocallback_gofunc(0x0, 0x0, 0x0, 0x0)
    /root/.go/src/runtime/asm_amd64.s:793 +0x9a fp=0xc0003e0fe0 sp=0xc0003e0fc0 pc=0x7f6ae19cc2ea
runtime.goexit()
    /root/.go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc0003e0fe8 sp=0xc0003e0fe0 pc=0x7f6ae19cca81

goroutine 17 [runnable, locked to thread]:
runtime.goexit()
    /root/.go/src/runtime/asm_amd64.s:1373 +0x1

goroutine 18 [running, locked to thread]:
    goroutine running on other thread; stack unavailable

goroutine 51 [running, locked to thread]:
    goroutine running on other thread; stack unavailable
2021-01-19 14:53:18.859486:cortex:pid-484:INFO:200 OK POST /
2021-01-19 14:53:19.028038:cortex:pid-489:INFO:Shutting down
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
2021-01-19 14:53:19.049094:cortex:pid-491:INFO:Shutting down
2021-01-19 14:53:19.049582:cortex:pid-480:INFO:Shutting down
2021-01-19 14:53:19.050100:cortex:pid-481:INFO:Shutting down
2021-01-19 14:53:19.053149:cortex:pid-486:INFO:Shutting down
2021-01-19 14:53:19.063092:cortex:pid-484:INFO:Shutting down
2021-01-19 14:53:19.120287:cortex:pid-487:INFO:Shutting down
2021-01-19 14:53:19.128448:cortex:pid-489:INFO:Waiting for application shutdown.
2021-01-19 14:53:19.128732:cortex:pid-489:INFO:Application shutdown complete.
2021-01-19 14:53:19.128812:cortex:pid-489:INFO:Finished server process [489]
2021-01-19 14:53:19.149451:cortex:pid-491:INFO:Waiting for application shutdown.
2021-01-19 14:53:19.149707:cortex:pid-491:INFO:Application shutdown complete.
2021-01-19 14:53:19.149783:cortex:pid-491:INFO:Finished server process [491]
2021-01-19 14:53:19.149929:cortex:pid-480:INFO:Waiting for background tasks to complete. (CTRL+C to force quit)
2021-01-19 14:53:19.150400:cortex:pid-481:INFO:Waiting for application shutdown.
2021-01-19 14:53:19.150645:cortex:pid-481:INFO:Application shutdown complete.
2021-01-19 14:53:19.150716:cortex:pid-481:INFO:Finished server process [481]
2021-01-19 14:53:19.153521:cortex:pid-486:INFO:Waiting for application shutdown.
2021-01-19 14:53:19.153765:cortex:pid-486:INFO:Application shutdown complete.
2021-01-19 14:53:19.153839:cortex:pid-486:INFO:Finished server process [486]
2021-01-19 14:53:19.163446:cortex:pid-484:INFO:Waiting for application shutdown.
2021-01-19 14:53:19.163679:cortex:pid-484:INFO:Application shutdown complete.
2021-01-19 14:53:19.163746:cortex:pid-484:INFO:Finished server process [484]
2021-01-19 14:53:19.220661:cortex:pid-487:INFO:Waiting for application shutdown.
2021-01-19 14:53:19.220978:cortex:pid-487:INFO:Application shutdown complete.
2021-01-19 14:53:19.221060:cortex:pid-487:INFO:Finished server process [487]
2021-01-19 14:53:19.350197:cortex:pid-480:INFO:Waiting for application shutdown.
2021-01-19 14:53:19.350410:cortex:pid-480:INFO:Application shutdown complete.
2021-01-19 14:53:19.350493:cortex:pid-480:INFO:Finished server process [480]
s6-svwait: fatal: timed out
[s6-finish] sending all processes the TERM signal.
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
[s6-finish] sending all processes the KILL signal and exiting.

Context

As reported by @ratovarius.

@RobertLucian RobertLucian added the bug Something isn't working label Jan 20, 2021
@RobertLucian
Copy link
Member Author

Only affected a user running their API using the local provider.

The docker binary was where this panic came from and the root cause is explained here https://github.com/golang/go/wiki/LinuxKernelSignalVectorBug. To solve this, the user has to upgrade their OS' kernel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant