Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: netpoll failed on darwin #59679

Closed
bcmills opened this issue Apr 17, 2023 · 13 comments
Closed

runtime: netpoll failed on darwin #59679

bcmills opened this issue Apr 17, 2023 · 13 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Apr 17, 2023

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`

(Pulled out from #34988 (comment); CC @ianlancetaylor @golang/darwin)

@bcmills bcmills added the compiler/runtime Issues related to the Go compiler and/or runtime. label Apr 17, 2023
@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`
2023-04-04 16:58 darwin-amd64-nocgo go@3991f6c4 runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0x11af512?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1047 +0x5d fp=0x7ff7bfefec68 sp=0x7ff7bfefec38 pc=0x1034afd
runtime.netpoll(0x1054c74?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:134 +0x38e fp=0x7ff7bfeff4f8 sp=0x7ff7bfefec68 pc=0x103160e
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:2759 +0x2af fp=0x7ff7bfeff600 sp=0x7ff7bfeff4f8 pc=0x103c64f
...
	/tmp/buildlet/go/src/os/exec.go:109 +0x5a fp=0xc000173d28 sp=0xc000173ce0 pc=0x108a31a
os/exec.(*Cmd).Start(0xc00012a2c0)
	/tmp/buildlet/go/src/os/exec/exec.go:693 +0x5ee fp=0xc000173ea0 sp=0xc000173d28 pc=0x10e256e
os/exec.(*Cmd).Run(0x11435a0?)
	/tmp/buildlet/go/src/os/exec/exec.go:587 +0x1e fp=0xc000173ec0 sp=0xc000173ea0 pc=0x10e1f3e
os/exec.(*Cmd).CombinedOutput(0xc00012a2c0)
	/tmp/buildlet/go/src/os/exec/exec.go:1005 +0xa8 fp=0xc000173ee8 sp=0xc000173ec0 pc=0x10e3da8
runtime/pprof.TestCPUProfileWithFork(0xc000083380)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:668 +0x194 fp=0xc000173f70 sp=0xc000173ee8 pc=0x11530b4
testing.tRunner(0xc000083380, 0x11bbfb0)

watchflakes

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin labels Apr 17, 2023
@ianlancetaylor
Copy link
Contributor

> runtime: kevent on fd 3 failed with 60
> fatal error: runtime: netpoll failed

Error 60 is ETIMEDOUT. This shouldn't happen, and is not listed on the man page. When a call to kevent reaches the timeout, it is documented to return 0.

It would be easy for us to ignore a ETIMEDOUT error but it shouldn't be necessary.

@mknyszek mknyszek added this to the Backlog milestone Apr 19, 2023
@mknyszek
Copy link
Contributor

In triage, we think this is either a bug in Darwin or a bug in the Darwin documentation. @prattmic thinks maybe an error is bubbling up from somewhere deeper, in which case maybe we just ignore it like @ianlancetaylor suggested.

@mknyszek
Copy link
Contributor

We agree we will wait until it fails again to make a decision.

@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`
2023-08-22 20:43 darwin-amd64-nocgo go@98195ed0 runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0x159dffd?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1018 +0x5c fp=0x7ff7bfad8bd8 sp=0x7ff7bfad8ba8 pc=0x145bbbc
runtime.netpoll(0xc00002ef00?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:145 +0x385 fp=0x7ff7bfad9478 sp=0x7ff7bfad8bd8 pc=0x1458de5
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:3208 +0x851 fp=0x7ff7bfad95a0 sp=0x7ff7bfad9478 pc=0x1463d31
...
	/tmp/buildlet/go/src/os/exec.go:111 +0x54 fp=0xc000061d08 sp=0xc000061cc0 pc=0x14b2f94
os/exec.(*Cmd).Start(0xc00018a2c0)
	/tmp/buildlet/go/src/os/exec/exec.go:665 +0x5cb fp=0xc000061e80 sp=0xc000061d08 pc=0x150994b
os/exec.(*Cmd).Run(0xc000061f38?)
	/tmp/buildlet/go/src/os/exec/exec.go:587 +0x18 fp=0xc000061ea0 sp=0xc000061e80 pc=0x1509338
os/exec.(*Cmd).CombinedOutput(0xc00018a2c0)
	/tmp/buildlet/go/src/os/exec/exec.go:977 +0x94 fp=0xc000061ec8 sp=0xc000061ea0 pc=0x150af94
runtime/pprof.TestCPUProfileWithFork(0xc000502ea0)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:691 +0x2f8 fp=0xc000061f70 sp=0xc000061ec8 pc=0x157ef58
testing.tRunner(0xc000502ea0, 0x16230d0)

watchflakes

@bcmills
Copy link
Contributor Author

bcmills commented Aug 23, 2023

Ooh, neat! And in TestCPUProfileWithFork again.

@dr2chase, maybe you have some insights after your darwin SIGPROF investigation? 🤔

@panjf2000
Copy link
Member

panjf2000 commented Aug 23, 2023

Since this failure always happened during runtime/pprof.TestCPUProfileWithFork.

The ETIMEDOUT error from kevent in this context can be related to a variety of factors (just some thoughts, not quite sure):

  • Profiling Overhead: CPU profiling adds overhead. When combined with the process of forking, there might be a scenario where the system is unable to complete an operation in a timely manner, leading to the timeout.
  • Concurrency Issues: Forking in a multithreaded environment (like a Go program) can be tricky. If there are race conditions or other concurrency issues, it might cause unexpected behaviors.
  • Resource Limitations: As with any stress test, if the system is resource-constrained, it could contribute to timeouts.
  • File Descriptors and Locks: Forking can duplicate file descriptors, and if there's any form of locking mechanism (like for files being used in the profiling), it could potentially cause delays or deadlocks, leading to timeouts.

Furthermore, I found some old links where people came across that error of kevent:

I think before we dive deep and figure out what really happened there, ignoring ETIMEOUT is a conservative but sufficient solution.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/522215 mentions this issue: runtime: ignore ETIMEDOUT for kevent conservatively

@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`
2023-08-31 16:31 darwin-amd64-11_0 go@cab416f3 runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0x27dfc7d?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1018 +0x5c fp=0x7ffeee599bd8 sp=0x7ffeee599ba8 pc=0x269ab3c
runtime.netpoll(0xc000026000?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:145 +0x385 fp=0x7ffeee59a478 sp=0x7ffeee599bd8 pc=0x2697d65
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:3208 +0x851 fp=0x7ffeee59a5a0 sp=0x7ffeee59a478 pc=0x26a2cb1
...
	/tmp/buildlet/go/src/os/exec/exec.go:928 +0x151 fp=0xc00006ce20 sp=0xc00006cd50 pc=0x2749131
os/exec.(*Cmd).Wait(0xc000160420)
	/tmp/buildlet/go/src/os/exec/exec.go:880 +0x16c fp=0xc00006ce80 sp=0xc00006ce20 pc=0x2748eac
os/exec.(*Cmd).Run(0xc00006cf38?)
	/tmp/buildlet/go/src/os/exec/exec.go:590 +0x2d fp=0xc00006cea0 sp=0xc00006ce80 pc=0x2747a0d
os/exec.(*Cmd).CombinedOutput(0xc000160420)
	/tmp/buildlet/go/src/os/exec/exec.go:977 +0x85 fp=0xc00006cec8 sp=0xc00006cea0 pc=0x2749645
runtime/pprof.TestCPUProfileWithFork(0xc000007a00)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:691 +0x2f8 fp=0xc00006cf70 sp=0xc00006cec8 pc=0x27c0dd8
testing.tRunner(0xc000007a00, 0x28652b0)
2023-09-07 19:09 darwin-amd64-nocgo go@9a9aa6e2 runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0x9b11b43?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1018 +0x5c fp=0x7ff7b7567bd0 sp=0x7ff7b7567ba0 pc=0x99ccbbc
runtime.netpoll(0xc000026000?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:145 +0x385 fp=0x7ff7b7568470 sp=0x7ff7b7567bd0 pc=0x99c9de5
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:3208 +0x851 fp=0x7ff7b7568598 sp=0x7ff7b7568470 pc=0x99d4d31
...
	/tmp/buildlet/go/src/os/exec.go:111 +0x54 fp=0xc0004f2d08 sp=0xc0004f2cc0 pc=0x9a23f54
os/exec.(*Cmd).Start(0xc0004f6160)
	/tmp/buildlet/go/src/os/exec/exec.go:665 +0x5cb fp=0xc0004f2e80 sp=0xc0004f2d08 pc=0x9a79d2b
os/exec.(*Cmd).Run(0xc0004f2f38?)
	/tmp/buildlet/go/src/os/exec/exec.go:587 +0x18 fp=0xc0004f2ea0 sp=0xc0004f2e80 pc=0x9a79718
os/exec.(*Cmd).CombinedOutput(0xc0004f6160)
	/tmp/buildlet/go/src/os/exec/exec.go:977 +0x85 fp=0xc0004f2ec8 sp=0xc0004f2ea0 pc=0x9a7b365
runtime/pprof.TestCPUProfileWithFork(0xc0005809c0)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:691 +0x2f8 fp=0xc0004f2f70 sp=0xc0004f2ec8 pc=0x9af2af8
testing.tRunner(0xc0005809c0, 0x9b971b0)

watchflakes

@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`
2023-09-25 15:24 darwin-amd64-nocgo go@b7e0dfc4 runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0xb669c1e?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1018 +0x5c fp=0x7ff7b5a0dbd0 sp=0x7ff7b5a0dba0 pc=0xb526d7c
runtime.netpoll(0xc00002ca00?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:145 +0x385 fp=0x7ff7b5a0e470 sp=0x7ff7b5a0dbd0 pc=0xb523fa5
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:2964 +0x30c fp=0x7ff7b5a0e598 sp=0x7ff7b5a0e470 pc=0xb52e8ec
...
	/tmp/buildlet/go/src/os/exec.go:111 +0x54 fp=0xc0002eacf8 sp=0xc0002eacb0 pc=0xb57deb4
os/exec.(*Cmd).Start(0xc000196160)
	/tmp/buildlet/go/src/os/exec/exec.go:700 +0x5f6 fp=0xc0002eae80 sp=0xc0002eacf8 pc=0xb5d3e36
os/exec.(*Cmd).Run(0xc0002eaf38?)
	/tmp/buildlet/go/src/os/exec/exec.go:604 +0x18 fp=0xc0002eaea0 sp=0xc0002eae80 pc=0xb5d37f8
os/exec.(*Cmd).CombinedOutput(0xc000196160)
	/tmp/buildlet/go/src/os/exec/exec.go:1012 +0x85 fp=0xc0002eaec8 sp=0xc0002eaea0 pc=0xb5d54a5
runtime/pprof.TestCPUProfileWithFork(0xc0006029c0)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:691 +0x2f8 fp=0xc0002eaf70 sp=0xc0002eaec8 pc=0xb64aa98
testing.tRunner(0xc0006029c0, 0xb6ee670)

watchflakes

@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`
2023-09-27 15:58 darwin-amd64-nocgo go@6cf6067d runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0xe437c1e?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1018 +0x5c fp=0x7ff7b2c3fbd8 sp=0x7ff7b2c3fba8 pc=0xe2f4d7c
runtime.netpoll(0xc000026000?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:145 +0x385 fp=0x7ff7b2c40478 sp=0x7ff7b2c3fbd8 pc=0xe2f1fa5
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:3195 +0x851 fp=0x7ff7b2c405a0 sp=0x7ff7b2c40478 pc=0xe2fce31
...
	/tmp/buildlet/go/src/os/exec.go:111 +0x54 fp=0xc00006dcf8 sp=0xc00006dcb0 pc=0xe34beb4
os/exec.(*Cmd).Start(0xc00015c160)
	/tmp/buildlet/go/src/os/exec/exec.go:700 +0x5f6 fp=0xc00006de80 sp=0xc00006dcf8 pc=0xe3a1e36
os/exec.(*Cmd).Run(0xc00006df38?)
	/tmp/buildlet/go/src/os/exec/exec.go:604 +0x18 fp=0xc00006dea0 sp=0xc00006de80 pc=0xe3a17f8
os/exec.(*Cmd).CombinedOutput(0xc00015c160)
	/tmp/buildlet/go/src/os/exec/exec.go:1012 +0x85 fp=0xc00006dec8 sp=0xc00006dea0 pc=0xe3a34a5
runtime/pprof.TestCPUProfileWithFork(0xc000580b60)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:691 +0x2f8 fp=0xc00006df70 sp=0xc00006dec8 pc=0xe418a98
testing.tRunner(0xc000580b60, 0xe4bc670)

watchflakes

@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`
2023-10-03 06:40 darwin-amd64-11_0 go@352c8835 runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0xee44a6a?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1018 +0x5c fp=0x7ffee1f32bd0 sp=0x7ffee1f32ba0 pc=0xed01c1c
runtime.netpoll(0xc000028000?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:145 +0x385 fp=0x7ffee1f33470 sp=0x7ffee1f32bd0 pc=0xecfee45
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:2964 +0x30c fp=0x7ffee1f33598 sp=0x7ffee1f33470 pc=0xed0978c
...
	/tmp/buildlet/go/src/os/exec/exec.go:960 +0xe5 fp=0xc0000a1e20 sp=0xc0000a1d50 pc=0xedafd65
os/exec.(*Cmd).Wait(0xc0000ce580)
	/tmp/buildlet/go/src/os/exec/exec.go:915 +0x16c fp=0xc0000a1e80 sp=0xc0000a1e20 pc=0xedafb4c
os/exec.(*Cmd).Run(0xc0000a1f38?)
	/tmp/buildlet/go/src/os/exec/exec.go:607 +0x2d fp=0xc0000a1ea0 sp=0xc0000a1e80 pc=0xedae64d
os/exec.(*Cmd).CombinedOutput(0xc0000ce580)
	/tmp/buildlet/go/src/os/exec/exec.go:1012 +0x85 fp=0xc0000a1ec8 sp=0xc0000a1ea0 pc=0xedb02e5
runtime/pprof.TestCPUProfileWithFork(0xc0005809c0)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:691 +0x2f8 fp=0xc0000a1f70 sp=0xc0000a1ec8 pc=0xee258d8
testing.tRunner(0xc0005809c0, 0xeec9670)

watchflakes

@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "darwin" && log ~ `^fatal error: runtime: netpoll failed`
2023-10-09 22:06 darwin-amd64-12_0 go@51bdd3bd runtime/pprof.TestCPUProfileWithFork (log)
runtime: kevent on fd 3 failed with 60
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw({0x92c800a?, 0x0?})
	/tmp/buildlet/go/src/runtime/panic.go:1018 +0x5c fp=0x7ff7b7db0bd8 sp=0x7ff7b7db0ba8 pc=0x918427c
runtime.netpoll(0xc000026000?)
	/tmp/buildlet/go/src/runtime/netpoll_kqueue.go:145 +0x385 fp=0x7ff7b7db1478 sp=0x7ff7b7db0bd8 pc=0x9181485
runtime.findRunnable()
	/tmp/buildlet/go/src/runtime/proc.go:3195 +0x851 fp=0x7ff7b7db15a0 sp=0x7ff7b7db1478 pc=0x918c311
runtime.schedule()
	/tmp/buildlet/go/src/runtime/proc.go:3589 +0xb1 fp=0x7ff7b7db15d8 sp=0x7ff7b7db15a0 pc=0x918d5b1
runtime.goexit0(0xc00013c4e0)
	/tmp/buildlet/go/src/runtime/proc.go:3932 +0x24e fp=0x7ff7b7db1618 sp=0x7ff7b7db15d8 pc=0x918e4ae
traceback: unexpected SPWRITE function runtime.mcall
runtime.mcall()
	/tmp/buildlet/go/src/runtime/asm_amd64.s:458 +0x4e fp=0x7ff7b7db1630 sp=0x7ff7b7db1618 pc=0x91b6dae

runtime.makechan(0x9319580?, 0x1?)
	/tmp/buildlet/go/src/runtime/chan.go:72 +0x1c8 fp=0xc000069d00 sp=0xc000069cf8 pc=0x9153708
os/exec.(*Cmd).Start(0xc0000de2c0)
	/tmp/buildlet/go/src/os/exec/exec.go:713 +0x657 fp=0xc000069e88 sp=0xc000069d00 pc=0x9231c57
os/exec.(*Cmd).Run(0xc0000de2c0)
	/tmp/buildlet/go/src/os/exec/exec.go:604 +0x18 fp=0xc000069ea0 sp=0xc000069e88 pc=0x92315b8
os/exec.(*Cmd).CombinedOutput(0xc0000de2c0)
	/tmp/buildlet/go/src/os/exec/exec.go:1012 +0x85 fp=0xc000069ec8 sp=0xc000069ea0 pc=0x9233225
runtime/pprof.TestCPUProfileWithFork(0xc00013c000)
	/tmp/buildlet/go/src/runtime/pprof/pprof_test.go:691 +0x2f8 fp=0xc000069f70 sp=0xc000069ec8 pc=0x92a8e78
testing.tRunner(0xc00013c000, 0x934c7d0)

watchflakes

@github-project-automation github-project-automation bot moved this from Todo to Done in Go Compiler / Runtime Oct 23, 2023
@golang golang locked and limited conversation to collaborators Oct 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Projects
Archived in project
Development

No branches or pull requests

5 participants