-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: support unwinding logical cgo stack transitions using frame pointers #58378
Comments
@qmuntal I've assigned this to you for now in triage; do you plan to work on it? (Feel free to unassign and leave a "help wanted" tag.) |
Yep. In fact I've already submitted a bunch of CLs (not merged yet) that will make this issue easier to implement (although their are valuable on their own). Will update them referencing this issue. |
Change https://go.dev/cl/466835 mentions this issue: |
Change https://go.dev/cl/466396 mentions this issue: |
Change https://go.dev/cl/466457 mentions this issue: |
Change https://go.dev/cl/466455 mentions this issue: |
Change https://go.dev/cl/466316 mentions this issue: |
Change https://go.dev/cl/466458 mentions this issue: |
Change https://go.dev/cl/466395 mentions this issue: |
Change https://go.dev/cl/466355 mentions this issue: |
Change https://go.dev/cl/466456 mentions this issue: |
This CL marks some linux assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #58378 Change-Id: I7792cff4f6e539bfa56c02868f2965088ca1975a Reviewed-on: https://go-review.googlesource.com/c/go/+/466316 Reviewed-by: Than McIntosh <[email protected]> Reviewed-by: Cherry Mui <[email protected]> Run-TryBot: Quim Muntal <[email protected]> TryBot-Result: Gopher Robot <[email protected]>
This CL marks some dragonfly assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #58378 Change-Id: I832a1a78d68a49f11df3b03fa9d50d4796bcac03 Reviewed-on: https://go-review.googlesource.com/c/go/+/466355 Run-TryBot: Quim Muntal <[email protected]> Reviewed-by: Than McIntosh <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
This CL marks some freebsd assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #58378 Change-Id: Ibd00748946f1137e165293df7da73278cb673bbd Reviewed-on: https://go-review.googlesource.com/c/go/+/466395 Reviewed-by: Than McIntosh <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Cherry Mui <[email protected]> Run-TryBot: Quim Muntal <[email protected]>
This CL marks some netbsd assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. While here, and thanks to CL 466355, `asm_netbsd_amd64.s` can be deleted in favor of `asm9_unix2_amd64.s`, which makes better use of the frame pointer. Updates #58378 Change-Id: Iff554b664ec25f2bb6ec198c0f684590b359c383 Reviewed-on: https://go-review.googlesource.com/c/go/+/466396 Reviewed-by: Than McIntosh <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Cherry Mui <[email protected]> Run-TryBot: Quim Muntal <[email protected]>
This CL marks some openbsd assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #58378 Change-Id: I993549df41a93255fb714357443f8b24c3dfb0a4 Reviewed-on: https://go-review.googlesource.com/c/go/+/466455 Reviewed-by: Keith Randall <[email protected]> Run-TryBot: Quim Muntal <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Keith Randall <[email protected]> Reviewed-by: Joel Sing <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
This CL marks some solaris assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. While here, I've reduced the stack usage of runtime·sigtramp by 16 bytes to compensate the additional 8 bytes from the stack-allocated frame pointer. There were two unused 8-byte slots on the stack, one at 24(SP) and the other at 80(SP). Updates #58378 Change-Id: If9230e71a8b3c72681ffc82030ade6ceccf824db Reviewed-on: https://go-review.googlesource.com/c/go/+/466456 Reviewed-by: Than McIntosh <[email protected]> Reviewed-by: Cherry Mui <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Run-TryBot: Quim Muntal <[email protected]>
Change https://go.dev/cl/472195 mentions this issue: |
👋 It's great to see your CL above. I'm currently working on adding frame pointer unwind support to the execution tracer (much faster than gentraceback), see #16638 . |
It was been quite a journey getting to that CL 🥲, glad that it also helps other effords other than my goal of enabling SEH unwinding on Windows. Now that I almost managed to get rid of the implicit NOFRAME rule, it would be easier to have frame pointers in the runtime functions. |
This CL removes some NOFRAME flags on Windows assembly files for several reasons: - windows/386 does not use a frame pointer - Leaf frameless functions already skip the frame pointer - Some non-leaf functions do not contain enough dragons to justify not using the frame pointer Updates #58378 Change-Id: I31e71bf7f769e1957a4adba91778da5af66ce1e4 Reviewed-on: https://go-review.googlesource.com/c/go/+/466835 Reviewed-by: Keith Randall <[email protected]> Reviewed-by: Keith Randall <[email protected]> Run-TryBot: Quim Muntal <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> TryBot-Result: Gopher Robot <[email protected]>
This CL marks some plan9 assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #58378 Change-Id: Ic8c9ab5c1a0897bebc6c1419ddc903a7492a1b0a Reviewed-on: https://go-review.googlesource.com/c/go/+/466457 TryBot-Bypass: Quim Muntal <[email protected]> Reviewed-by: Keith Randall <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
All amd64 OSes already make use of the NOFRAME flag wherever is required, so we can remove the frameless nosplit functions heuristic code path. Updates #58378 Change-Id: I966970693ba07f8c66da0aca83c23caad7cbbfe5 Reviewed-on: https://go-review.googlesource.com/c/go/+/466458 Reviewed-by: Cherry Mui <[email protected]> Run-TryBot: Quim Muntal <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Than McIntosh <[email protected]>
Please consider a release note about that semantic change that may surprise assembler code outside of the tree. @qmuntal |
Sure @aclements I'm having problems on CL 472195 trying to get |
FWIW, I'm still not convinced that saving frame pointers is a good idea for stack transition functions like |
Your opinion worth alot 😸. What I don't understand is what could go wrong when saving the frame pointer on functions that change the stack. And, if there are caveats, what could be worse than not having any stack unwinding at all? |
I'm not sure if it could go wrong, just that as the stack transition is special enough I think it is clearer that we do it more explicitly, with comments. It may also be simpler if we save the FP only when the stack transition actually happens, and not on the tail call case. For the traceback failure (I didn't read the code or the failure message, just guessing), could it be that the frame layout of |
I remembered that one of the problems we had with delve's stack unwinder was the morestack/newstack path. Maybe you have already accounted for it, I thought it mention it just in case. |
Not sure I follow. Are you talking about some sort of compiler directive like this? //go:emitfp
systemstack(func() { ... }) Sorry if this makes no sense, just trying to make sure I understand your comments 😅 .
FWIW, while working on CL 463835 I found that frame pointer unwinding through systemstack seems to be working already because systemstack doesn't clobber the |
Thanks for the advice, the frame layout of
Have the same questions as @felixge, plus one data point more: arm64 versions of
I tried this at some point, but it would create another (and probably bigger) issue: the FP would be stored outside the function prologue, which is non-standard. Doing so will make it difficult to programmatically generate DWARF unwinding info using the FP instead of stack deltas, and AFAIK will not work on Windows with SEH unwinding info, which requires the FP to be on the prologue, and the prologue is limited to 255 bytes. |
No. I was just thinking explicitly saving FP in the assembly code. If you all prefer the implicit saving, that is fine. But I think somewhere in the code we should have a comment mentioning that the frame pointer is chained through the stack transition.
Not sure if that is actually correct (https://golang.org/cl/241080). It does do the implicit saving, though. |
Ah, got it! I don't have a preference on that. If it's better to do this "by hand" in the assembly rather than letting the compiler generate it, that works for me. As long as the frame pointer gets pushed I'm happy 😄. |
This CL removes the NOFRAME flag from runtime.asmcgocall, runtime.systemstack and runtime.mcall so the compiler can place the frame pointer on the stack. This will help unwinding cgo stack frames, and might be all what's needed for tools that only use the frame pointer to unwind the stack. That's not the case for gdb, which uses DWARF CFI, and windbg, which uses SEH. Yet, having the frame pointer correctly set lays the foundation for supporting cgo unwinding with DWARF CFI and SEH. Updates #58378 Change-Id: I7655363b3fb619acccd9d5a7f0e3d3dec953cd52 Reviewed-on: https://go-review.googlesource.com/c/go/+/472195 Run-TryBot: Quim Muntal <[email protected]> Auto-Submit: Michael Pratt <[email protected]> Reviewed-by: Cherry Mui <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Michael Pratt <[email protected]>
Closing this issue as done. There has been many CLs during go1.21 dev cycle to improve the frame pointer unwinding through the cgo transition, and there are even some tests verifying that there are no regressions in this area, i.e. @felixge's CL 474916. Thanks everyone for the discussion and the reviews! |
Change https://go.dev/cl/501516 mentions this issue: |
For #58378 Change-Id: I960b97f33a8bf29d3a9622b58d278544d0970a38 Reviewed-on: https://go-review.googlesource.com/c/go/+/501516 Reviewed-by: Cherry Mui <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Run-TryBot: Quim Muntal <[email protected]> Reviewed-by: David Chase <[email protected]>
Note: follow-up issue for #40044 (comment)
When using generic third-party debuggers (e.g. dbg or windbg) to debug a amd64 Go binary, it is currently not possible to know which piece of Go code started a cgo call once
runtime.asmcgocall
has switched from the Go routine stack to the system stack. This is becauseasmcgocall
is marked asNOFRAME
, so the frame pointer is not set on the function prologue, confusing the debugger unwinding process.Notice that this is only true for amd64 binaries, as all other architectures supporting frame pointers do not mark
asmcgocall
asNOFRAME
. Yet, at least windows/arm64 neither supports cgo stack unwinding becauseasmcgocall
callsasmstdcall
, which is marked asNOFRAME
.IMO we should at least make sure that
asmcgocall
and related-auxiliary functions can be unwound from the system stack up to the Go stack by placing the right frame pointer in their prologue. I'm leaving out of this discussion other functions that perform a system stack on purpose, as they are not so fundamental asasmcgocall
and correctly setting the frame pointer of low-level assembly functions is not straight-forward.I'm particularly interested on this because on Windows, WinDbg uses the frame pointer to unwind the stack (see #57302) to unwound the stack of a debugged binary or crash minidump (#57441, #49471, #20498).
It would also help fixing at least #40044 and #57698.
@golang/runtime @golang/compiler
The text was updated successfully, but these errors were encountered: