libspl/backtrace: dump registers in libunwind backtraces #16653
Nice. We can continue to incrementally improve this as needed.
FYI, I managed to try this on an RPi4b this morning:
So I suspect I'm holding something slightly wrong, but also I may not have a chance to get back to this for a few days. It's not the worst thing, so probably still nbd to merge, but if I get a chance before you do I'll look into it.
Thanks for checking this elsewhere. We'll just hold off merging this until you have a chance to sort that out.
cff1e7c to 7902a15
More useful stuff, especially when trying to follow a disassembly. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>
My eyes are going blurry looking at all those write calls. This is much nicer. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>
This is the sort of code that we get right once and never look at again. Anyone reading this code is already likely in the middle of a debugging nightmare, and then they have a wall of manual string construction and an unfamiliar and idiosyncratic library to deal with. So, comment the whole thing to try to make it clear what's going on. In pursuit of the above, I've added return checks to some of the libunwind calls, fixed the frame loop to not skip the "top" frame (however unuseful it may be), and fixed a couple of calls to spl_bt_u64_to_hex_str() which requested 18 digits instead of 16. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>
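For readers wondering why 16 is the right digit count: a 64-bit value is 64 / 4 = 16 hex digits wide. The following is a minimal sketch of a fixed-width hex formatter of the kind the commit message describes. It is not the PR's `spl_bt_u64_to_hex_str()`; the name `u64_to_hex`, its signature, and the clamping behaviour are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (NOT the PR's spl_bt_u64_to_hex_str()): write the
 * low `ndigits` hex digits of `v` into `buf`, zero-padded, and terminate
 * with NUL. `buf` must hold at least 17 bytes. No libc calls are made,
 * so nothing here allocates or takes locks.
 */
static size_t
u64_to_hex(uint64_t v, size_t ndigits, char *buf)
{
	static const char hex[] = "0123456789abcdef";

	/* A 64-bit value never needs more than 16 hex digits. */
	if (ndigits > 16)
		ndigits = 16;

	/* Fill from the least significant digit backwards. */
	for (size_t i = ndigits; i > 0; i--) {
		buf[i - 1] = hex[v & 0xf];
		v >>= 4;
	}
	buf[ndigits] = '\0';
	return (ndigits);
}
```

Calling `u64_to_hex(0xdeadbeef, 16, buf)` leaves `00000000deadbeef` in `buf`. Clamping an oversized request is just one way to handle it; the sketch does not claim the real function behaves the same.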
Alright, sorted: was holding it wrong (libunwind is kinda weird, but also quite powerful).

Turns out register names are just a static lookup table by enum. Register existence is more about whether or not you can actually get the value for it - not every implementation of an architecture has every register. So now we're checking error codes and skipping registers if we couldn't get a value.

But, libunwind doesn't have a hardcoded name for every register it knows about, so for those, we have to show something. They're pretty niche though; the only one I have nearby is the ARM64 "PSTATE" register, which is some synthetic thing based on certain status flags (idk, I didn't look). So while we're not naming it perfectly, we can at least show it.

Put it together:

But there's more! The soup of And that was worth it, because I found half a dozen small bugs and quirks. Nothing dangerous, but there because it was so hard to see!

I now have designs on a more comprehensive crash output, especially for assert/verify which then raises a signal, but I'll do that in a later PR, because it touches more stuff than just this file.
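The register handling described in this comment (try to read each register, skip it on error, show a placeholder when there is no name) can be sketched roughly as below. This is not the PR's code and does not call libunwind: `fake_reg_t` and `fake_get_reg()` are hypothetical stand-ins for the roles of `unw_get_reg()` and `unw_regname()`, an empty string is assumed to mean "no name", and the `?NN` placeholder and column layout are made up for illustration. It also uses `snprintf()` for brevity, which would not be appropriate inside a signal handler.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one register slot; not a libunwind type. */
typedef struct {
	const char *name;	/* "" when no name is known (assumption) */
	int available;		/* 0 when the value cannot be read */
	uint64_t value;
} fake_reg_t;

/* Stand-in for a register fetch: 0 on success, negative on error. */
static int
fake_get_reg(const fake_reg_t *regs, size_t regnum, uint64_t *vp)
{
	if (!regs[regnum].available)
		return (-1);
	*vp = regs[regnum].value;
	return (0);
}

/*
 * Format every readable register into `out`, one per line, and return
 * how many were emitted. Stops early if `out` runs out of space.
 */
static int
dump_regs(const fake_reg_t *regs, size_t nregs, char *out, size_t outlen)
{
	int shown = 0;
	size_t off = 0;

	if (outlen == 0)
		return (0);
	out[0] = '\0';

	for (size_t regnum = 0; regnum < nregs; regnum++) {
		uint64_t v;
		char label[16];

		/* Skip registers whose value we could not fetch. */
		if (fake_get_reg(regs, regnum, &v) < 0)
			continue;

		/* No name known: fall back to the register number. */
		if (regs[regnum].name[0] == '\0')
			snprintf(label, sizeof (label), "?%02zx", regnum);
		else
			snprintf(label, sizeof (label), "%s",
			    regs[regnum].name);

		int n = snprintf(out + off, outlen - off, "%-4s 0x%016llx\n",
		    label, (unsigned long long)v);
		if (n < 0 || (size_t)n >= outlen - off) {
			out[off] = '\0';	/* drop the partial line */
			break;
		}
		off += (size_t)n;
		shown++;
	}
	return (shown);
}
```

The point of the shape is that an unreadable register costs nothing but a `continue`, while an unnamed one still gets printed under a stable placeholder so its value is not lost.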
7902a15 to bbec61d
Very nice!
My eyes are going blurry looking at all those write calls. This is much nicer. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Rob Norris <[email protected]> Close #16653
Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #16653
This is the sort of code that we get right once and never look at again. Anyone reading this code is already likely in the middle of a debugging nightmare, and then they have a wall of manual string construction and an unfamiliar and idiosyncratic library to deal with. So, comment the whole thing to try to make it clear what's going on. In pursuit of the above, I've added return checks to some of the libunwind calls, fixed the frame loop to not skip the "top" frame (however unuseful it may be), and fixed a couple of calls to spl_bt_u64_to_hex_str() which requested 18 digits instead of 16. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #16653
More useful stuff, especially when trying to follow a disassembly. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#16653
Motivation and Context
In postmortem debugging, often all we have is the binary. Having the registers makes it at least possible to follow the disassembly and try to guess how we got here.
Description
libunwind knows how to fish all the registers out of the stack frame, not just IP. Loop over 'em and spit 'em out.
I've made no effort to do the same in the non-libunwind builds, because libc doesn't typically give you the tools to do it well/at all.
I'm not yet fully satisfied with this. I think I can do better by capturing the register state at the point the assert is tripped, but it will be a lot more invasive. I'll keep working on this, but I think this is still very useful right now.
How Has This Been Tested?
Induced a crash, enjoyed the output:
I haven't tested on non-amd64 architectures, but the libunwind documentation doesn't say anything about particular facilities not being available. At worst, I'd expect it to show nothing, or maybe only the IP.