-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TA unwinding on abort (print call stack) #1552
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if a TA is paged and the unwind tables needs to be paged in?
core/arch/arm/kernel/abort.c
Outdated
} | ||
exidx += utc->load_addr; | ||
memset(&state, 0, sizeof(state)); | ||
state.registers[0] = r32(ai->regs->x0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Direct assignment should be OK, if you'd like to be very clear that we're truncating an uint64_t
into an uint32_t
a plain cast would be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, initially I thought I would assert that the higher 32 bits are zero, hence a function, but actually I'm not sure it is a valid assumption (all we know is that the lower 32 bits are part of the Aarch32 state, but I suppose the higher bits may be undefined and take any value?).
I'll just make plain assignments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The higher bits are for practical purposes undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thanks for the confirmation.
core/arch/arm/kernel/unwind_arm32.c
Outdated
start = &__exidx_start; | ||
idx_start = (vaddr_t)&__exidx_start; | ||
idx_end = (vaddr_t)&__exidx_end; | ||
if (exidx && exidx_sz) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer if exidx
and exidx_sz
always where correct, that way we could avoid this special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I'll change.
core/arch/arm/kernel/unwind_arm64.c
Outdated
{ | ||
uint64_t fp; | ||
|
||
fp = frame->fp; | ||
if (!thread_addr_is_in_stack(fp)) | ||
return false; | ||
if (stack && stack_size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer if stack
and stack_size
always where correct, that way we could avoid this special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
Good question ;) The answer is: bad things, I'm afraid. I need to test it, but I think what will happen is: abort_handler() -> get_fault_type() -> is_abort_in_abort_handler(ai) == true -> panic(). What we need is that the abort be handled as if it had occurred when running TA code, although it really occurred when the TEE core tried to access a user page. Is there a simple change we can do for this to work properly? Update is_abort_in_abort_handler() maybe? More than that? |
Doing backtrace of the stack requires running in thread context for the pager to be able to handle pager faults. We could use the technique used in |
core/arch/arm/kernel/abort.c
Outdated
} | ||
exidx += utc->load_addr; | ||
} else { | ||
exidx = (vaddr_t)&__exidx_start; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since __exidx_start
is declared as an array there's no need to use the &
also (btw, applying &
on arrays is often a bad idea).
I wonder if we shouldn't have a special .h file that contains all symbols exported by the linker script, but that's out of scope for this PR.
core/arch/arm/kernel/unwind_arm32.c
Outdated
idx_end = (vaddr_t)&__exidx_end; | ||
} | ||
start = (struct unwind_idx *)exidx; | ||
idx_start = (vaddr_t)exidx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the type of exidx
should be vaddr_t
instead to avoid this cast and the cast below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I make exidx
a vaddr_t
everywhere then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we don't have a clear policy on uaddr_t
(user space address) versus vaddr_t
(normal virtual address). When we know that an address can be used directly by OP-TEE OS then it could be represented by a vaddr_t
since all prerequisites are in place, until those checks hasn't been done a uaddr_t
should stay uaddr_t
.
find_index()
works with both pure kernel-mode pointers but also with user-mode pointers that has prerequisites in place (the correct vaspace is mapped) so this function should work with vaddr_t
only.
#endif /*ARM32*/ | ||
|
||
/* | ||
* Unwind a 32-bit user or kernel stack. When unwinding a user TA, set @exidx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the comment.
|
||
bool unwind_stack(struct unwind_state *state); | ||
/* | ||
* Unwind a 64-bit user or kernel stack. When unwinding a user TA, set @stack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the comment.
core/arch/arm/kernel/abort.c
Outdated
|
||
EMSG_RAW("Call stack:"); | ||
if (abort_is_user_exception(ai)) { | ||
struct tee_ta_session *s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's in this block could go into a separate function that could be called from the other version of __print_stack_unwind_arm32()
below.
core/arch/arm/kernel/abort.c
Outdated
if (is_32bit) | ||
__print_stack_unwind_arm32(ai); | ||
else | ||
__print_stack_unwind_arm64(ai); | ||
} | ||
|
||
void abort_print(struct abort_info *ai __maybe_unused) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__maybe_unused
can be removed.
* POSSIBILITY OF SUCH DAMAGE. | ||
*/ | ||
|
||
void __aeabi_unwind_cpp_pr0(void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With these there's no sense to keep the same at the end of core/arch/arm/kernel/unwind_arm32.c
How we deal with |
Thanks @jenswi-linaro. There's the issue of paged TAs, however. I don't want to introduce a situation, in which the crash of a TA could lead to a TEE core panic (even if it's only in debug mode). |
@jforissier, Sounds good, checking for |
LGTM |
scripts/symbolize.py
Outdated
|
||
def spawn_addr2line(self): | ||
if not self._addr2line: | ||
elf = self.get_elf(self._bin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C programmer error (semicolon at the end).
scripts/symbolize.py
Outdated
|
||
def spawn_addr2line(self): | ||
if not self._addr2line: | ||
elf = self.get_elf(self._bin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From this line to 101 you are using leading tabs, which makes Python unhappy (you are using leading spaces in the rest of the script).
scripts/symbolize.py
Outdated
# Flatten list in case -d is used several times *and* with multiple | ||
# arguments | ||
args.dirs = [item for sublist in args.dir for item in sublist] | ||
symbolizer = Symbolizer(sys.stdout, args.dirs, args.strip_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If args.dir
is not set, the script will bail out. Either add an else
to line 166 or add a try/except
around this line.
scripts/symbolize.py
Outdated
args.dirs = [item for sublist in args.dir for item in sublist] | ||
symbolizer = Symbolizer(sys.stdout, args.dirs, args.strip_path) | ||
|
||
for line in sys.stdin: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you supposed to give to stdin
? I.e, what input is the script waiting for? Core dump? Stack trace?
edit found it above:
<paste dump here>
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would probably be nice to have a print saying something: "Waiting for user to paste the dump here ...
".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be OK for interactive use but not so meaningful if the script is called in a test environment (Travis for instance).
scripts/symbolize.py
Outdated
# POSSIBILITY OF SUCH DAMAGE. | ||
# | ||
|
||
"""Symbolizes OP-TEE abort dumps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend to store this help section in a global variable and then use it as description to the argument parser, like here: https://docs.python.org/2/library/argparse.html#description
By doing so you will get the output when running the script with -h
(or wrong parameters).
@jbech-linaro thanks for reviewing the Python script. Patch updated. |
Thanks for the updates! |
There is no obvious reason for requiring the first program header in a user TA to be of type PT_LOAD. It is usually the case, due to the way our linker script is written (ta/arch/arm/ta.ld.S). Still, it may occur that other segments are inserted first by the linker. For example, when linking a 32-bit binary built with unwind tables (-funwind-tables), the first PHDR is PT_ARM_EXIDX. Such a TA won't load unless this patch is applied. Signed-off-by: Jerome Forissier <[email protected]> Reviewed-by: Jens Wiklander <[email protected]>
Update the abort handling code in the TEE core to support unwinding the user mode stack in addition to the kernel stack. unwind_arm32.c is modified slightly so that it can be built for AArch64. This allows a 64-bit TEE core to dump both 32- and 64-bit TAs. Paged TAs (CFG_PAGED_USER_TA=y) cannot currently be unwound, because the code is not ready to handle the page faults that might occur as the unwinding tables are accessed. CFG_CORE_UNWIND is renamed to CFG_UNWIND since it enables both the kernel and user TA stack dumps. It is still set automatically when CFG_TEE_CORE_DEBUG=y. 32-bit user TAs have to be compiled with `-funwind-tables`, otherwise the call stack can't be unwound and the abort reports will not show a call stack .The TA dev kit takes care of adding this flag automatically when CFG_UNWIND=y. Signed-off-by: Jerome Forissier <[email protected]> Tested-by: Jerome Forissier <[email protected]> (HiKey) Reviewed-by: Jens Wiklander <[email protected]>
Signed-off-by: Jerome Forissier <[email protected]> Reviewed-by: Jens Wiklander <[email protected]>
In the TA abort message that is sent to the console when a user-mode TA crashes, there is currently no clear indication of whether the TA was running in 32-bit or 64-bit mode. Add it since it will be useful to develop parsing tools. Signed-off-by: Jerome Forissier <[email protected]> Reviewed-by: Jens Wiklander <[email protected]>
Add a helper script to decode call stacks shown in abort messages. The script relies on addr2line to convert virtual addresses to debug information: 'function at file:line'. Signed-off-by: Jerome Forissier <[email protected]> Acked-by: Jens Wiklander <[email protected]> Reviewed-by: Joakim Bech <[email protected]>
Add TA call stack to the abort message. Here is an example:
A script is also added to help decode the stack dump: