Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prepend signal number and thread ID on backtraces #54380

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

d-netto
Copy link
Member

@d-netto d-netto commented May 6, 2024

These extra information on the backtraces proved useful for us when filtering which servers failed due to a segmentation fault, for instance, and when diagnosing crashes in production more in general.

@d-netto d-netto requested a review from kpamnany May 6, 2024 18:00
@d-netto d-netto force-pushed the dcn-prepend-sigid-and-tid-on-backtraces branch 2 times, most recently from 62870e7 to 9230640 Compare May 6, 2024 18:12
src/stackwalk.c Outdated Show resolved Hide resolved
src/stackwalk.c Outdated Show resolved Hide resolved
@d-netto d-netto force-pushed the dcn-prepend-sigid-and-tid-on-backtraces branch from fa1bf20 to 37b4410 Compare May 6, 2024 20:58
@d-netto d-netto force-pushed the dcn-prepend-sigid-and-tid-on-backtraces branch from 37b4410 to 1a0a83d Compare May 6, 2024 20:59
}
// do not call jl_threadid if there's no current task
if (jl_get_current_task()) {
ctid = jl_threadid() + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is called from jlbacktracet(jl_task_t *t) which may be a task that is not on the current thread.
See jl_print_task_backtraces.

For profiling we store this information as metadata on the backtrace.

ctid = jl_threadid() + 1;
}
snprintf(pre_str, 64, "thread (%d) ", ctid);
jl_print_native_codeloc(pre_str, (uintptr_t)ip);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be more confusing than helpful. It is not guaranteed that the ip is currently executed on the current thread?

@vchuravy
Copy link
Member

vchuravy commented May 6, 2024

Can you show some examples where this is useful and how the format ended up?
Would it be possible to add tests?

@kpamnany
Copy link
Contributor

kpamnany commented May 7, 2024

This is how we're currently solving #51056. You can see what it looks like in #51659.

The thread ID prefix is simply to help disambiguate stack traces if they're reported concurrently. It is not informational, i.e. it isn't intended to be the thread ID on which the stack trace was gathered (which is not especially useful information anyway).

Jameson suggested using the task pointer instead but I feel that is even less useful.

@kpamnany
Copy link
Contributor

kpamnany commented May 7, 2024

not especially useful information anyway).

Now that I think about it, it would be useful information. We'll look into #41742, thanks for the pointer!

@tecosaur tecosaur added display and printing Aesthetics and correctness of printed representations of objects. error messages Better, more actionable error messages labels May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
display and printing Aesthetics and correctness of printed representations of objects. error messages Better, more actionable error messages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants