Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ELF info v4 #10

Closed
wants to merge 2 commits into from
Closed

ELF info v4 #10

wants to merge 2 commits into from

Conversation

abusque
Copy link
Contributor

@abusque abusque commented Aug 28, 2015

This patch set implements a minimal ELF parser to add extra
information necessary for callsite analysis to the statedump and dl
helper.

Also included are unit tests covering x86, x86_64, armeb, and aarch_64
targets.

Changes from version 3:

  • Handle differing values of p_align between segments when computing memsz
  • Standardize use of offset_align
  • Use defines for ELF alignment values instead of constants
  • Minor style issues

Changes from version 2:

  • Fixed coding style issues
  • More rigorous error handling
  • Field reordering in tracepoints and structs for optimization

Changes from version 1:

  • More uniform and thorough error handling (i.e. always return -1 on failure,
    0 on success)
  • Make distinction between information not found in ELF file and
    actual errors (found status is returned through an out parameter on
    function success)
  • Functions have no side-effects on failure
  • Add more cases to the unit tests for better coverage
  • Minor style issues

Implement a minimal ELF parser allowing to compute an executable's
in-memory size and to read its build id and debug link info, if any,
at runtime. This is performed during base address statedump, and on
dlopen when using the `liblttng-ust-dl.so` helper.

This adds a `memsz` field to `lttng_ust_statedump:soinfo` and
`lttng_ust_dl:dlopen`. Also adds two events to both providers,
`build_id` and `debug_link`, which are only traced when the
corresponding information is found in the executable file. They can be
matched with the corresponding `soinfo` or `dlopen` via the `baddr`
field shared amongst all the events. Build ID and debug link are
standard methods of identifying and retrieving debug information
corresponding to a specific version of an executable.

The fields `size` and `mtime` from the existing `soinfo` and `dlopen`
events have been removed as they provided no valuable information for
analysis, whereas build ID or debug link allow unambiguous retrieval
of the debug information, and the on-disk `size` is superseded by the
in-memory `memsz`.

Signed-off-by: Antoine Busque <[email protected]>
This adds unit tests for UST's ELF parser. Also included are test ELF
files for multiple architectures (x86, x86_64, armeb, aarch64_be). The
procedure to generate these test files is described in
`tests/ust-elf/README.md`.

Signed-off-by: Antoine Busque <[email protected]>
@compudj
Copy link
Contributor

compudj commented Aug 29, 2015

merged into master, thanks!

@abusque abusque closed this Aug 29, 2015
@abusque abusque deleted the elf-info-v4 branch August 31, 2015 01:47
compudj pushed a commit that referenced this pull request Dec 9, 2021
Observed issue
==============

Applications which transitively dlopen() a library which, in turn,
dlopen() providers crash when they are compiled with clang or
if LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP is defined.

  Core was generated by `././myapp.exe'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x00007fa94f860bc2 in check_event_provider (probe_desc=<optimized out>) at lttng-probes.c:153
  153				if (!check_type_provider(field->type)) {
  [Current thread is 1 (Thread 0x7fa94fcbc740 (LWP 511754))]

  (gdb) bt
  #0  0x00007fa94f860bc2 in check_event_provider (probe_desc=<optimized out>) at lttng-probes.c:153
  #1  lttng_ust_probe_register (desc=0x7fa94fe9dc80 <lttng_ust__probe_desc___embedded_sys>)
      at lttng-probes.c:242
  #2  0x00007fa94fe9ba3c in lttng_ust__tracepoints__ptrs_destroy ()
      at /usr/include/lttng/tracepoint.h:590
  #3  0x00007fa94fedfe2e in call_init () from /lib64/ld-linux-x86-64.so.2
  #4  0x00007fa94fedff1c in _dl_init () from /lib64/ld-linux-x86-64.so.2
  #5  0x00007fa94fdf7d45 in _dl_catch_exception () from /usr/lib/libc.so.6
  #6  0x00007fa94fee420a in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
  #7  0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #8  0x00007fa94fee39bb in _dl_open () from /lib64/ld-linux-x86-64.so.2
  #9  0x00007fa94fe8d36c in ?? () from /usr/lib/libdl.so.2
  #10 0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #11 0x00007fa94fdf7db3 in _dl_catch_error () from /usr/lib/libc.so.6
  #12 0x00007fa94fe8db99 in ?? () from /usr/lib/libdl.so.2
  #13 0x00007fa94fe8d3f8 in dlopen () from /usr/lib/libdl.so.2
  #14 0x00007fa94fecc647 in mon_constructeur () at mylib.cpp:20
  #15 0x00007fa94fedfe2e in call_init () from /lib64/ld-linux-x86-64.so.2
  #16 0x00007fa94fedff1c in _dl_init () from /lib64/ld-linux-x86-64.so.2
  #17 0x00007fa94fdf7d45 in _dl_catch_exception () from /usr/lib/libc.so.6
  #18 0x00007fa94fee420a in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
  #19 0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #20 0x00007fa94fee39bb in _dl_open () from /lib64/ld-linux-x86-64.so.2
  #21 0x00007fa94fe8d36c in ?? () from /usr/lib/libdl.so.2
  #22 0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #23 0x00007fa94fdf7db3 in _dl_catch_error () from /usr/lib/libc.so.6
  #24 0x00007fa94fe8db99 in ?? () from /usr/lib/libdl.so.2
  #25 0x00007fa94fe8d3f8 in dlopen () from /usr/lib/libdl.so.2
  #26 0x00005594f478c18c in main ()

Cause
=====

Building tracepoint instrumentation as C++ using clang causes
LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP to be defined due to a
compiler version detection problem addressed by another patch.

However, building with LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP
defined still results in the crash.

When LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP is defined, the
lttng_ust_event_field lttng_ust__event_fields__[...] structure is
initialized by dynamically-allocating field structures for the various
fields.

As the initialization can't be performed statically, it is performed at
run-time _after_ the execution of the library constructors has
completed.

Moreover, the generated initialization
function of the provider (lttng_ust__events_init__[...]) is declared as being a library
constructor. Hence, this causes it to run before the
tracepoint fields structures has a chance to be initialized.

This all results in a NULL pointer dereference during the validation of
the fields.

Solution
========

When building providers as C++, the initialization function is defined
as the constructor of a class. This class is, in turn, instantiated in
an anonymous namespace.

For the purposes of this patch, the use of an anonymous namespace is
equivalent to declaring the instance as 'static', but it is preferred in
C++11.

Known drawbacks
===============

None.

References
==========

A reproducer is available:
https://github.com/jgalar/ust-clang-reproducer

Problem initially reported on dotnet/runtime's issue tracker:
dotnet/runtime#62398

Relevant LTTng-UST issue:
https://bugs.lttng.org/issues/1339

Fixes: #1339
Change-Id: I51cfbe74729bd45e2613a30bc8de17e08ea8233d
Signed-off-by: Jérémie Galarneau <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
compudj pushed a commit that referenced this pull request Dec 9, 2021
Observed issue
==============

Applications which transitively dlopen() a library which, in turn,
dlopen() providers crash when they are compiled with clang or
if LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP is defined.

  Core was generated by `././myapp.exe'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x00007fa94f860bc2 in check_event_provider (probe_desc=<optimized out>) at lttng-probes.c:153
  153				if (!check_type_provider(field->type)) {
  [Current thread is 1 (Thread 0x7fa94fcbc740 (LWP 511754))]

  (gdb) bt
  #0  0x00007fa94f860bc2 in check_event_provider (probe_desc=<optimized out>) at lttng-probes.c:153
  #1  lttng_ust_probe_register (desc=0x7fa94fe9dc80 <lttng_ust__probe_desc___embedded_sys>)
      at lttng-probes.c:242
  #2  0x00007fa94fe9ba3c in lttng_ust__tracepoints__ptrs_destroy ()
      at /usr/include/lttng/tracepoint.h:590
  #3  0x00007fa94fedfe2e in call_init () from /lib64/ld-linux-x86-64.so.2
  #4  0x00007fa94fedff1c in _dl_init () from /lib64/ld-linux-x86-64.so.2
  #5  0x00007fa94fdf7d45 in _dl_catch_exception () from /usr/lib/libc.so.6
  #6  0x00007fa94fee420a in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
  #7  0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #8  0x00007fa94fee39bb in _dl_open () from /lib64/ld-linux-x86-64.so.2
  #9  0x00007fa94fe8d36c in ?? () from /usr/lib/libdl.so.2
  #10 0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #11 0x00007fa94fdf7db3 in _dl_catch_error () from /usr/lib/libc.so.6
  #12 0x00007fa94fe8db99 in ?? () from /usr/lib/libdl.so.2
  #13 0x00007fa94fe8d3f8 in dlopen () from /usr/lib/libdl.so.2
  #14 0x00007fa94fecc647 in mon_constructeur () at mylib.cpp:20
  #15 0x00007fa94fedfe2e in call_init () from /lib64/ld-linux-x86-64.so.2
  #16 0x00007fa94fedff1c in _dl_init () from /lib64/ld-linux-x86-64.so.2
  #17 0x00007fa94fdf7d45 in _dl_catch_exception () from /usr/lib/libc.so.6
  #18 0x00007fa94fee420a in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
  #19 0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #20 0x00007fa94fee39bb in _dl_open () from /lib64/ld-linux-x86-64.so.2
  #21 0x00007fa94fe8d36c in ?? () from /usr/lib/libdl.so.2
  #22 0x00007fa94fdf7ce8 in _dl_catch_exception () from /usr/lib/libc.so.6
  #23 0x00007fa94fdf7db3 in _dl_catch_error () from /usr/lib/libc.so.6
  #24 0x00007fa94fe8db99 in ?? () from /usr/lib/libdl.so.2
  #25 0x00007fa94fe8d3f8 in dlopen () from /usr/lib/libdl.so.2
  #26 0x00005594f478c18c in main ()

Cause
=====

Building tracepoint instrumentation as C++ using clang causes
LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP to be defined due to a
compiler version detection problem addressed by another patch.

However, building with LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP
defined still results in the crash.

When LTTNG_UST_ALLOCATE_COMPOUND_LITERAL_ON_HEAP is defined, the
lttng_ust_event_field lttng_ust__event_fields__[...] structure is
initialized by dynamically-allocating field structures for the various
fields.

As the initialization can't be performed statically, it is performed at
run-time _after_ the execution of the library constructors has
completed.

Moreover, the generated initialization
function of the provider (lttng_ust__events_init__[...]) is declared as being a library
constructor. Hence, this causes it to run before the
tracepoint fields structures has a chance to be initialized.

This all results in a NULL pointer dereference during the validation of
the fields.

Solution
========

When building providers as C++, the initialization function is defined
as the constructor of a class. This class is, in turn, instantiated in
an anonymous namespace.

For the purposes of this patch, the use of an anonymous namespace is
equivalent to declaring the instance as 'static', but it is preferred in
C++11.

Known drawbacks
===============

None.

References
==========

A reproducer is available:
https://github.com/jgalar/ust-clang-reproducer

Problem initially reported on dotnet/runtime's issue tracker:
dotnet/runtime#62398

Relevant LTTng-UST issue:
https://bugs.lttng.org/issues/1339

Fixes: #1339
Change-Id: I51cfbe74729bd45e2613a30bc8de17e08ea8233d
Signed-off-by: Jérémie Galarneau <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
mjeanson pushed a commit to mjeanson/lttng-ust that referenced this pull request Oct 5, 2022
Eliminate the following deadlock by calling libc close directly when
called from within the 2.12 fd tracker locking. The 2.13 locks always
need to be taken outside of the 2.12 locks. In the scenario below,
Thread 3 (lttng-ust 2.12 listener thread) takes the locks in the wrong
order.

                Thread 3 (Thread 0x7f7679247700 (LWP 621683) "hello-ust"):
                #0  __lll_lock_wait (futex=futex@entry=0x7f767a392b60 <ust_safe_guard_fd_mutex>, private=0) at lowlevellock.c:52
                lttng#1  0x00007f767a39d843 in __GI___pthread_mutex_lock (mutex=0x7f767a392b60 <ust_safe_guard_fd_mutex>) at ../nptl/pthread_mutex_lock.c:80
-> 2.13 locks fd tracker
2.13            lttng#2  0x00007f767a37ff82 in lttng_ust_lock_fd_tracker_orig () at fd-tracker.c:163
2.13            lttng#3  0x00007f767a380b66 in lttng_ust_safe_close_fd_orig (fd=3, close_cb=0x7f767a56b070 <__GI___close>) at fd-tracker.c:385
2.13            lttng#4  0x00007f767a6e5557 in close (fd=3) at lttng-ust-fd.c:101
2.12            lttng#5  0x00007f767a297e7d in ustcomm_connect_unix_sock (pathname=0x7f767a32bc64 <local_apps+36> "/home/compudj/.lttng/lttng-ust-sock-8", timeout=3000) at lttng-ust-comm.c:157
-> 2.12 listener thread locks fd tracker
-> 2.12 listener thread locks ust_lock
2.12            lttng#6  0x00007f767a2a1f77 in ust_listener_thread (arg=0x7f767a32bc40 <local_apps>) at lttng-ust-comm.c:1591
                lttng#7  0x00007f767a39aea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
                lttng#8  0x00007f767a57aaef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

                Thread 2 (Thread 0x7f7678a46700 (LWP 621682) "hello-ust"):
                #0  __lll_lock_wait (futex=futex@entry=0x7f767a32f780 <ust_mutex>, private=0) at lowlevellock.c:52
                lttng#1  0x00007f767a39d843 in __GI___pthread_mutex_lock (mutex=0x7f767a32f780 <ust_mutex>) at ../nptl/pthread_mutex_lock.c:80
2.12            lttng#2  0x00007f767a29da59 in ust_lock () at lttng-ust-comm.c:167
2.12            lttng#3  0x00007f767a2a1d95 in ust_listener_thread (arg=0x7f767a329be0 <global_apps>) at lttng-ust-comm.c:1558
                lttng#4  0x00007f767a39aea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
                lttng#5  0x00007f767a57aaef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

                Thread 1 (Thread 0x7f767a33a040 (LWP 621681) "hello"):
                #0  __lll_lock_wait (futex=futex@entry=0x7f767a32f720 <ust_safe_guard_fd_mutex>, private=0) at lowlevellock.c:52
                lttng#1  0x00007f767a39d843 in __GI___pthread_mutex_lock (mutex=0x7f767a32f720 <ust_safe_guard_fd_mutex>) at ../nptl/pthread_mutex_lock.c:80
-> 2.12 lock fd tracker
2.12            lttng#2  0x00007f767a29c37c in lttng_ust_lock_fd_tracker () at lttng-ust-fd-tracker.c:153
2.12            lttng#3  0x00007f767a29ceba in lttng_ust_safe_close_fd (fd=4, close_cb=0x7f767a56b070 <__GI___close>) at lttng-ust-fd-tracker.c:341
2.13            lttng#4  0x00007f767a380b06 in lttng_ust_safe_close_fd_chain (fd=4, close_cb=0x7f767a56b070 <__GI___close>) at fd-tracker.c:348
2.13            lttng#5  0x00007f767a380b5c in lttng_ust_safe_close_fd_orig (fd=4, close_cb=0x7f767a56b070 <__GI___close>) at fd-tracker.c:379
2.13            lttng#6  0x00007f767a6e5557 in close (fd=4) at lttng-ust-fd.c:101
-> 2.13 lock fd tracker
2.13            lttng#7  0x00007f767a43b5e6 in lttng_ust_elf_destroy (elf=0x55b870f044c0) at elf.c:352
2.13            lttng#8  0x00007f767a3f2797 in get_elf_info (bin_data=0x7ffd5481ef70) at lttng-ust-statedump.c:296
2.13            lttng#9  0x00007f767a3f283f in extract_baddr (bin_data=0x7ffd5481ef70) at lttng-ust-statedump.c:319
2.13            lttng#10 0x00007f767a3f3173 in extract_bin_info_events (info=0x7ffd5481fff0, size=64, _data=0x7ffd54820098) at lttng-ust-statedump.c:518
                lttng#11 0x00007f767a5b3ad5 in __GI___dl_iterate_phdr (callback=0x7f767a3f2f5a <extract_bin_info_events>, data=0x7ffd54820098) at dl-iteratephdr.c:75
2.13            lttng#12 0x00007f767a3f32c5 in lttng_ust_dl_update_orig (ip=0x7f767a3cb599 <lttng_ust_ctor+683>) at lttng-ust-statedump.c:574
2.13            lttng#13 0x00007f767a3f33f7 in lttng_ust_statedump_init () at lttng-ust-statedump.c:638
2.13            lttng#14 0x00007f767a3cb599 in lttng_ust_ctor () at lttng-ust-comm.c:2246
2.13            lttng#15 0x00007f767a3ccaad in lttng_ust_after_fork_child (restore_sigset=0x7ffd548207e0) at lttng-ust-comm.c:2577
2.13            lttng#16 0x00007f767a6f687f in fork () at ustfork.c:271
                lttng#17 0x000055b8705f1034 in make_child ()
                lttng#18 0x000055b8705daacc in execute_command_internal ()
                lttng#19 0x000055b8705ddcf5 in execute_command ()
                lttng#20 0x000055b8705df951 in ?? ()
                lttng#21 0x000055b8705daf79 in execute_command_internal ()
                lttng#22 0x000055b8705ddcf5 in execute_command ()
                lttng#23 0x000055b8705c49db in reader_loop ()
                lttng#24 0x000055b8705c3668 in main ()

Getting the state of lttng-ust 2.12 ust_fd_mutex_nest TLS variable is
not straightforward, because it is a static variable (internal state).

The trick used here is to rely on the "lttng_ust_safe_close_fd" symbol
exported by lttng-ust 2.12. When passed fd=-1, with a close_cb which
simply returns 0, we can use the resulting return values to determine
the current state (zero vs nonzero) of the ust_fd_mutex_nest TLS
variable without any side-effect other than temporarily taking/releasing
the fd tracker lock when ust_fd_mutex_nest==0.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Change-Id: I6b95e0fd607abccd67aadf35ec96c19877394b16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants