avoid deadlock by skipping sampling in libc, libgcc and pthread #85

Merged
merged 11 commits into from
Nov 1, 2021
2 changes: 2 additions & 0 deletions Cargo.toml
@@ -24,6 +24,8 @@ nix = "0.23"
parking_lot = "0.11"
tempfile = "3.1"
thiserror = "1.0"
findshlibs = "0.10"
cfg-if = "1.0"

inferno = { version = "0.10", default-features = false, features = ["nameattr"], optional = true }
prost = { version = "0.9", optional = true }
13 changes: 13 additions & 0 deletions README.md
@@ -31,6 +31,13 @@ FRAME: backtrace::backtrace::trace::h3e91a3123a3049a5 -> FRAME: pprof::profiler:
FRAME: backtrace::backtrace::trace::h3e91a3123a3049a5 -> FRAME: pprof::profiler::perf_signal_handler::h7b995c4ab2e66493 -> FRAME: Unknown -> FRAME: prime_number::main::h47f1058543990c8b -> FRAME: std::rt::lang_start::{{closure}}::h4262e250f8024b06 -> FRAME: std::rt::lang_start_internal::{{closure}}::h812f70926ebbddd0 -> std::panicking::try::do_call::h3210e2ce6a68897b -> FRAME: __rust_maybe_catch_panic -> FRAME: std::panicking::try::h28c2e2ec1c3871ce -> std::panic::catch_unwind::h05e542185e35aabf -> std::rt::lang_start_internal::hd7efcfd33686f472 -> FRAME: main -> FRAME: __libc_start_main -> FRAME: _start -> FRAME: Unknown -> THREAD: prime_number 1
```


## Features

- `cpp` enables the cpp demangle.
- `flamegraph` enables the flamegraph report format.
- `protobuf` enables the pprof protobuf report format.

## Flamegraph

```toml
@@ -206,6 +213,12 @@ Unfortunately, there is no 100% robust stack tracing method. [Some related resea

> libgcc's unwind method is not safe to use from signal handlers. One particular cause of deadlock is when profiling tick happens when program is propagating thrown exception.

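This can be resolved by adding a blocklist, which makes the profiler skip any sample whose first frame falls inside one of the listed libraries: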

```rust
let guard = pprof::ProfilerGuardBuilder::default()
    .frequency(1000)
    .blocklist(&["libc", "libgcc", "pthread"])
    .build()
    .unwrap();
```
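
As a rough illustration of what such a blocklist implies, the sketch below resolves name patterns to address ranges up front and then tests a sample's first frame against those ranges. It is a minimal sketch under stated assumptions, not the crate's actual implementation: `LoadedLibrary`, `blocked_ranges`, and `should_skip` are hypothetical names, and the library list is passed in by hand rather than queried from the loader.

```rust
/// A loaded shared library and the address range it occupies.
/// Hypothetical stand-in for what a loader query would report.
#[derive(Debug, Clone)]
pub struct LoadedLibrary {
    pub name: String,
    pub start: usize, // inclusive
    pub end: usize,   // exclusive
}

/// Collect the address ranges of every library whose name contains
/// at least one of the blocklist strings (plain substring match).
pub fn blocked_ranges(libs: &[LoadedLibrary], blocklist: &[&str]) -> Vec<(usize, usize)> {
    libs.iter()
        .filter(|lib| blocklist.iter().any(|pat| lib.name.contains(pat)))
        .map(|lib| (lib.start, lib.end))
        .collect()
}

/// Return true when a sample should be skipped because its first frame
/// (the interrupted instruction address) lies in a blocked range.
pub fn should_skip(ranges: &[(usize, usize)], first_frame_addr: usize) -> bool {
    ranges
        .iter()
        .any(|&(start, end)| start <= first_frame_addr && first_frame_addr < end)
}
```

Resolving names to ranges once keeps the per-sample check down to integer comparisons, which matters if the check runs inside a signal handler. Note that a substring match on `"libc"` would also match names such as `libcrypto`, so patterns may need to be chosen with care.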

### Signal Safety

Signal safety is hard to guarantee. But it's not *that* hard.
34 changes: 34 additions & 0 deletions examples/backtrace_while_sampling.rs
@@ -0,0 +1,34 @@
// Copyright 2021 TiKV Project Authors. Licensed under Apache-2.0.

use pprof;
use std::fs::File;

fn deep_recursive(depth: i32) {
    if depth > 0 {
        deep_recursive(depth - 1);
    } else {
        backtrace::Backtrace::new();
    }
}

fn main() {
    let guard = pprof::ProfilerGuardBuilder::default()
        .frequency(1000)
        .blocklist(&["libc", "libgcc", "pthread"])
        .build()
        .unwrap();

    for _ in 0..10000 {
        deep_recursive(20);
    }

    match guard.report().build() {
        Ok(report) => {
            let file = File::create("flamegraph.svg").unwrap();
            report.flamegraph(file).unwrap();

            println!("report: {:?}", &report);
        }
        Err(_) => {}
    };
}
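
The hazard this example exercises can be illustrated without real signals. The sketch below assumes a hypothetical `UNWIND_LOCK` standing in for a non-reentrant lock taken during unwinding. It is not libgcc's or pprof-rs's actual code, and it uses `try_lock` rather than the blocklist from this PR. It shows why a sampling tick that arrives while the interrupted code holds such a lock has to skip the sample: blocking on the lock would never return.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a lock held inside an unwinder.
static UNWIND_LOCK: Mutex<()> = Mutex::new(());

/// Simulated sampling handler: takes a sample only if the lock is free.
/// Returns true when a sample was taken, false when it was skipped.
pub fn sample_from_handler() -> bool {
    match UNWIND_LOCK.try_lock() {
        // Lock was free: unwinding from the handler would be safe here.
        Ok(_guard) => true,
        // Lock is held by the interrupted code (or poisoned): skip the sample
        // instead of blocking, since blocking here would deadlock.
        Err(_) => false,
    }
}

/// Simulated application code that is in the middle of unwinding
/// (and therefore holds the lock) when a profiling tick arrives.
pub fn tick_while_unwinding() -> bool {
    let _held = UNWIND_LOCK.lock().unwrap();
    sample_from_handler()
}
```

Calling `sample_from_handler()` on its own takes the sample, while `tick_while_unwinding()` reports that the sample was skipped. The blocklist reaches a similar outcome by a different route: it skips samples based on where the interrupted code was executing.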
1 change: 1 addition & 0 deletions examples/malloc_hook.rs
@@ -7,6 +7,7 @@ use std::ffi::c_void;

#[cfg(not(target_os = "linux"))]
#[allow(clippy::wrong_self_convention)]
#[allow(non_upper_case_globals)]
static mut __malloc_hook: Option<extern "C" fn(size: usize) -> *mut c_void> = None;

extern "C" {
45 changes: 27 additions & 18 deletions src/collector.rs
@@ -247,12 +247,25 @@ impl<T: Hash + Eq + 'static> Collector<T> {
}
}

#[cfg(test)]
mod test_utils {
use super::*;
use std::collections::BTreeMap;

pub fn add_map(hashmap: &mut BTreeMap<usize, isize>, entry: &Entry<usize>) {
match hashmap.get_mut(&entry.item) {
None => {
hashmap.insert(entry.item, entry.count);
}
Some(count) => *count += entry.count,
}
}
}

#[cfg(test)]
mod tests {
use super::*;
use std::cell::RefCell;
use std::collections::BTreeMap;
use std::ffi::c_void;

#[test]
fn stack_hash_counter() {
@@ -272,15 +285,6 @@ mod tests {
});
}

fn add_map(hashmap: &mut BTreeMap<usize, isize>, entry: &Entry<usize>) {
match hashmap.get_mut(&entry.item) {
None => {
hashmap.insert(entry.item, entry.count);
}
Some(count) => *count += entry.count,
}
}

#[test]
fn evict_test() {
let mut stack_hash_counter = StackHashCounter::<usize>::default();
@@ -291,14 +295,14 @@ mod tests {
match stack_hash_counter.add(item, 1) {
None => {}
Some(evict) => {
add_map(&mut real_map, &evict);
test_utils::add_map(&mut real_map, &evict);
}
}
}
}

stack_hash_counter.iter().for_each(|entry| {
add_map(&mut real_map, &entry);
test_utils::add_map(&mut real_map, &entry);
});

for item in 0..(1 << 10) * 4 {
@@ -326,7 +330,7 @@ mod tests {
}

collector.try_iter().unwrap().for_each(|entry| {
add_map(&mut real_map, &entry);
test_utils::add_map(&mut real_map, &entry);
});

for item in 0..(1 << 12) * 4 {
@@ -341,10 +345,15 @@ mod tests {
}
}
}
}

#[cfg(not(target_os = "linux"))]
#[allow(clippy::wrong_self_convention)]
static mut __malloc_hook: Option<extern "C" fn(size: usize) -> *mut c_void> = None;
#[cfg(test)]
#[cfg(target_os = "linux")]
mod malloc_free_test {
use super::*;
use std::cell::RefCell;
use std::collections::BTreeMap;
use std::ffi::c_void;

extern "C" {
#[cfg(target_os = "linux")]
@@ -397,7 +406,7 @@ mod tests {
});

collector.try_iter().unwrap().for_each(|entry| {
add_map(&mut real_map, &entry);
test_utils::add_map(&mut real_map, &entry);
});

for item in 0..(1 << 10) * 4 {
25 changes: 20 additions & 5 deletions src/lib.rs
@@ -2,9 +2,9 @@

//! pprof-rs is an integrated profiler for Rust programs.
//!
//! This crate provides a programmable interface to start/stop/report a profiler dynamically. With the
//! help of this crate, you can easily integrate a profiler into your Rust program in a modern, convenient
//! way.
//! This crate provides a programmable interface to start/stop/report a profiler
//! dynamically. With the help of this crate, you can easily integrate a
//! profiler into your Rust program in a modern, convenient way.
//!
//! A sample usage is:
//!
@@ -21,7 +21,22 @@
//!};
//! ```
//!
//! You can find more details in [README.md](https://github.com/tikv/pprof-rs/blob/master/README.md)
//! More configuration can be passed through `ProfilerGuardBuilder`:
//!
//! ```rust
//! let guard = pprof::ProfilerGuardBuilder::default().frequency(1000).blocklist(&["libc", "libgcc", "pthread"]).build().unwrap();
//! ```
//!
//! `frequency` sets the sampling frequency, and `blocklist` makes the profiler
//! skip any sample whose first frame comes from a library whose name contains
//! one of these strings.
//!
//! Skipping `libc`, `libgcc` and `libpthread` is one way to avoid the possible
//! deadlock inside `_Unwind_Backtrace` while preserving signal safety.
//!
//! You can find more details in
//! [README.md](https://github.com/tikv/pprof-rs/blob/master/README.md)

/// Define the MAX supported stack depth. TODO: make this variable mutable.
pub const MAX_DEPTH: usize = 32;
@@ -40,7 +55,7 @@ mod timer;
pub use self::collector::{Collector, StackHashCounter};
pub use self::error::{Error, Result};
pub use self::frames::{Frames, Symbol};
pub use self::profiler::ProfilerGuard;
pub use self::profiler::{ProfilerGuard, ProfilerGuardBuilder};
pub use self::report::{Report, ReportBuilder};

#[cfg(feature = "flamegraph")]