Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Sanitizers support. #31605

Closed
wants to merge 3 commits into from
Closed

[WIP] Sanitizers support. #31605

wants to merge 3 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Feb 12, 2016

I was recently working on support for thread sanitizer and leak sanitizer in
rustc. This is still work in progress, so all kinds of comments, suggestions
and help is welcomed.

Leak sanitizer intercepts malloc and friends, records allocations (including
program backtrace at allocation time) and at program exit reports all leaks (in
atexit callback). Enabling it basically requires linking in lsan runtime
library when building final executable.

Supporting thread sanitizer is a little bit more complicated. Thread sanitizer
combines function interception with additional LLVM pass that instruments
atomic intrinsics, memory intrinsics (memcpy, ...), and memory operations,
replacing them with calls to tsan runtime. Full instrumentation is performed
only for functions with LLVM SanitizeThread attribute.

Address sanitizer and memory sanitizers are also included, but I am yet to
analze how clang and llvm supports them, so this could be quite broken on
anything but trivial programs.

Current state and things that are still left to be done:

  • Makefile support is missing. Sanitiers runtime libraries are built only when
    using rustbuild.
  • compiletest support is missing. It is not possible to skip execution of
    sanitizers tests when running on unsupported platform or if sanitizers have
    not been built at all. This blocked me from including tests dedicated for
    sanitizers.
  • Sanitizers intercept calls to standard memory allocation routines, so it does
    not work with liballoc_jemalloc (i.e., sanitizer is not aware of allocations
    done through jemalloc). I tried to alleviate this problem by making
    liballoc_system default allocator when used with sanitizers.
  • Thread sanitizer does not support unwinding. In particular it emits
    additional call on function entry and function exit to maintain so called
    shadow stack. During unwinding exit callback function is not called which
    leaves shadow stack inconsistent, or could even lead to shadow stack
    overflow. Though, it seems that it could be supported in future:
    TSAN does not support C++ exceptions? google/sanitizers#485.
  • Thread sanitizer does not support standalone memory fences which are used in
    current Arc implementation, and reports false positives. See following for
    more details:
    https://groups.google.com/d/topic/thread-sanitizer/B4i9EMQ4BQE/discussion
    https://groups.google.com/d/topic/thread-sanitizer/dZ3QiQE49ns/discussion
    Replacing fences in Arc by AcqRel on fetch_sub is one possible workaround
    (this is for example implementation used in boost:
    sp_counted_base_std_atomic.hpp, sp_counted_base_clang.hpp).
  • To work correctly everything that goes into final executable should be
    compiled with the same sanitizer flags. Unfortunately it is quite easy to get
    this wrong.

Two small examples how this works. First leak sanitizer:

$ cat l.rs 
fn main() {
    let mut v =  Vec::new();
    for i in 1..10 { v.push(i); }
    std::mem::forget(v);
}
$ rustc l.rs -C sanitize=leak -g
$ ./l 

=================================================================
==25162==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x55c831f200f9 in realloc /home/tm/dev/rust-sanitize/src/compiler-rt/lib/lsan/lsan_interceptors.cc:83
    #1 0x55c831f3c882 in heap::reallocate::h821891df4ae7e3e2cea /home/tm/dev/rust-sanitize/src/liballoc/heap.rs:79
    #2 0x55c831f3c6de in raw_vec::RawVec$LT$T$GT$::double::h15624233122824710853 /home/tm/dev/rust-sanitize/src/liballoc/raw_vec.rs:225
    #3 0x55c831f3c48d in vec::Vec$LT$T$GT$::push::h5050575373468619909 /home/tm/dev/rust-sanitize/src/libcollections/vec.rs:662
    #4 0x55c831f3b977 in main::h20f3bbfa5731dcbceaa /home/tm/dev/rust-sanitize/l.rs:3
    #5 0x55c831f42a49 in fnfn /home/tm/dev/rust-sanitize/src/libstd/panic.rs:260
    #6 0x55c831f42a49 in sys_common::unwind::try::try_fn::h6538920744052587894 /home/tm/dev/rust-sanitize/src/libstd/sys/common/unwind/mod.rs:127
    #7 0x55c831f403cb in __rust_try (/home/tm/dev/rust-sanitize/l+0x293cb)
    #8 0x55c831f3c9d9 in main (/home/tm/dev/rust-sanitize/l+0x259d9)

SUMMARY: LeakSanitizer: 64 byte(s) leaked in 1 allocation(s).

And thread sanitizer:

$ cat r.rs 
use std::thread;

static mut A : isize = 0;

fn main() {
    let t1 = thread::spawn(|| unsafe {
        A = 1;
    });
    let t2 = thread::spawn(|| unsafe {
        A = 2;
    });

    t1.join().unwrap();
    t2.join().unwrap();
}

$ rustc r.rs -C sanitize=thread -g
$ ./r 
==================
WARNING: ThreadSanitizer: data race (pid=25023)
  Write of size 8 at 0x0000014b4428 by thread T2:
    #0 main::_$u7b$$u7b$closure$u7d$$u7d$::closure.4792 /home/tm/dev/rust-sanitize/r.rs:10 (r+0x00000048d946)
    #1 std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::closure.4788 /home/tm/dev/rust-sanitize/src/libstd/thread/mod.rs:277 (r+0x00000048d69e)
    #2 sys_common::unwind::try::try_fn::h9274523788145261467 /home/tm/dev/rust-sanitize/src/libstd/sys/common/unwind/mod.rs:127 (r+0x00000048d65a)
    #3 __rust_try <null> (r+0x0000004937bb)
    #4 std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::closure.4785 /home/tm/dev/rust-sanitize/src/libstd/thread/mod.rs:277 (r+0x00000048d41c)
    #5 boxed::F.FnBox$LT$A$GT$::call_box::h16235982475798004568 /home/tm/dev/rust-sanitize/src/liballoc/boxed.rs:541 (r+0x00000048dc18)
    #6 sys::thread::Thread::new::thread_start::he0c01c02f3160f11sWx <null> (r+0x00000049539f)

  Previous write of size 8 at 0x0000014b4428 by thread T1:
    #0 main::_$u7b$$u7b$closure$u7d$$u7d$::closure.4752 /home/tm/dev/rust-sanitize/r.rs:7 (r+0x00000048c2e6)
    #1 std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::closure.4656 /home/tm/dev/rust-sanitize/src/libstd/thread/mod.rs:277 (r+0x00000048c03e)
    #2 sys_common::unwind::try::try_fn::h17278694213563056245 /home/tm/dev/rust-sanitize/src/libstd/sys/common/unwind/mod.rs:127 (r+0x00000048bffa)
    #3 __rust_try <null> (r+0x0000004937bb)
    #4 std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::closure.4628 /home/tm/dev/rust-sanitize/src/libstd/thread/mod.rs:277 (r+0x00000048bdbc)
    #5 boxed::F.FnBox$LT$A$GT$::call_box::h2072261010643383919 /home/tm/dev/rust-sanitize/src/liballoc/boxed.rs:541 (r+0x00000048c6d8)
    #6 sys::thread::Thread::new::thread_start::he0c01c02f3160f11sWx <null> (r+0x00000049539f)

  Location is global 'A::h8f386a2750644a91faa' of size 8 at 0x0000014b4428 (r+0x0000014b4428)

  Thread T2 (tid=25026, running) created by main thread at:
    #0 pthread_create /home/tm/dev/rust-sanitize/src/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:885 (r+0x00000040e5c6)
    #1 sys::thread::Thread::new::hd809cc2c968844a5ZOx /home/tm/dev/rust-sanitize/src/libstd/sys/unix/thread.rs:62 (r+0x0000004915f2)
    #2 thread::spawn::h12587892448715582896 /home/tm/dev/rust-sanitize/src/libstd/thread/mod.rs:317 (r+0x00000048c8e4)
    #3 main::hec4c6925d3037a95iaa /home/tm/dev/rust-sanitize/r.rs:9 (r+0x000000487d2d)
    #4 fnfn /home/tm/dev/rust-sanitize/src/libstd/panic.rs:260 (r+0x0000004960f9)
    #5 sys_common::unwind::try::try_fn::h6538920744052587894 /home/tm/dev/rust-sanitize/src/libstd/sys/common/unwind/mod.rs:127 (r+0x0000004960f9)
    #6 __libc_start_main <null> (libc.so.6+0x00000002086f)

  Thread T1 (tid=25025, finished) created by main thread at:
    #0 pthread_create /home/tm/dev/rust-sanitize/src/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:885 (r+0x00000040e5c6)
    #1 sys::thread::Thread::new::hd809cc2c968844a5ZOx /home/tm/dev/rust-sanitize/src/libstd/sys/unix/thread.rs:62 (r+0x0000004915f2)
    #2 thread::spawn::h8606048922863295027 /home/tm/dev/rust-sanitize/src/libstd/thread/mod.rs:317 (r+0x000000487eb4)
    #3 main::hec4c6925d3037a95iaa /home/tm/dev/rust-sanitize/r.rs:6 (r+0x000000487d24)
    #4 fnfn /home/tm/dev/rust-sanitize/src/libstd/panic.rs:260 (r+0x0000004960f9)
    #5 sys_common::unwind::try::try_fn::h6538920744052587894 /home/tm/dev/rust-sanitize/src/libstd/sys/common/unwind/mod.rs:127 (r+0x0000004960f9)
    #6 __libc_start_main <null> (libc.so.6+0x00000002086f)

SUMMARY: ThreadSanitizer: data race /home/tm/dev/rust-sanitize/r.rs:10 in main::_$u7b$$u7b$closure$u7d$$u7d$::closure.4792
==================
ThreadSanitizer: reported 1 warnings

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Aatch (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@brson
Copy link
Contributor

brson commented Feb 12, 2016

To work correctly everything that goes into final executable should be
compiled with the same sanitizer flags. Unfortunately it is quite easy to get
this wrong.

@alexcrichton Another case for additional standard library bins, and custom built std.

@brson
Copy link
Contributor

brson commented Feb 12, 2016

This is so good.

lgtm. r? @alexcrichton

@rust-highfive rust-highfive assigned alexcrichton and unassigned Aatch Feb 12, 2016
@alexcrichton
Copy link
Member

This is also quite exciting for me to see as well, thanks for taking this on @tmiasko! I've long wanted to see these for Rust :)

@@ -558,9 +584,10 @@ options! {CodegenOptions, CodegenSetter, basic_codegen_options,
"explicitly enable the cfg(debug_assertions) directive"),
inline_threshold: Option<usize> = (None, parse_opt_uint,
"set the inlining threshold for"),
sanitize: Option<Sanitize> = (None, parse_sanitize,
"choose the sanitizer to use"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I think it's fair to say that this is a relatively unstable option, so could this be behind the -Z flags instead of -C? Once we get more experience with it we may be able to promote it to -C

@alexcrichton
Copy link
Member

Reading this all it looks pretty good to me, my only point of hesitation might be the modifications to the linker backend. There's a few pieces of logic which make me uneasy:

  • Using a sanitizer forces usage of alloc_system
  • Using a sanitizer links in some extra C runtime code
  • Sanitizers have their own C dependencies

All of these are currently handled in the compiler, but I could also imagine a crate that looks like:

#![crate_name = "asan"]
#![cfg(sanitize = "address")] // crate is a noop without address sanitizer

extern crate alloc_system; // this only works if we use the system allocator

#[link(name = "rustc_asan", kind = "static")] // runtime support library
#[link(name = "pthread")] // native deps
#[link(name = "c")]
extern {}

// Any Rust-provided support necessary for this as well
#[no_mangle]
pub extern fn asan_runtime_foo() {}

Basically, using a crate for this would mean that all the linker pieces are taken care of, the alloc_system part is taken care of, and that's almost it! In the long run this also means that we can use Cargo to build this crate, and perhaps move the compiler-rt bits over into a build.rs instead of the build system itself. It's already a little unfortunate that compiler-rt is built as part of the build system and not as part of libcore.

In terms of usage, we could detect -Z sanitize=foo and basically inject linking to extern crate foo to pull in all the runtime support, or we could require an extern crate foo annotation in the crate itself.


Now all that being said, I could also see the argument that the compiler is injecting calls to this runtime support, so it should be the one providing this code in a more intrusive manner (similarly to how compiler-rt works today). Curious what you think, though? Does an organizational unit of a crate make sense?

@alexcrichton
Copy link
Member

Oh, some other thoughts:

  • Do any sanitization passes require instrumentation on behalf of rustc itself? For example should we be modifying codegen in one form or another? I seem to recall in the past that, for example, the most bang for our buck of the thread sanitizer happened if we did this, but we may get some nice benefits if we don't bother!
  • Depending on the failure mode of linking instrumented code to non-instrumented code, the compiler could emit this kind of info into the metadata to ensure it's either propagated or as a verification that all code is compiled correctly.

@alexcrichton
Copy link
Member

cc @japaric

if you're interested in moving compiler-rt to its own crate, then this may impact that as well

@ghost
Copy link
Author

ghost commented Feb 13, 2016

Regarding question of using separate crate, I think that would make sense. At
the same time, the link attribute alone does not seem to be enough to do this.
Crate could take over responsibility for inject alloc_system, but linking with
additional runtime libraries will not work without --no-as-needed (as long as
we use --as-needed flag by default, or someone else specifies it).

@ghost
Copy link
Author

ghost commented Feb 13, 2016

Leak sanitizer does not require any instrumentation at all. For thread
sanitizer, as far as I understand it, the LLVM instrumentation pass is all that
is necessary. On the other hand, clang does additional instrumentation for
others. For example clang poisons memory of object after destructor have been
run for memory sanitizer. For address sanitizer clang may also emit additional
padding.

Diagnosing mixing of instrumented code and non-instrumented one would be great!

@brson
Copy link
Contributor

brson commented Feb 13, 2016

Using a crate to handle the linker configuration is a cool idea, but it does require more things to configure - the compiler doesn't control everything, now you've got to link to something too. In the end it needs to be simple to activate both the compiler support and library support at once, on demand. @alexcrichton how would you see people activating the sanitizers?

@alexcrichton
Copy link
Member

I'd probably see one of two routes to be taken here:

  1. what's here is implemented
  2. Otherwise when the compiler sees -Z sanitize=address it injects a dependency from all crates compiled with asan support to the rustc_asan crate (which we ship). This is similar to how allocators work today.

@tmiasko could you elaborate on the --as-needed errors you were seeing? So long as rustc understands the dependency graph (e.g. the edges are appropriately drawn) then the linker arguments should always end up in the right order.

@ghost
Copy link
Author

ghost commented Feb 14, 2016

The general problem with as-needed is described here:
https://llvm.org/bugs/show_bug.cgi?id=15823

Example code uses clock_gettime which is defined in librt. This function is
provided by sanitizer runtime library to be intercepted. Linker finds it
in sanitizer lib, considers librt unused and does not link it in. At runtime
sanitizer uses dlsym with RTLD_NEXT to locate the real clock_gettime and fails.

Currently I can't reproduce this on my system, because glibc clock_* functions
are now available directly from libc. Though this or similar problem could
potentially still occur in a different setup.

I think we can try with crates, because this seems a little less invasive on
linking process and solves allocator problem quite neatly. And if it fails in
practice reconsider approach from this PR.

@ghost
Copy link
Author

ghost commented Feb 14, 2016

I rebased the code and addressed all comments but those regarding using crates.

Using crates turned out to be a little more involved than I thought initially.
In particular if those sanitizer crates where to depend on alloc_system, it
would be incorrect to just inject dependency to all crates compiled with
particular sanitizer. It would create cycles. It seems that similar machinery
to one already used for allocators would be needed to avoid those.

Alternatively injecting two kinds of dependencies should be done separately:

  • injecting alloc_system
  • injecting dependency on sanitizer crate (which does not depend on allocator)

@bors
Copy link
Contributor

bors commented Feb 20, 2016

☔ The latest upstream changes (presumably #31474) made this pull request unmergeable. Please resolve the merge conflicts.

@@ -636,6 +636,23 @@ pub fn run_passes(sess: &Session,
let mut modules_config = ModuleConfig::new(tm, sess.opts.cg.passes.clone());
let mut metadata_config = ModuleConfig::new(tm, vec!());

sess.opts.debugging_opts.sanitize.map(|s| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of style, this is typically expressed as:

if let Some(s) = sess.oopts.debuggings_opts.sanitize {
    // ...
}

@alexcrichton
Copy link
Member

Thanks @tmiasko!

it would be incorrect to just inject dependency to all crates compiled with particular sanitizer. It would create cycles.

This is assuming that we compile alloc_system itself with asan support, right? In theory that's a bit backwards, right? In theory the allocator itself can't really be compiled with asan support?

It may also just be possible to ensure that the sanitizer crate shows up at the end of the topological ordering to ensure it shows up at the end.

For the --as-needed support as well, we may want to just have smarter support for disabling it. If the sanitizers fundamentally don't support --as-needed then we can modify the linkers a bit to move the flag to a separate location which is only passed for some platforms in some situations. Does that make sense?

@ghost
Copy link
Author

ghost commented Feb 22, 2016

Yes, I was assuming that alloc_system would be compiled with sanitizer support
and instrumented. This is also how I used this so far, i.e., by instrumenting
everything (including allocators).

But we have to keep in mind that alloc_system is merely an interface to system
allocator, so we don't instrument implementation of malloc, but only a few
functions that call malloc. Otherwise as you say, it would be problematic.

The --as-needed option is a limitation, but only with respect to runtime
dependencies of sanitizers. In general, avoiding linking-in unused dependencies
is fine.

With respect to this PR, do expect me to make some additional changes (apart
from one mentioned inline, and making it merge-able of course)? Not sure what
is your take on this.

@alexcrichton
Copy link
Member

I guess I'm a little unclear on what the next steps for the PR is if it lands. Do we want to start moving towards stabilizing the -Z sanitize=foo flags? Do we want to start shipping nightlies with the sanitizers built?

To me the best way forward would be to land support in the compiler but through a mechanism that's as minimally invasive as possible. Along those lines I'd expect this to modify the pass manager we give to LLVM with some additional passes, but not much else. The changes to allocation crates as well as the addition of runtime dependencies/linker flags seems... unfortunate?

Overall I think that it'd probably be best to figure out a story for sanitizers in general before landing this. I can definitely see us starting to ship support built-in, but there may be bunch of other factors we need to take into account before doing that.

@ghost
Copy link
Author

ghost commented Mar 10, 2016

Ok, I am closing this for now. If at some point in a future cargo will support
rebuilding std crate with custom options we can revisit it then. Otherwise
using sanitizers is too inconvenient.

@ghost ghost closed this Mar 10, 2016
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants