Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

liballoc_system without #[global_allocator] uses jemalloc #45966

Closed
SimonSapin opened this issue Nov 13, 2017 · 17 comments
Closed

liballoc_system without #[global_allocator] uses jemalloc #45966

SimonSapin opened this issue Nov 13, 2017 · 17 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@SimonSapin
Copy link
Contributor

This was found by @RalfJung in #45955.

alloc_system::System::alloc calls libc::malloc which is defined as:

extern {
    pub fn malloc(size: size_t) -> *mut c_void;
}

When #[global_allocator] is not used, the current default for executables is alloc_jemalloc, which links jemalloc as configured in src/liballoc_jemalloc/build.rs. On most platforms, this is without --with-jemalloc-prefix, which causes jemalloc to define an unprefixed malloc symbol that "overrides" libc’s and ends up being used by alloc_system.

So alloc_system doesn’t do what the name suggests, in this situation.

Like https://github.com/alexcrichton/jemallocator/issues/19 this problem will disappear when alloc_jemalloc is eventually removed, but in the meantime alloc_system doesn’t always do what the name says it does.

We stopped prefixing jemalloc symbols in #31460 in order to make LLVM use them. Could we perhaps only do this when compiling a compiler? Or perhaps the compiler could switch to using the jemallocator crate, which would gain a Cargo feature flag to disable prefixing? It would do something like that anyway to keep that LLVM+jemalloc benefit when alloc_jemalloc is removed and the default for executables is changed to alloc_system.

CC @alexcrichton

@SimonSapin
Copy link
Contributor Author

@alexcrichton wrote #46117 (comment)

Unfortunately I think it's basically just impossible to fix System.alloc calling jemalloc because of linker trickery. Our only recourse is to remove jemalloc.

Do you mean impossible while also making LLVM use jemalloc? If we revert (relevant parts of) #31460 and go back to always compiling jemalloc with --with-prefix=something, wouldn’t that fix this issue? What’s the plan for LLVM once the alloc_jemalloc crate is removed?

@sfackler
Copy link
Member

Prefixing jemalloc symbols would resolve this issue, but I would argue that would be a net negative. It's not great to have two competing allocators both running at the same time in a single process.

As long as alloc_jemalloc/jemallocator/whatever is configured to have non-prefixed symbols, LLVM will continue to use it AFAIK.

@alexcrichton
Copy link
Member

@SimonSapin ah yeah sorry what I meant was we could indeed prefix the symbols but the intention is to get LLVM to use jemalloc (as it makes it ~10% faster historically). I think our long term plan is to remove jemalloc from libtsd but leave it in the compiler, so rustc itself will still use jemalloc but Rust programs by default will not.

@sfackler
Copy link
Member

If you really want to talk to specifically glibc malloc, you can link to __libc_malloc, but that's probably not a portable thing.

@SimonSapin
Copy link
Contributor Author

@alexcrichton Right, I think that split between rustc and std is probably the best eventual outcome.

@sfackler Interesting. I think we still want alloc_system to call plain malloc, though. For example Firefox redefines malloc (in a fork of an old version of jemalloc) and expects Rust dynamic libraries to use it.

@RalfJung
Copy link
Member

It's not great to have two competing allocators both running at the same time in a single process.

I can see that point. But then we should not be providing APIs that pretend to do this, while they actually do not.

@SimonSapin
Copy link
Contributor Author

SimonSapin commented Nov 21, 2017

TL;DR: let’s switch rustc to unprefixed jemallocator and restore symbol prefixes in alloc_jemalloc now?


So, we’re discussing a number of desirable but apparently competing points. I think we can have our cake and eat it too. I’m gonna name them to untangle everything without repeating lengthy phrases over and over.

  • A. Stable users can choose to have std::heap::Heap in Rust executables use jemalloc
  • B. Stable users can choose to have std::heap::Heap in Rust executables use the system allocator
  • D. Rustc and LLVM-in-rustc use jemalloc (for that 10% perf improvement)
  • E. C/C++ libraries that uses malloc linked with a Rust program end up using the same allocator as std::heap::Heap
  • F. std::heap::System uses the system allocator even if not selected for std::heap::Heap with #[global_allocator]: fixing this bug

Currently we have D and E on some platforms since alloc_jemalloc configures jemalloc without a symbol prefix, and A since it’s the default.

There’s a number of changes we can make. These are not blocked as far as I can tell:

  • 1. Make alloc_jemalloc always configure jemalloc with a symbol prefix.
  • 2. Remove alloc_jemalloc and make B the default
  • 3. Add an unprefixed Cargo feature to https://crates.io/crates/jemalloc-sys (and perhaps forward it through https://crates.io/crates/jemallocator) so that it configures jemalloc without a symbol prefix. (The default would still be --with-prefix=_rjem_.)
  • 4. Make rustc use jemallocator with 3 instead of alloc_jemalloc

This is blocked on API design decisions:

With 1 alone we gain F, but lose D and E.

With 2 alone we gain B and F, but lose A, D, and E. Which of A or B is more desirable (if we have to choose) is debatable. Adding 5 restores A.

#33082 (comment) suggests that doing 2 is planned, but not until until 5 is solved (presumably to avoid losing A). At that point we’ll likely also want to avoid losing D (or E). Doing 3 and 4 seems to be to be the easiest way to achieve that.

However we don’t need to wait for 5 to be stabilized before doing 3 or 4. So I suggest doing 3, 4, then 1 now-ish. Both in order to fix F, and to be ready to do 2 later when 5 is unblocked. The only point we would temporarily lose is E for stable users.

@alexcrichton, @sfackler, what do you think?

@alexcrichton
Copy link
Member

Seems plausible to me!

I'm remembering now though that this probably won't work unfortunately. The standard library is created as a dynamic library which currently fixes the allocator to jemalloc (alloc_jemalloc that is). That will be required to get fixed first before we can have alloc_jemalloc and jemallocator

@SimonSapin
Copy link
Contributor Author

Any mention of "use jemallocator" above implies selecting it with #[global_allocator]. My understanding is that alloc_jemalloc and its copy of jemalloc are only linked when #[global_allocator] is not used. Otherwise jemallocator could never be used at all.

@alexcrichton
Copy link
Member

Er yes I think I understand what you're advocating for, and it sounds like a great plan. What I mean is that if you do it you'll get a compile time error and it will fail to compile.

@SimonSapin
Copy link
Contributor Author

I don’t understand what error that would be.

@alexcrichton
Copy link
Member

Er yes that's what I mean by "fixing the allocator" and "required to get fixed", it's a bit obscure... Turns out it's not a compile time error (but it should be right now) but you can observe the behavior with:

#![feature(global_allocator, allocator_api)]

use std::heap::*;

#[global_allocator]
static A: B = B;

struct B;

static mut HIT: bool = false;

unsafe impl<'a> Alloc for &'a B {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr> {
        HIT = true;
        System.alloc(layout)
    }
    unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout) {
        HIT = true;
        System.dealloc(ptr, layout)
    }
}

fn main() {
    println!("hello!");
    assert!(unsafe { HIT });
}

That program "accidentally" succeeds on Linux due to how the dynamic linker works but it fails on Windows where the dynamic linker works differently.

@alexcrichton
Copy link
Member

Er, when compiling that program with -C prefer-dynamic, that is, like how the compiler is compiled.

@alexcrichton
Copy link
Member

This is a pretty old issue now and I think is something we're probably not going to change to preserve performance of rustc, so I'm going to close this in favor of #36963 where we'll intentionally be routing malloc through jemalloc in the compiler specifically because we want LLVM to use jemalloc

@RalfJung
Copy link
Member

@alexcrichton Just to be clear, it would still be wrong if outside the compiler, calling System::alloc would invoke jemmalloc, right? You just consider fixing that behavior part of #36963 now?

@alexcrichton
Copy link
Member

@RalfJung it's not really wrong per se in the sense that this is how jemalloc/Linux are designed. The "system allocator" on Linux isn't glibc malloc but rather the malloc/free symbols. It just so happens that jemalloc's build by default overrides those symbols.

This is also intentional behavior that we desired on OSX/Linux because we want LLVM's usage of malloc and free to get routed to one global allocator, jemalloc, which is shared with rustc

@SimonSapin
Copy link
Contributor Author

It’s not wrong for System::alloc to use jemalloc is configured to hook itself into the system allocator. However it’s definitely unexpected IMO that Rust does so by default. #55238 should fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one.
Projects
None yet
Development

No branches or pull requests

5 participants