Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move compiler-rt build into a crate dependency of libcore #34400

Closed
alexcrichton opened this issue Jun 21, 2016 · 42 comments · Fixed by #35021
Closed

Move compiler-rt build into a crate dependency of libcore #34400

alexcrichton opened this issue Jun 21, 2016 · 42 comments · Fixed by #35021
Labels
E-help-wanted Call for participation: Help is requested to fix this issue. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)

Comments

@alexcrichton
Copy link
Member

One of the major blockers of our dream to "lazily compile std" is to ensure that we have the ability to compile compiler-rt on-demand. This is a repository maintained by LLVM which contains a large set of intrinsics which LLVM lowers function calls down to on some platforms.

Unfortunately the build system of compiler-rt is a bit of a nightmare. We, at the time of the writing, have a large pile of hacks on its makefile-based build system to get things working, and it appears that LLVM has deprecated this build system anyway. We're trying to move to cmake but it's still unfortunately a nightmare compiling compiler-rt.

To solve both these problems in one fell swoop, @brson and I were chatting this morning and had the idea of moving the build entirely to a build script of libcore, and basically just using gcc-rs to compile compiler-rt instead of using compiler-rt's build system. This means we don't have to have LLVM installed (why does compiler-rt need llvm-config?) and cross-compiling should be much more robust/easy as we're driving the compiles, not working around an opaque build system.

To make matters worse in compiler-rt as well it contains code for a massive number of intrinsics we'll probably never use. And even worse these bits and pieces of code often cause compile failures which don't end up mattering in the end. To solve this problem we should just whitelist a set of intrinsics to build and ignore all others. This may be a bit of a rocky road as we discover some we should have compiled but forgot, but in theory we should be able to select a subset to compile and be done with it.

This may make updating compiler-rt difficult, but we've already only done it once in like the past year or two years, so we don't seem to need to do this too urgently. This is a worry to keep in mind, however.

Basically here's what I think we should do:

  • Add a build script to libcore, link gcc-rs into it
  • Compile select portions of compiler-rt as part of this build script, using gcc-rs
  • Disable injection of compiler-rt in the compiler

Staging this is still a bit up in the air, but I'm curious what others think about this as well.

cc @rust-lang/tools
cc @brson
cc @japaric

@brson
Copy link
Contributor

brson commented Jun 21, 2016

This can probably be done one platform at a time. Get it working on x86, then add a flag to the target spec that compiler rt isn't required.

While I want to do this, one thing that bugs me about it is that it ruins the 'purity' of core. Right now core is a pure-Rust library, really easy to build, although you can't do anything with it in practice without compiler-rt. Maybe there's nothing to do about it and I just need to accept that Rust isn't completely pure. For now, at least. Once we've got the build system transitioned and isolated just the parts of compiler-rt that core needs, we can start thinking about rewriting compiler-rt in Rust.

In light of those considerations, I think I'd prefer it if compiler-rt was its own crate that core depends on. Just seems like a cleaner division of responsibility.

@japaric
Copy link
Member

japaric commented Jun 21, 2016

I was going to open a similar issue because I have already started implementing this idea here (but compiler-rt in its own crate rather than under core) 😄. Which is I have been using to cross compile compiler-rt for ARM Cortex-M processors.

👍 from me. I pretty much agree with @alexcrichton assessment:

This means we don't have to have LLVM installed (why does compiler-rt need llvm-config?)

nods the user won't need to have cmake or make installed either.

This may make updating compiler-rt difficult, but we've already only done it once in like the past year or two years, so we don't seem to need to do this too urgently.

Agree. I might add that updating our hacks on top of their build system across compiler-rt updates is going to be equally difficult anyway.

re: steps

Compile select portions of compiler-rt as part of this build script, using gcc-rs

I think we can start with all the "generic" C routines. IME, those (cross) compile fine to all the architectures I have tried (x86, arm, mips). The arch-specific assembly (*.s) routines are more troublesome, for example cross compiling most of the arm/*.s routines to Cortex-M fail because they have been writing with the Cortex-A target in mind (the routines use ARM mode, which is not available in Cortex-M processors; the routines use Thumb instructions that are not available on ARMv6 Cortex-M processors; etc.). Those can be added in add as needed/tested.

Disable injection of compiler-rt in the compiler

I'm not quite sure what this means. Surely you don' mind stop passing the -lcompile-rt flag to the linker. But, rather remove the compiler-rt compilation step from bootstrap?

@brson

IMO core doesn't need to depend on building compiler-rt. no_std programs don't always depend on compiler-rt. In fact, I think that people doing x86 kernel dev don't bother compiling compiler-rt because they don't need the intrinsics. Also, you can get quite far building programs for embedded Cortex-M microcontrollers without using compiler-rt. For example, zinc doesn't use compiler-rt.

OTOH, compiler-rt is mandatory when you build programs that depend on std because the compiler passes -lcompiler-rt to the linker. Perhaps we could make compiler-rt an optional dependency of core?

@steveklabnik
Copy link
Member

In fact, I think that people doing x86 kernel dev don't bother compiling compiler-rt because they don't need the intrinsics.

Can confirm; I only use libcore, not compiler-rt.

@alexcrichton
Copy link
Member Author

@japaric

Those can be added in add as needed/tested

True yeah, I kinda would prefer to build nothing and then only add in intrinsics as we actually run into them. This may not be a great test, however, as we may accidentally pick up routines from libgcc all the time instead.

I'm not quite sure what this means

Oh yeah I just mean the compiler injecting -lcompiler-rt as it'd either be bundled in libcore or part of another library so this would just be redundant. This is just a minor technical detail, nothing too much to worry about.

In fact, I think that people doing x86 kernel dev don't bother compiling compiler-rt because they don't need the intrinsics.

Yeah it's true that most programs don't, but for us to be robust we actually need to deal with this. Functions like 64-bit multiplication with overflow on 32-bit end up lowering down to these intrinsics I believe, and a few others here and there.

Basically I don't think we can get by with not shipping a compiler-rt, so for tier 1 platforms we need to at least have some form of it somewhere. This should in theory then extend very nicely to embedded platforms as they should all be able to compile C anyway, so we just need to be sure the compiles actually succeed.

@japaric
Copy link
Member

japaric commented Jun 21, 2016

Also thinking far off in the future if we go down this road:

  • Are we going to build sanitizers (cf [WIP] Sanitizers support. #31605) this way (with gcc) too? The sanitizers live in compiler-rt as well. Actually this may make integration of santizers easier (as a Cargo feature) but may make compiler-rt updates hard. (I'm speculating)
  • As cross compilation of std and of compiler-rt becomes (trivially) easy. People are bound to get nasty (linker) errors if they cross compile compiler-rt to a target where the gcc-rs defaults the compiler to cc. We should prevent those problems (using cc to cross compile) in the build script and provide nice error messages: "please set the CC_my_target and AR_my_target env variables".

@alexcrichton
Copy link
Member Author

I would personally expect sanitizers to be present in separate crates, not dependencies of libcore (or part of libcore). I'd like to support them eventually but I'd also like to build them separately.

Also yeah I think that the fallback to cc is basically wrong in almost all situations so that should just be an error outright no matter what (with messages about env vars and such)

@brson
Copy link
Contributor

brson commented Jun 21, 2016

I'm not quite sure what this means. Surely you don' mind stop passing the -lcompile-rt flag to the linker. But, rather remove the compiler-rt compilation step from bootstrap?

I would expect this. Let's package it up the same as any other crate, so the compiler doesn't specifically know about it.

IMO core doesn't need to depend on building compiler-rt. no_std programs don't always depend on compiler-rt. In fact, I think that people doing x86 kernel dev don't bother compiling compiler-rt because they don't need the intrinsics. Also, you can get quite far building programs for embedded Cortex-M microcontrollers without using compiler-rt. For example, zinc doesn't use compiler-rt.

OTOH, compiler-rt is mandatory when you build programs that depend on std because the compiler passes -lcompiler-rt to the linker. Perhaps we could make compiler-rt an optional dependency of core?

I think @alexcrichton's idea was that no consumer of Rust should ever have to think about compiler-rt. It just comes with core as appropriate for the target, and everybody already depends on core. If a particular kernel project doesn't actually depend on the symbols defined in compiler-rt then that's fine - the linker will just throw them away, just like every other unused part of core. If a particular platform doesn't need compiler-rt at all, then core just won't link to it.

@japaric
Copy link
Member

japaric commented Jun 21, 2016

@brson I see what @alexcrichton meant now. Thanks for elaborating!

@alexcrichton

True yeah, I kinda would prefer to build nothing and then only add in intrinsics as we actually run into them.

So, start with nothing, run the full Rust test suite and crater, add missing symbols and repeat.

This may not be a great test, however, as we may accidentally pick up routines from libgcc all the time instead.

You mean that libgcc also provides intrinsics with the same name? And those get picked up instead of compiler-rt's?

Oh yeah I just mean the compiler injecting -lcompiler-rt as it'd either be bundled in libcore or part of another library so this would just be redundant. This is just a minor technical detail, nothing too much to worry about.

I see what you mean now. Yeah that would be better. It would let us drop (as in "stop caring about") the no-compiler-rt field. I think I tried something like this before but the linker dropped the libcompiler-rt.rlib symbols when linking a static no_std binary; it could have just been a linker argument ordering issue though.

I would personally expect sanitizers to be present in separate crates, not dependencies of libcore (or part of libcore). I'd like to support them eventually but I'd also like to build them separately.

I meant building the sanitizers with gcc-rs instead of with llvm's cmake build system. I agree they should not be part of libcore. They would probably end up in their own crate below the std crate.

@alexcrichton
Copy link
Member Author

So, start with nothing, run the full Rust test suite and crater, add missing symbols and repeat.

Crater may not be too much help here but if we can get past the auto bots and compile all necessary symbols for all platforms I suspect that'd be all we need. We tend to be pretty good about coverage of the standard library in general, although not bullet proof of course.

You mean that libgcc also provides intrinsics with the same name? And those get picked up instead of compiler-rt's?

Yeah I believe that at least some intrinsic names overlap, although perhaps not all of them. Most of the time you end up linking to libgcc in one way or another if you start including some C code, so we may require some intrinsic names that end up getting pulled in from libgcc instead of from compiler-rt as before.

In general though this is just a vague worry. If things compile and link just fine then it doesn't really matter too much where it comes from, presumably we'll solve it at some point.

I meant building the sanitizers with gcc-rs instead of with llvm's cmake build system. I agree they should not be part of libcore. They would probably end up in their own crate below the std crate.

Ah for these we'd probably want to use cmake. I know far less about the sanitizers and they seem like they would be more complicated. That and the sanitizers don't need to work on every platform rustc supports so we may not need to cut corners.

@vadimcn
Copy link
Contributor

vadimcn commented Jun 22, 2016

A crazy idea: could we translate all the builtins that we need into Rust? The .c code in compiler-rt doesn't look too complicated. Then we'd depend only on assembler (and we could even bundle llvm-as with Rust).

Of course, this will increase the cost of updating, but as @brson said, the builtins subset of compiler-rt seems to be pretty stable.

Edit: nevermind, brson has already proposed that. Serves me right for reading on a phone!

@alexcrichton
Copy link
Member Author

Yeah at this point I'm about ready to just throw in the towel on using vanilla upstream compiler-rt, it's too much of a pain so far and doesn't seem to have enough gain. If we can figure out what LLVM actually needs (in terms of intrinsics), I'd be totally ok adding those to libcore (or some other lib) and just going from there.

@japaric
Copy link
Member

japaric commented Jun 22, 2016

If we can figure out what LLVM actually needs (in terms of intrinsics)

I think this command yields most of (if not all) the intrinsics LLVM uses:

$ grep 'RTLIB::.*"' src/llvm/lib/Target/*/*Lowering.cpp

The output looks like this (Spoilers there are lots of them).

@vadimcn
Copy link
Contributor

vadimcn commented Jun 22, 2016

I think these are just platform-specific overrides. The full list is here.

@japaric
Copy link
Member

japaric commented Jun 22, 2016

The full list is here.

Nice find! I think not all of those intrinsics are used though. For example, __ashlhi3 only appears once in the (LLVM) code base and there's no implementation of it in the compiler-rt repo.

@vadimcn
Copy link
Contributor

vadimcn commented Jun 22, 2016

Could we simply pre-build all possible versions of compiler-rt and distribute it as a binary? I think the set of (CPU targets * calling conventions) is fixed by LLVM implementation, so it should be feasible if we used clang?

@alexcrichton
Copy link
Member Author

We tend to always run into trouble with distributions of binaries, and with things like target-specific flags and such I'd suspect that we can't really ship an end-all-be-all build of compiler-rt. I'd personally prefer to get it working building from source so we can just build everything from source eventually.

@brson
Copy link
Contributor

brson commented Jun 22, 2016

A crazy idea: could we translate all the builtins that we need into Rust?

This should definitely be part of the long term plan.

@Amanieu
Copy link
Member

Amanieu commented Jun 22, 2016

The 128-bit integer RFC requires rewriting some of the compiler-rt builtins in Rust.

@japaric
Copy link
Member

japaric commented Jun 23, 2016

@vadimcn

The full list is here.

Actually, that list doesn't contain the __aeabi_* intrinsics that LLVM emits when working with FPUless 32-bit ARM processors like the Cortex-M. I think that it doesn't include other platform specific intrinsics either but I'm not familiar with those.

@alexcrichton

Yeah I believe that at least some intrinsic names overlap ... so we may require some intrinsic names that end up getting pulled in from libgcc instead of from compiler-rt as before

I wrote a script to get a list of the intrinsics that overlap; the result is in this gist. Note that the LLVM intrinsics listed there are a (probably overestimated) guess; we haven't found a definite list of intrinsics that LLVM may emit on all platforms.

@vadimcn
Copy link
Contributor

vadimcn commented Jun 23, 2016

Actually, that list doesn't contain the _aeabi* intrinsics that LLVM emits when working with FPUless 32-bit ARM processors like the Cortex-M. I think that it doesn't include other platform specific intrinsics either but I'm not familiar with those.

The list I pointed to is a default initialization. The target backends then override some of the entries with platform-specific names, so all _aeabi stuff is in ARMISelLowering.cpp.

@alexcrichton alexcrichton added the T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) label Jul 1, 2016
@brson brson added the E-help-wanted Call for participation: Help is requested to fix this issue. label Jul 1, 2016
@brson
Copy link
Contributor

brson commented Jul 1, 2016

I tagged this with help wanted assuming no one is otherwise going to jump on this. Someone a bit familiar with how rustc and std interpenetrate could get started creating a compiler-rt crate and linking it in somewhere.

@cuviper
Copy link
Member

cuviper commented Jul 6, 2016

It may be a tangent, but would it be possible for this to use a system compiler-rt? Fedora has a compiler-rt package, and I verified its libclang_rt.builtins-x86_64.a has the same symbols.

@alexcrichton
Copy link
Member Author

@cuviper perhaps yeah, if we follow the route in this issue it may be relatively trivial with Cargo's library override support depending on how we do it.

@alexcrichton
Copy link
Member Author

We're upgrading compiler-rt in #34743 as part of an LLVM upgrade, and it's again unfortunately quite an ordeal due to the sheer number of patches we need to apply (many of which don't apply cleanly any more).

@cuviper
Copy link
Member

cuviper commented Jul 15, 2016

@alexcrichton Can you give a brief overview of what sort of compiler-rt patches there are? That will help me decide the feasibility of using the distro package. (FWIW, I already have makefile patches semi-working to allow external compiler-rt, and hope to send a PR soon.)

@cuviper
Copy link
Member

cuviper commented Jul 23, 2016

OK, sounds good, thanks.

@japaric
Copy link
Member

japaric commented Jul 24, 2016

Potential next steps:

I'm working on this. I think I got everything but the Makefiles done. I'm smoke testing right now by cross compiling std to arm-unknown-linux-gnueabi. I want to cross compile a binary that actually uses compiler-rt intrinsics.

@bstrie
Copy link
Contributor

bstrie commented Aug 2, 2016

One of the major blockers of our dream to "lazily compile std"

@alexcrichton Is there an open issue for this somewhere? I'd like to know what this means in more detail.

@alexcrichton
Copy link
Member Author

@bstrie it's this issue! We just need to be sure to lazily compile compiler-rt as part of cargo build of the standard library, but that's what this issue is about.

@briansmith
Copy link
Contributor

If compiler-rt were rewritten in Rust, then would non-Rust code be able to use the Rust-coded version? For example, let's say I've linked a program using 4 LLVM-based programming languages together. Would I have multiple copies of the compiler-rt logic linked into my program? That would suck. A much simpler solution to rewriting compiler-rt would be to just include llvm-as and clang in the Rust distribution, so that we can cross-compile the asm and C code when (re)building std. Besides eliminating the need to rewrite compiler-rt, this would benefit lots of users of Rust as they could then rely on llvm-as and clang being available.

@japaric
Copy link
Member

japaric commented Aug 4, 2016

If compiler-rt were rewritten in Rust, then would non-Rust code be able to use the Rust-coded version?

Presumably yes, the intrinsics in the Rust version will be exposed in the same way as the intrinsics in libcompiler-rt.a. (Side note: On failure, the compiler-rt intrinsics call an "abort" function defined in compiler-rt; I'm not sure if that should become panic on the Rust version or if we should port the abort function to Rust -- this has not been discussed yet but the former path would hinder this use case you mention)

For example, let's say I've linked a program using 4 LLVM-based programming languages together. Would I have multiple copies of the compiler-rt logic linked into my program?

No, the linker will use the version that "encounters first" and ignore the rest. Here first means: from all the objects passed to the linker as arguments, the symbol found in the first object (argument) that provides it. Note that this happens today for all std programs (at least on Linux): Rust programs are linked against gcc_s and against compiler-rt and both provide "compiler-rt" intrinsics like __floattidf (see nm -D libgcc_s.so | grep __ and nm libcompiler-rt.a | grep __)

alexcrichton pushed a commit to japaric/rust that referenced this issue Sep 13, 2016
libcompiler-rt.a is dead, long live libcompiler-builtins.rlib

This commit moves the logic that used to build libcompiler-rt.a into a
compiler-builtins crate on top of the core crate and below the std crate.
This new crate still compiles the compiler-rt instrinsics using gcc-rs
but produces an .rlib instead of a static library.

Also, with this commit rustc no longer passes -lcompiler-rt to the
linker. This effectively makes the "no-compiler-rt" field of target
specifications a no-op. Users of `no_std` will have to explicitly add
the compiler-builtins crate to their crate dependency graph *if* they
need the compiler-rt intrinsics. Users of the `std` have to do nothing
extra as the std crate depends on compiler-builtins.

Finally, this a step towards lazy compilation of std with Cargo as the
compiler-rt intrinsics can now be built by Cargo instead of having to
be supplied by the user by some other method.

closes rust-lang#34400
bors added a commit that referenced this issue Sep 13, 2016
crate-ify compiler-rt into compiler-builtins

libcompiler-rt.a is dead, long live libcompiler-builtins.rlib

This commit moves the logic that used to build libcompiler-rt.a into a
compiler-builtins crate on top of the core crate and below the std crate.
This new crate still compiles the compiler-rt instrinsics using gcc-rs
but produces an .rlib instead of a static library.

Also, with this commit rustc no longer passes -lcompiler-rt to the
linker. This effectively makes the "no-compiler-rt" field of target
specifications a no-op. Users of `no_std` will have to explicitly add
the compiler-builtins crate to their crate dependency graph *if* they
need the compiler-rt intrinsics - this is a [breaking-change]. Users
of the `std` have to do nothing extra as the std crate depends
on compiler-builtins.

Finally, this a step towards lazy compilation of std with Cargo as the
compiler-rt intrinsics can now be built by Cargo instead of having to
be supplied by the user by some other method.

closes #34400

---

r? @alexcrichton
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E-help-wanted Call for participation: Help is requested to fix this issue. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants