Leverage LLVM bazel support #130
Hi! I'm aware of the Bazel overlay that's been merged into LLVM proper; unfortunately I don't think it's straightforward to make use of it in this toolchain (though having this repository offer an option to build clang and friends from source, instead of using prebuilt toolchains, is definitely desirable for host-platform support reasons).

When trying to have these kinds of "bootstrapped" toolchain setups (i.e. a […]).

Until we have an official starlark-only […].

Creating a minimal starlark […].

Alternatively: is there a specific reason you want this kind of functionality? As mentioned, I don't think it's straightforward to modify this project to build the toolchain it registers from source, but if you want that sort of thing for the purpose of supporting a particular platform, a particular LLVM version, or some other similar use case, we could definitely look at just supporting that use case.
@rrbutani you raised a really good point 🤔

I was recently trying to get around the need for us to patch and re-package LLVM internally. The repackaging of LLVM into a […].

Creating a […].

This would be beneficial considering that (1) we are currently monkey-patching LLVM and (2) the LLVM project does not provide a distribution for Darwin arm64 (aarch64), a future requirement of ours.
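(For readers wondering what fetch-time patching of LLVM can look like in Bazel, here is a minimal sketch; all names, versions, checksums, and the patch path are placeholders, not the setup described above:)

```starlark
# WORKSPACE sketch: applying internal patches to LLVM at fetch time rather
# than re-packaging it. All specifics below are placeholders.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "llvm-raw",
    urls = ["https://github.com/llvm/llvm-project/releases/download/llvmorg-14.0.0/llvm-project-14.0.0.src.tar.xz"],
    sha256 = "<source tarball sha256>",  # placeholder
    strip_prefix = "llvm-project-14.0.0.src",
    build_file_content = "# empty",
    patches = ["//third_party/llvm:internal_fixes.patch"],  # hypothetical patch file
    patch_args = ["-p1"],
)
```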
As mentioned, this is something I'm interested in trying to do (a minimal starlark-only […]).

This sounds great; I'm interested in having something like this for similar reasons (we're also interested in arm64 Darwin support (#88), as well as supporting other host platforms and configurations for which there aren't really suitable official LLVM distributions). Having nice Bazel targets that produce the tarballs isn't a necessity for us (for this project we'd just be looking to build upstream releases, which arrive every 6 months excluding patch releases; definitely in the realm of what can be done manually without artifact caching), but it'd definitely make things much nicer.

I'm happy to help with this in any way that'd be useful.
@sluongng I took another look at this today; I'm fairly confident that this change (which made it into Bazel 5.0) makes it so that we can toolchain-transition across […].

I'm going to try to put together a small example to confirm.
Okay yeah, seems to work!
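(A minimal sketch of the shape such a setup can take; this is not the actual example from the thread, and the `@llvm-project` labels and the `bootstrap_cc_toolchain_config` rule below are assumptions:)

```starlark
# BUILD sketch: a cc_toolchain whose tools are built from the LLVM overlay
# sources. Bazel 5's toolchain transition builds toolchain dependencies in
# the exec configuration, using whatever bootstrap C++ toolchain is
# registered there; that is what breaks the would-be dependency cycle.
filegroup(
    name = "compiler_files",
    srcs = ["@llvm-project//clang"],  # clang built from source (assumed label)
)

cc_toolchain(
    name = "from_source_cc_toolchain",
    all_files = ":compiler_files",
    ar_files = ":compiler_files",
    compiler_files = ":compiler_files",
    dwp_files = ":compiler_files",
    linker_files = ":compiler_files",
    objcopy_files = ":compiler_files",
    strip_files = ":compiler_files",
    toolchain_config = ":bootstrap_cc_toolchain_config",  # hypothetical CcToolchainConfigInfo rule
)

toolchain(
    name = "cc_toolchain",
    toolchain = ":from_source_cc_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)
```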
We'd just need to offer a way for the repo rules in this package to take labels to binaries/filegroups instead of just repositories and archive URLs. From there we can add the necessary plumbing to turn this into a top-level option if we want (handling the bootstrapping, etc.; maybe we can grow a […]).

I'm probably not going to get to this until at least the end of the month or so; I want to resolve some of the other issues first. But definitely feel free to get started on this if you want.
@neqochan Thanks! I think we'd probably want to take a slightly different approach for […].

I have not been actively working on this yet, but ^ raises some very good questions. In particular, I hadn't considered what we'd do for […].

It's out of scope for a first pass at this, but eventually I'd like to offer some way to build actually hermetic toolchains this way (i.e. statically linked against […]).

Regardless, I think a good first step is making the changes to […].
From an end-use perspective it may actually be more desirable to have something like a "simple" […].

In […]
I think building the compiler or any of the runtime components will be kind of a last resort for users; only for people on platforms that we don't support and don't have binary distributions for. I think it's still worth having a flow that builds from source to support use cases like the one described by the OP (patches to LLVM, active compiler development, etc.).
This is definitely a good suggestion and something we should look into if building […].
@sluongng See https://github.com/dzbarsky/static-clang. Unfortunately I think the upstream build rules are not fully cacheable; I keep getting misses on my GHA build runners that are using BuildBuddy RBE. So I think building from source and relying on caching would be painful; better to use a prebuilt tarball like the ones generated by that repo.
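(A sketch of consuming such a prebuilt tarball as an ordinary Bazel repository; the URL, checksum, and names are placeholders. The resulting package can then be wired in as a toolchain root, as discussed later in the thread:)

```starlark
# WORKSPACE sketch: fetch a prebuilt clang tarball once, instead of building
# LLVM from source on every cold cache. All specifics are placeholders.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "prebuilt_clang",
    build_file_content = """
filegroup(
    name = "all_files",
    srcs = glob(["**"]),
    visibility = ["//visibility:public"],
)
""",
    sha256 = "<tarball sha256>",  # placeholder
    urls = ["https://example.com/clang-x86_64-linux.tar.xz"],  # placeholder URL
)
```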
@dzbarsky I ran into similar issues to the ones you describe. For some reason the GHA env seems to not work. Something I haven't tried for GHA, but that I'm using for other setups, is using the stdenv from Nix. Using that as the env for a GHA run might help (but probably requires a few manual tweaks). Maybe some […]
@aaronmondal Do you have an example of the Nix setup? I was thinking I could also run the build from inside a Docker container on GHA; that might isolate things enough to get it reproducible. I thought I had the bazelrc flags that are important for hermeticity, but it looks like at least […]
@dzbarsky The setup I use is roughly this: […]
All of this was tested on several machines, manually configured against BuildBuddy, and we had (except for libcxx modules, which are not part of the standard LLVM Bazel overlay) a perfect cache hit rate across all x86_64-linux machines.
Oh, one thing to note though: the Nix environments are nicely reproducible, but that also means that they contain everything you need during the build. I believe the original image I had for those tests was ~20GB, and apparently the images in the current setup are closer to ~80GB, as they now include CUDA and several different compiler setups. This means that this approach isn't really feasible for hosted runners. I'd say this mostly just makes sense when you have a self-hosted setup where the image build pipelines are part of your internal CI.
@sluongng Yeah, I just haven't had time to look into it yet. But I kicked off a few builds to collect the data we need, and am going to poke around now. If you do have some cycles and want to look, I wouldn't say no :)

I added a dummy --action_env var to bust the cache, then kicked off 3 builds, each waiting for the previous one to complete.

Runner build 1 (empty cache): https://app.buildbuddy.io/invocation/a5158393-8883-43de-9f35-069fe525e032
[…]
Ah, looks like an issue with the Zig toolchain: uber/hermetic_cc_toolchain#89 (comment)
I am glad that you were able to put our UI to use 💪 And feel free to send any feedback my way 🤗
@sluongng Sure. I knew that zlib was one of the first actions topologically, so I took a look at a compilation across the builds. Here's what I saw: […]

I just traced down the input tree, following the mismatched digests, until I found the […].

Now here's the fun part: this is created by a repository rule, so it's not running in RBE, if I understand correctly.
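(Why that matters, as an illustrative sketch rather than the Zig toolchain's actual rule: repository rules run on the local machine at fetch time, so whatever they generate becomes an *input* to remote actions rather than an output of them. If the generated bytes differ from machine to machine, every downstream action key differs too:)

```starlark
# Illustrative repository rule that bakes host-dependent bytes into the
# build graph: the generated file is an input to later actions, so any
# per-machine difference here busts the remote cache for everything below.
def _host_stamped_repo_impl(rctx):
    result = rctx.execute(["uname", "-sr"])  # output varies by host
    rctx.file("host_stamp.txt", result.stdout)
    rctx.file("BUILD.bazel", 'exports_files(["host_stamp.txt"])')

host_stamped_repo = repository_rule(implementation = _host_stamped_repo_impl)
```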
The compilation happens here: […]

Any ideas where to go from here?
I would download the 2 binaries and try to diff them (or hex-dump and diff). You should be able to tell what's inconsistent between the 2.

Funnily enough, I am also looking at that very code path today. I wonder if you are retaining […]
Got to the bottom of it - uber/hermetic_cc_toolchain#124 |
There is not much for us to do here. People can already use Bazel packages as toolchain roots, as long as some convention around target names is followed. |
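(Concretely, that convention looks roughly like the following; the load path reflects this project's WORKSPACE-era API and may have changed since, and `@my_llvm_dist` stands in for any package laid out like an LLVM distribution:)

```starlark
# WORKSPACE sketch: pointing the toolchain at a Bazel package instead of a
# downloaded release archive.
load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "llvm_toolchain")

llvm_toolchain(
    name = "llvm_toolchain",
    # "" is the default key; the value is a package exposing targets that
    # follow the expected bin/, lib/, include/ naming convention.
    toolchain_roots = {"": "@my_llvm_dist//"},
)

load("@llvm_toolchain//:toolchains.bzl", "llvm_register_toolchains")

llvm_register_toolchains()
```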
Google folks have contributed Bazel support for LLVM in the llvm-project tree.
The full configuration can be found at https://github.com/llvm/llvm-project/tree/main/utils/bazel
and a sample usage can be found at https://github.com/llvm/llvm-project/blob/main/utils/bazel/examples/http_archive/WORKSPACE
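(The linked example boils down to roughly this; an abridged sketch, with the pinned commit and checksum left as placeholders:)

```starlark
# WORKSPACE sketch: fetch raw llvm-project sources, then overlay the
# upstream-maintained BUILD files onto them.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

LLVM_COMMIT = "<llvm-project commit>"  # placeholder

http_archive(
    name = "llvm-raw",
    build_file_content = "# empty",
    strip_prefix = "llvm-project-" + LLVM_COMMIT,
    urls = ["https://github.com/llvm/llvm-project/archive/{}.tar.gz".format(LLVM_COMMIT)],
)

load("@llvm-raw//utils/bazel:configure.bzl", "llvm_configure")

# Produces @llvm-project: the raw sources with Bazel BUILD files overlaid.
llvm_configure(name = "llvm-project")
```

After this, targets such as `@llvm-project//clang` can be built directly (exact target names vary by LLVM revision).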
It seems like most of what we need to configure an LLVM toolchain is available as Bazel targets.
Here is the full list of binaries exposed: […]