crosscompile: Add an example for cross compiling #402

Merged: 4 commits, Aug 25, 2023

DolceTriade (Contributor) commented

This adds an example of cross-compiling from an ARM (M1) Mac to x86_64 Linux.
The example has rough edges: for instance, it will *only* compile for Linux and
won't work for OSX. I think that problem should be discussed in the better API
issue.

Also, I realized I missed a bug in the previous cross-compilation PR: when cross-compiling,
we probably want to use the linux_wrapper for building; otherwise we end up trying to run
`install_name_tool` on the final Linux binaries, which doesn't work.

```
[nix-shell:~/code/rules_nixpkgs/examples/toolchains/cc_cross_osx_to_linux_amd64]$ bazel build --config=cross :hello
INFO: Invocation ID: f10b420f-2dca-42d3-ab4b-1179862791ff
INFO: Analyzed target //:hello (3 packages loaded, 19 targets configured).
INFO: Found 1 target...
Target //:hello up-to-date:
  bazel-bin/hello
INFO: Elapsed time: 8.212s, Critical Path: 5.59s
INFO: 4 processes: 2 internal, 2 processwrapper-sandbox.
INFO: Build completed successfully, 4 total actions

[nix-shell:~/code/rules_nixpkgs/examples/toolchains/cc_cross_osx_to_linux_amd64]$ file bazel-bin/hello
bazel-bin/hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/jvjrf7rqflfx4nf65yrkk697sxggffki-glibc-x86_64-linux-2.35-224/lib/ld-linux-x86-64.so.2, for GNU/Linux 3.10.0, not stripped
```
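
The `file` output above comes down to a few bytes at the start of the binary. As a minimal sketch (not part of this PR), the same check can be approximated in Python by decoding the ELF header directly. The names `describe_elf` and `describe_elf_file` are made up for illustration, and only the two machine codes relevant to this example are mapped.

```python
import struct

# e_machine values from the ELF specification (only the two relevant here).
EM_NAMES = {62: "x86-64", 183: "aarch64"}


def describe_elf(header: bytes) -> dict:
    """Decode class, byte order and machine from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[header[4]]      # EI_CLASS
    order = {1: "<", 2: ">"}[header[5]]   # EI_DATA: 1 = LSB, 2 = MSB
    # e_type and e_machine are two 16-bit fields at offsets 16 and 18.
    _e_type, e_machine = struct.unpack_from(order + "HH", header, 16)
    return {
        "bits": bits,
        "endian": "LSB" if order == "<" else "MSB",
        "machine": EM_NAMES.get(e_machine, f"unknown ({e_machine})"),
    }


def describe_elf_file(path: str) -> dict:
    """Hypothetical convenience wrapper: read the header of a file on disk."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))
```

Calling `describe_elf_file("bazel-bin/hello")` on a Mac would raise `ValueError` for a native Mach-O binary, which is a quick way to notice that the cross toolchain was not picked up.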

Note that when compiling for the first time, it'll build clang and gcc a bunch of times for cross compilation, so the initial build will take a few hours.

benradf (Member) left a comment

Thanks for opening this PR @DolceTriade, and sorry for the delay in reviewing on my part.

> note that when compiling for the first time, it'll build clang and gcc a bunch of times for cross compilations so the initial build will take a few hours...

We might need to make CI skip this example then, unless we can get a Nix cache with the appropriate cross-compilers prebuilt. It looks like the MacOS Examples job is currently failing for another reason though:

ld: library not found for -l:libboost_atomic.so.1.75.0

Any ideas? I'm trying this out myself on a MacOS machine now, but it'll probably take a while.


To run the example with Nix, issue the following command:
```
nix-shell --command 'bazel run --config=cross:hello'
```

Member

Suggested change:
- nix-shell --command 'bazel run --config=cross:hello'
+ nix-shell --command 'bazel run --config=cross :hello'
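
The missing space matters because the shell hands Bazel a single token. The sketch below uses Python's `shlex` to tokenize both command lines the way a POSIX shell would. The helper `split_bazel_args` and its "starts with a dash" rule are a simplification for illustration, not Bazel's actual argument parser.

```python
import shlex


def split_bazel_args(command: str):
    """Split a bazel command line into (options, targets) after shell tokenization."""
    tokens = shlex.split(command)[2:]  # drop "bazel" and the subcommand
    options = [t for t in tokens if t.startswith("-")]
    targets = [t for t in tokens if not t.startswith("-")]
    return options, targets


# Without the space there is one option token and no target at all.
print(split_bazel_args("bazel run --config=cross:hello"))
# With the space the config name and the target label are separate tokens.
print(split_bazel_args("bazel run --config=cross :hello"))
```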

benradf (Member) commented Aug 18, 2023

> I'm trying this out myself on a MacOS machine now, but it'll probably take a while.

It took about 50 mins, but I have a Linux ELF binary now 🙂

$ file bazel-bin/hello
bazel-bin/hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/jvjrf7rqflfx4nf65yrkk697sxggffki-glibc-x86_64-linux-2.35-224/lib/ld-linux-x86-64.so.2, for GNU/Linux 3.10.0, not stripped

I'm having trouble running it on the Linux system I copied it to though. Looks like I also need to copy a bunch of shared libraries it was linked against. Will take a more in-depth look at this next week.

I wonder if we can bundle the binary and its deps up into a docker image? Then the example would match a common use case.

DolceTriade (Contributor, Author) commented

Yeah, can definitely do that. What's the state of the art for generating Docker images with Nix? I see a PR in progress around this. Do we use recursive Nix and just call Bazel from a Nix flake or package?

benradf (Member) commented Aug 21, 2023

I was thinking you could use rules_oci for this. I had a quick go at doing this here and managed to get it to build an image. Unfortunately when I try to run the container, it fails with "file not found". It's probably missing some deps and / or the search path / interpreter are wrong. I need to look into this further.

If you want to take a look yourself, I've just been running the container with `docker run -it cross_example:latest` and then exporting the flattened filesystem with `docker export container_name | tar -x -C /tmp/container_name`. Then you can use `ldd` and other tools to examine the `hello` binary and see what shared libraries are present in the image.
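
When inspecting the binary this way, it can help to turn the `ldd` output into something scriptable. The following is a rough sketch that assumes the usual glibc `ldd` text format (`name => path (address)` or `name => not found`). The function name `parse_ldd` is invented here, and the sample paths in the usage are placeholders rather than output from this example.

```python
def parse_ldd(output: str):
    """Return (resolved, missing): a name -> path dict and a list of unresolved names."""
    resolved, missing = {}, []
    for line in output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # vdso or the interpreter line: no lookup by name happened
        name, _, target = line.partition("=>")
        name, target = name.strip(), target.strip()
        if target.startswith("not found"):
            missing.append(name)
        elif target.startswith("("):
            continue  # only a load address, no file on disk (older vdso format)
        else:
            # Drop the trailing load address, e.g. " (0x00007f...)".
            resolved[name] = target.split(" (")[0]
    return resolved, missing


# Placeholder input, written by hand to show the expected shape.
sample = (
    "\tlinux-vdso.so.1 (0x00007ffc)\n"
    "\tlibm.so.6 => /lib/libm.so.6 (0x00007f00)\n"
    "\tlibgcc_s.so.1 => not found\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f01)\n"
)
resolved, missing = parse_ldd(sample)
```

Anything in `missing` is a library that has to be added to the image or made reachable through the binary's search path.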

DolceTriade (Contributor, Author) commented

I can probably take a look. Thanks for the example.

benradf (Member) commented Aug 24, 2023

I made a little progress by disabling linking against boost and zlib to simplify things. I also figured out the correct ld-linux interpreter to package. Now the container no longer fails with `exec /hello: no such file or directory`. Instead it fails with the following error:

/hello: error while loading shared libraries: libgcc_s.so.1: cannot open shared object file: No such file or directory

This file is in the image, and apparently on the RPATH of the hello executable, so I'm not sure yet why it's not being found.
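
One way to double-check the "in the image and on the RPATH" part is to test each RPATH entry against the exported filesystem. The sketch below is only a simplified stand-in for the dynamic loader: it ignores `$ORIGIN`, `LD_LIBRARY_PATH`, `RUNPATH` semantics and the loader cache, and it does not diagnose the failure above. The function name and parameters are hypothetical.

```python
import os


def find_in_rpath(image_root: str, rpath: str, soname: str):
    """List the RPATH entries that contain `soname` under an extracted image root."""
    hits = []
    for entry in rpath.split(":"):
        if not entry or "$ORIGIN" in entry:
            continue  # $ORIGIN expansion is out of scope for this sketch
        candidate = os.path.join(image_root, entry.lstrip("/"), soname)
        # os.path.exists follows symlinks, so a dangling link counts as absent.
        if os.path.exists(candidate):
            hits.append(entry)
    return hits
```

One caveat: absolute symlinks inside the exported tree resolve against the host filesystem rather than the image root, so a hit or miss involving such a link should be verified by hand.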

DolceTriade (Contributor, Author) commented

❯ bazel run --config=cross :hello_image_tarball
INFO: Invocation ID: c6ef346e-c317-47a0-ae99-88d1c98acf22
INFO: Analyzed target //:hello_image_tarball (1 packages loaded, 7 targets configured).
INFO: Found 1 target...
Target //:hello_image_tarball up-to-date:
  bazel-bin/hello_image_tarball/tarball.tar
INFO: Elapsed time: 5.376s, Critical Path: 4.09s
INFO: 3 processes: 1 internal, 2 darwin-sandbox.
INFO: Build completed successfully, 3 total actions
INFO: Running command line: bazel-bin/hello_image_tarball.sh
376ccaf81658: Loading layer [==================================================>]  8.994MB/8.994MB
3383626f4a44: Loading layer [==================================================>]  56.24MB/56.24MB
Loaded image: cross_example:latest

❯ docker run --rm -it --platform linux/amd64 cross_example
Hello world!

Thanks for your examples! I was able to get it running. Honestly, I think the weakest aspect of rules_nixpkgs is that its deployment story is not super clear.

I think the sanest option for deploying Bazel binaries is just to use recursive Nix; otherwise you have to manage dependencies both in the BUILD files and in your Nix Docker base image, which is pretty ugly and confusing given Nix's more obscure compiler deps.

What we do internally is probably less kosher: basically, we use `ldd` and `nix copy` to find all the deps of the binaries (and their other rando deps), copy them into a closure using a bash script, and build an AppImage from that... but that was before I knew that recursive Nix was a thing.
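
The first step of that kind of script, mapping resolved library paths back to the Nix store paths that own them, might look roughly like this. It is a sketch under the assumption that store paths have the usual `/nix/store/<32-character hash>-<name>` shape; `store_roots` is a made-up name, and the internal script mentioned above is not shown in this thread.

```python
import re

# A store path looks like /nix/store/<hash>-<name>; the hash is 32 characters long.
STORE_ROOT = re.compile(r"^(/nix/store/[0-9a-z]{32}-[^/]+)")


def store_roots(paths):
    """Map file paths to the sorted, de-duplicated Nix store paths that contain them."""
    roots = set()
    for path in paths:
        match = STORE_ROOT.match(path)
        if match:
            roots.add(match.group(1))
    return sorted(roots)
```

The resulting roots could then be handed to `nix-store --query --requisites` or `nix copy` to pull in the full closure, which is the part a plain `ldd` walk tends to miss.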

benradf (Member) commented Aug 25, 2023

> Honestly, I think the weakest aspect of rules_nixpkgs is that its deployment story is not super clear.

Yeah, this is definitely something that should be improved. I'm pleased we have this rules_oci example working now, but perhaps using dockerTools and recursive Nix would be better as you say. We should explore that going forward.

benradf merged commit 6691c54 into tweag:master on Aug 25, 2023
9 checks passed
@@ -123,7 +123,7 @@ def _parse_cc_toolchain_info(content, filename):
 def _nixpkgs_cc_toolchain_config_impl(repository_ctx):
     host_cpu = get_cpu_value(repository_ctx)
     cross_cpu = repository_ctx.attr.cross_cpu or host_cpu
-    darwin = host_cpu == "darwin" or host_cpu == "darwin_arm64"
+    darwin = (host_cpu == "darwin" or host_cpu == "darwin_arm64") and cross_cpu == host_cpu

Before this change, when cross-compiling from Linux to MacOS, darwin would be True and the target_libc would be "macosx", but after this change darwin is False, so target_libc is "local". Is that correct/intentional? @benradf @DolceTriade

Member

Good catch!

Is this bit correct though:

> Before this change, when cross-compiling from Linux to MacOS, darwin would be True

I could well be mistaken, but in this case wouldn't host_cpu be k8, and so darwin would be False?

Either way, this needs fixing. I'll raise a pull request and we can discuss the details there.

Member

> I could well be mistaken, but in this case wouldn't host_cpu be k8, and so darwin would be False?

Yes, I think you are correct; I was reading that backwards, but it did look suspicious regardless! :)
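
To make the cases in this thread concrete, the two versions of the `darwin` expression from the hunk can be evaluated side by side. This is a plain-Python transcription for illustration only: the function names are invented, and the CPU strings used for `cross_cpu` are assumptions based on the values mentioned in the discussion.

```python
def darwin_before(host_cpu, cross_cpu_attr=""):
    # Old expression: depends on the host only.
    return host_cpu == "darwin" or host_cpu == "darwin_arm64"


def darwin_after(host_cpu, cross_cpu_attr=""):
    # Mirrors `repository_ctx.attr.cross_cpu or host_cpu` from the hunk.
    cross_cpu = cross_cpu_attr or host_cpu
    return (host_cpu == "darwin" or host_cpu == "darwin_arm64") and cross_cpu == host_cpu


cases = [
    ("darwin_arm64", ""),    # native macOS build
    ("darwin_arm64", "k8"),  # macOS host, Linux target (this PR's example)
    ("k8", "darwin_arm64"),  # Linux host, macOS target
    ("k8", ""),              # native Linux build
]
for host, cross in cases:
    print(host, cross or "(none)", darwin_before(host, cross), darwin_after(host, cross))
```

Only the macOS-to-Linux row changes, from `True` to `False`. With a `k8` host the flag is `False` under both versions, which matches the conclusion reached above.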
