machine files, replacing the cross file #3972

Open
Ericson2314 opened this issue Aug 3, 2018 · 27 comments

Comments

@Ericson2314
Member

Ericson2314 commented Aug 3, 2018

Currently, the cross file specifies:

  • binaries targeting the host machine (running on the build machine)
  • information about the host machine
  • information about the target machine

Missing would be

But instead of putting all this in one cross file, what if we separate the files per machine.

Each file would contain:

  • [binaries] binaries targeting that machine (running on the build machine)
  • [machine] information about that machine
  • [properties] same as today
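To make the proposed layout concrete, here is a minimal sketch of what one such machine file might look like and how it could be read. The `[machine]` section name comes from the proposal above; the keys inside it mirror the ones today's cross file uses for machine information, and the toolchain names and the helper `load_machine_file` are illustrative assumptions, not Meson code.

```python
import configparser

# Hypothetical machine file for an ARM Linux machine (contents are illustrative).
MACHINE_FILE = """
[binaries]
c = 'arm-linux-gnueabihf-gcc'
ar = 'arm-linux-gnueabihf-ar'

[machine]
system = 'linux'
cpu_family = 'arm'
cpu = 'armv7'
endian = 'little'

[properties]
sizeof_int = 4
"""


def load_machine_file(text):
    """Parse an INI-style machine file into {section: {key: raw_value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Values are kept as raw strings (quotes included); no evaluation is done.
    return {section: dict(parser[section]) for section in parser.sections()}


machine = load_machine_file(MACHINE_FILE)
print(sorted(machine))
```

The same file could then be passed in any of the three roles (build, host, or target), which is the point of describing one machine per file.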

CLI would be:

meson ... \
  [ [ --build-file m0.txt ] [ --host-file m1.txt ] [ --target-file m2.txt ] \
  | --cross-file cf.txt \
  ]

That is either today's cross file is passed, or any combination of the 3 machine files is passed.

The rules for missing machine files are straightforward:

  • If a machine file isn't passed, and there exists a previous machine, its definition is used.
  • If the build machine file isn't passed, the build machine's definition is inferred from environment tests

That means target defaults to host, host defaults to build, and build is inferred if it's not specified. This is consistent with Autoconf's reformed behavior: https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Specifying-Target-Triplets.html. (See the link at the bottom of that page for some very odd history.)
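The defaulting chain can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not Meson's implementation; `resolve_machines` and the `"<inferred>"` placeholder are hypothetical names.

```python
from typing import Dict, Optional


def resolve_machines(build: Optional[str] = None,
                     host: Optional[str] = None,
                     target: Optional[str] = None) -> Dict[str, str]:
    """Apply the chain: target -> host -> build -> inferred from the environment."""
    # "<inferred>" stands in for whatever the environment tests would detect.
    resolved_build = build if build is not None else "<inferred>"
    resolved_host = host if host is not None else resolved_build
    resolved_target = target if target is not None else resolved_host
    return {"build": resolved_build, "host": resolved_host, "target": resolved_target}


# Only a host file is passed: build is inferred, target follows host.
print(resolve_machines(host="m1.txt"))
```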

The host machine C compiler [binaries] entry would cover most uses of CC today (autoconf CC), and the build machine C compiler [binaries] entry would cover the rest (autoconf CC_FOR_BUILD).


The first benefit of this is that machine configurations, which are independent, now are also specified independently. For NixOS/Nixpkgs, we already specify machines independently and with the same schema, so this would greatly simplify things, allowing us to have one Nix machine -> meson machine file implementation and use it always. Non-distro users would also benefit if they mix and match machine definitions in that they'll never need to copy the part of the cross file that doesn't change (maybe they were doing different hosts with the same target, or different targets with the same host).

The second benefit is cross compilation becomes implicit. Only if the host and build are different is cross compilation used. For example, one could pass --host-file m.txt on many different types of machines, and most of them would cross compile, but one of them would native compile. This allows the user to think in higher level terms of "where I want to run what I built" rather than "how I want to build this". It also supports @nirbheek's case of choosing between multiple build machines (stemming from different ABIs / processes modes supported by the same device doing the building).

The third benefit is that it most directly lends itself to the deduplication and simplification I propose in #3689 (comment).


This would subsume #3282, replace #3689, and complement #3969 (as that last one is strictly about environment variables, and this is strictly about files).

@Ericson2314
Member Author

To elaborate on @nirbheek's case:

  1. meson ... --build-file x32.txt --host-file x32.txt will do a native compilation on x32 for x32.
  2. meson ... --build-file x32.txt will also do a native compilation on x32 for x32, since host defaults to build.
  3. meson ... --build-file x86_64.txt --host-file x32.txt will do a cross compilation for x32. Any code gen utility would be built with the regular x86_64-linux-gnu toolchain.
  4. meson ... --host-file x32.txt might do a cross or native build for x32. If both toolchains are installed, build machine inference is somewhat hard to predict. Based on CC being set, which toolchain is unprefixed if neither is set, or other factors, the build machine may be inferred as the regular ABI or x32.
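The four cases above can be checked mechanically. The sketch below assumes machines are compared by file name (a simplification) and uses a hypothetical `classify` helper; the `inferred_build` parameter stands in for whatever build machine inference would pick.

```python
def classify(build_file=None, host_file=None, inferred_build="x86_64.txt"):
    """Return (build, host, kind) after applying the defaulting rules."""
    # Assumption: two machines are "the same" when their file names match.
    build = build_file if build_file is not None else inferred_build
    host = host_file if host_file is not None else build
    kind = "native" if build == host else "cross"
    return build, host, kind


cases = [
    classify(build_file="x32.txt", host_file="x32.txt"),      # case 1
    classify(build_file="x32.txt"),                           # case 2
    classify(build_file="x86_64.txt", host_file="x32.txt"),   # case 3
    classify(host_file="x32.txt"),                            # case 4, build inferred as x86_64
    classify(host_file="x32.txt", inferred_build="x32.txt"),  # case 4, build inferred as x32
]

for build, host, kind in cases:
    print(f"build={build} host={host} -> {kind}")
```

Case 4 appears twice to show that the outcome depends entirely on what inference picks for the build machine.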

@nirbheek
Member

nirbheek commented Aug 7, 2018

@jpakkane and I discussed at GUADEC that we should probably have --native-file as a mirror of --cross-file. build-file is a confusing option since the term "build file" refers to meson.build.

@Ericson2314
Member Author

Ericson2314 commented Aug 7, 2018

It is a good point that "build file" is already taken. I'd say just do clarity over brevity (people have shell history after all) and do:

  • --build-machine-file
  • --host-machine-file
  • --target-machine-file

I don't like --native-file because it implies the others are non-native, which isn't necessarily true. I also don't like --cross-file because it unnecessarily combines "host" and "target".

[More broadly, see what I wrote in #3969 (comment) about the 3 machines always being defined. The idea is that native compilation is just the case where build = host = target; nothing more special than that. The alternative way of thinking, where there's "native" and "cross", treats native and cross compilation as fundamentally different, leading to tons of extra is_cross checks and packages less likely to work under cross compilation. It may sound weird to say two ways of thinking about the same thing lead to vastly different code, but I've seen it before (https://github.com/NixOS/nixpkgs/pulls?q=is%3Apr+author%3AEricson2314+is%3Aclosed+label%3A%226.topic%3A+cross-compilation%22 ).]

@Ericson2314
Member Author

Let me just add that https://github.com/mesonbuild/meson/pull/3921/files is a great example of, on one hand, the is_cross proliferation and, on the other, how easy it is to forget an is_cross check.

@Ericson2314
Member Author

Bump. What do you all think of this design? I'd like to finalize something and implement it.

@jpakkane
Member

The nat

I'm not really a fan of splitting the cross file in two. For one the target file would not really have any executables because they are all native. There is also the case that the cross compilation setup is defined by the combination of host + target. Having them in one file is nice because then every piece of information about the cross setup is in one file.

@Ericson2314
Member Author

For one the target file would not really have any executables because they are all native.

What do you mean by that?

@Ericson2314
Member Author

Ericson2314 commented Aug 22, 2018

There is also the case that the cross compilation setup is defined by the combination of host + target.

Unless one is building a compiler, there really is no target platform. Internally we have it for simplicity, and it defaults to be the host platform, but really nothing should inspect it at all; it should be dead code.

In the cases where they are both present, they vary independently. And in fact, you probably want multiple target platforms. GCC's multilib effectively is that, but not expressed as such due to Autotools.

In short, the target platform isn't really a nice concept: you either want 0 or many, and I would rather cordon it off.

@Ericson2314
Member Author

In fact, it's possible to have n host platforms and m target platforms. (This would cross-compile n cross-compilers, each themselves targeting m platforms a la multilib.) The tools needed to build all this are { build -> h | h in hosts } U { build -> t | t in targets }, where foo -> bar is a compiler that runs on foo and outputs for bar.
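The set expression above translates directly into Python set comprehensions. In this sketch a compiler "foo -> bar" is modeled as the tuple `(foo, bar)`; the function name and the platform names are made up for illustration.

```python
def required_compilers(build, hosts, targets):
    """{ build -> h | h in hosts } U { build -> t | t in targets }."""
    # Each tuple (runs_on, outputs_for) models one compiler.
    return {(build, h) for h in hosts} | {(build, t) for t in targets}


tools = required_compilers(
    "x86_64-linux",
    hosts=["aarch64-linux", "riscv64-linux"],
    targets=["armv7-linux", "riscv64-linux"],
)

for runs_on, outputs_for in sorted(tools):
    print(f"{runs_on} -> {outputs_for}")
```

Because the result is a set union, a platform that is both a host and a target (here `riscv64-linux`) needs only one compiler.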

@jpakkane
Member

What do you mean by that?

Brain fart, sorry.

Unless one is building a compiler.

Or Binutils. But yes, it is fairly rare.

This would cross-compile n cross-compilers each themselves targeting m platforms a la multilib

Yes, but you don't do that in a single build dir but instead have one per "host machine". If the outputs are capable of producing files that target multiple platforms, then that's totally fine. Maybe just have the system or one of the other entries be "multi" or something. Or ignore target completely for projects that know that their output is multiplatform.

@Ericson2314
Member Author

Ericson2314 commented Aug 22, 2018

If the outputs are capable of producing files that target multiple platforms, then that's totally fine.

But they aren't. Or rather shouldn't be. The trick to do multilib with minimal burden on the user is that each compiler-specific library should be written just like a normal library. Then when one goes to import the subproject they should be able to do something like:

subproject('libgcc', build_for: 'target')
subproject('libatomic', build_for: 'target')

Then Meson duplicates the project, substituting the library's host with every target we have.

Yes, multi-target, one host, is more useful than multi-host since there's more to reuse (the compiler itself vs maybe a few odds-and-ends config files), but fundamentally it's the same duplicate-and-substitute approach for each, so multi-host can be done with virtually no extra implementation cost.
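A rough sketch of the duplicate-and-substitute idea, assuming the build system keeps a list of machines per role. Neither `build_for` nor `expand_subproject` exists in Meson; both are hypothetical and only model the proposal.

```python
def expand_subproject(name, build_for, machines):
    """Return one (subproject, host_machine) instance per machine in the given role."""
    # The subproject itself is written like a normal library; only its
    # "host" machine is substituted for each machine in the chosen role.
    return [(name, machine) for machine in machines[build_for]]


machines = {
    "host": ["x86_64-linux"],
    "target": ["armv7-linux", "riscv64-linux"],
}

instances = (
    expand_subproject("libgcc", "target", machines)
    + expand_subproject("libatomic", "target", machines)
)

for name, host_machine in instances:
    print(f"{name} built with host = {host_machine}")
```

The same function handles multi-host by adding entries to `machines["host"]`, which is why the two cases cost about the same to implement.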

@jpakkane
Member

When you say multilib do you mean:

  1. A single compiler that can output files to any other platform just by adding a compiler argument (IIRC this is how Clang works)
  2. A single compiler that can output files to one specific other platform and to support many platforms you need to compile the compiler with specific options (IIRC this is how gcc & binutils work)

@Ericson2314
Member Author

Ericson2314 commented Aug 22, 2018

"multilib" is GCC jargon. See https://gcc.gnu.org/install/configure.html's description of --with-multilib-list=list. Basically it's GCC trying to be Clang. Strictly speaking, GCC doesn't support a single exact platform per build-time configuration, but a family of platforms. For example, the x86 and x86_64 back-ends are built simultaneously, as are the ones for all the ARM ISAs (Thumb etc.) and the 3 RISC-V ISAs.

More generally (to me) it's an admission that building a compiler with multiple backends is only part of the battle. There are still the runtime libraries needed to build programs relying on the full breadth of the C/C++ standards. Even in LLVM land, they abuse CMake in all sorts of ways so compiler-rt can be built for each of the enabled LLVM backends, rather than properly putting that logic in the CMake equivalent of subproject(..) where it belongs.

@jpakkane
Member

Does that mean that if you need to support multiple targets then you'd need to compile, say, libatomic multiple times to get the final result?

@Ericson2314
Member Author

Yes. And even for regular projects, something like simultaneous --enable-shared + --enable-static with libtool is conceptually the same thing, but implemented in an ad-hoc way.

@nirbheek
Member

Does that mean that if you need to support multiple targets then you'd need to compile, say, libatomic multiple times to get the final result?

Note that this is required for implementing the old feature request we had for being able to build the same executable twice: once for native (for running now) and once cross (for installing). Currently there is no way to do this cleanly.

@jpakkane
Member

How does the final result get collated? Are there libfoobar_x86.so, libfoobar_riscv.so etc or do they all get merged into a single libfoobar_everything.so?

@Ericson2314
Member Author

Ericson2314 commented Aug 22, 2018

A final conceptual argument: not only is the width limit of a single host/target platform arbitrary, the depth limit of just build, host, and target is also arbitrary.

Imagine we used https://github.com/mozilla/sccache or similar to distribute a single Ninja build over a heterogeneous build farm. Imagine also we added a build_on argument so one can do

subproject('foo', build_on: 'host', build_for: 'target')

which would build foo for target on machines running 'host' (as scheduled by sccache). Now we can do arbitrary bootstrapping of new platforms with:

subproject('clang')
subproject('compiler_rt', build_on: 'host', build_for: 'target')
subproject('clang', build_on: 'host', build_for: 'target')
subproject('compiler_rt', build_on: 'target', build_for: 'super-target')
subproject('clang', build_on: 'target', build_for: 'super-target')
subproject('compiler_rt', build_on: 'super-target', build_for: 'super-super-target')
subproject('clang', build_on: 'super-target', build_for: 'super-super-target')
...

This may sound far-fetched, but this sort of thing is exactly what we do on the project-level rather than ninja-rule level with my distro's package set https://github.com/nixos/nixpkgs/.

If you take every machine as a node, and every build_on->build_for pair as an edge, you get a coarse-grained bootstrapping dependency graph of sorts. A machine file per node works just fine, and so we can use the same machine file throughout all these generalizations, vs having to constantly re-engineer the cross file ([super_super_super...target_binaries]? eww). This leads me to side with machine files as the right abstraction that will stand the test of time.
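The node-and-edge view can be sketched for the bootstrapping chain above. The machine names follow the hypothetical subproject calls, and `bootstrap_edges` is an illustrative helper, not part of any tool.

```python
def bootstrap_edges(chain):
    """Pair each machine with its successor as a (build_on, build_for) edge."""
    return list(zip(chain, chain[1:]))


chain = ["build", "host", "target", "super-target", "super-super-target"]
edges = bootstrap_edges(chain)
nodes = sorted({machine for edge in edges for machine in edge})

for build_on, build_for in edges:
    print(f"{build_on} -> {build_for}")
print(f"{len(nodes)} machine files cover {len(edges)} bootstrapping steps")
```

Extending the chain by one step adds one node and one edge, so it needs one more machine file and no change to the file format.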

@Ericson2314
Member Author

@jpakkane they are not combined into "fat libs" but go in different subdirectories of lib. This is true for GCC's libs and compiler-rt.

@jpakkane
Member

So is the total flow like this:

  1. Build a multiplatform-aware GCC
  2. Build internal libs once for each platform
  3. Install them all in the same location

And if yes, does step 2 use the compiler built in step 1 or the compiler that was used to build it (the "system" compiler)?

@Ericson2314
Member Author

Ericson2314 commented Aug 22, 2018

The newly-built GCC is used to build the runtime libs where possible, in case they depend on new compiler features. But when the newly built GCC has host != build, it cannot be run, so CC_FOR_TARGET is used instead. For Meson, that would be the C compiler specified in the target machine file.
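As a small sketch of that selection rule (the function and the compiler names are hypothetical, and platforms are compared as plain strings for simplicity):

```python
def runtime_lib_compiler(build, host, new_gcc, cc_for_target):
    """Pick the compiler used to build the runtime libs for the target."""
    # The freshly built GCC runs on `host`; it is only usable during the
    # build when that is also the machine doing the building.
    if host == build:
        return new_gcc
    return cc_for_target


# Native or plain cross-compiler build: the new GCC can run on the build machine.
print(runtime_lib_compiler("x86_64-linux", "x86_64-linux",
                           new_gcc="./xgcc", cc_for_target="arm-linux-gcc"))

# host != build: the new GCC cannot run here, so fall back to CC_FOR_TARGET.
print(runtime_lib_compiler("x86_64-linux", "aarch64-linux",
                           new_gcc="./xgcc", cc_for_target="arm-linux-gcc"))
```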

@Ericson2314
Member Author

@dcbaker I don't know if you ever saw this. Between our PRs this is all but implemented. The "build file" is exactly your "native file" (due to the way defaulting works), but I still like splitting the "cross file" into host and target files.

@marc-h38
Contributor

marc-h38 commented Dec 7, 2019

Now that:

Since meson 0.52.0 it is possible to layer cross files together. This works like native file layering: the purpose is to compose cross files together, and values from the second cross file will replace those from the first.

... would the following make sense?

[build.properties]

[host.properties]

etc.

These sections could be in the same --cross-file or not. Spoilt with choice.
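The layering behavior quoted above ("values from the second cross file will replace those from the first") can be illustrated with Python's configparser. This is only a sketch of the override semantics, not Meson's own merging code; `layer` and the file contents are made up.

```python
import configparser

BASE = """
[binaries]
c = 'gcc'
strip = 'strip'
"""

OVERRIDE = """
[binaries]
c = 'clang'
"""


def layer(*texts):
    """Read files in order; later values replace earlier ones key by key."""
    parser = configparser.ConfigParser()
    for text in texts:
        parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


merged = layer(BASE, OVERRIDE)
print(merged["binaries"])
```

Keys that the second file does not mention (here `strip`) survive from the first, which is what makes composing files practical.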

@jpakkane
Member

jpakkane commented Dec 7, 2019

The cross file must not define anything about the build machine!

@Ericson2314
Member Author

@marc-h38 we already have the native file. Now it just remains to make them take all the same sections, by adding some to the native file, and probably deprecating [properties] as it's rather free-form.

@marc-h38
Contributor

marc-h38 commented Dec 9, 2019

Thanks @Ericson2314

we already have the native file. Now it just remains to make them take all the same sections, by adding some to the native file,...

As of its last edit, in August 2018, the description suggests three new --xxxx-file options. Quoting it:

 [--build-file m0.txt ] [ --host-file m1.txt ] [ --target-file m2.txt ]   | --cross-file cf.txt

Are you still recommending some new options today? If yes, which ones? Either way, would you mind updating the description and/or closing this issue with a summary of why? I'm lost, sorry. AFAIK we have only two now: --native-file and --cross-file. The only thing I'm aware has changed recently is the ability to use them multiple times.

@Ericson2314
Member Author

I just opened #6322. If we can do that, I'll just close this issue.

4 participants