bootstrap linux mixes crt1 from bootstrap libc and libc.so.6 from new libc, sometimes fails linking #158042

Closed
trofi opened this issue Feb 3, 2022 · 5 comments
Labels
0.kind: bug Something is broken 6.topic: stdenv Standard environment

Comments

@trofi
Contributor

trofi commented Feb 3, 2022

The best way to show the problem is to try to build current nixpkgs against glibc-2.35, patch:

It fails to build even the simplest program:

$ nix build -f. lv -L --max-jobs 1
expand-response-params> unpacking sources
expand-response-params> patching sources
expand-response-params> configuring
expand-response-params> no configure script, doing nothing
expand-response-params> building
expand-response-params> /nix/store/w2x0imn84ckw6y83p72rigj3h5ymmwll-binutils-2.35.2/bin/ld: /nix/store/p4s4jf7aq6v6z9iazll1aiqwb34aqxq9-bootstrap-tools/lib/crt1.o: in function `_start':
expand-response-params> /build/glibc-2.27/csu/../sysdeps/x86_64/start.S:101: undefined reference to `__libc_csu_fini'
expand-response-params> /nix/store/w2x0imn84ckw6y83p72rigj3h5ymmwll-binutils-2.35.2/bin/ld: /build/glibc-2.27/csu/../sysdeps/x86_64/start.S:102: undefined reference to `__libc_csu_init'
expand-response-params> collect2: error: ld returned 1 exit status

With a bit of debugging added, it becomes clearer where the mismatch comes from:

--- a/pkgs/build-support/expand-response-params/default.nix
+++ b/pkgs/build-support/expand-response-params/default.nix
@@ -10,7 +10,8 @@ stdenv.mkDerivation {
     src=$PWD
   '';
   buildPhase = ''
-    NIX_CC_USE_RESPONSE_FILE=0 "$CC" -std=c99 -O3 -o "expand-response-params" expand-response-params.c
+    export NIX_DEBUG=1
+    NIX_CC_USE_RESPONSE_FILE=0 "$CC" -std=c99 -O3 -o "expand-response-params" expand-response-params.c -Wl,--verbose
   '';
   installPhase = ''
     mkdir -p $prefix/bin

$ nb --max-jobs 1 -L lv |& grep -P '(crt1|libc[.]so).*succeed'
expand-response-params> attempt to open /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib/crt1.o succeeded
expand-response-params> attempt to open /nix/store/bf7l4af8rrfy1s17af4j2az0khlp5rw0-glibc-2.35/lib/libc.so succeeded
expand-response-params> attempt to open /nix/store/bf7l4af8rrfy1s17af4j2az0khlp5rw0-glibc-2.35/lib/libc.so.6 succeeded

My theory is that the mismatch was not visible before because crt1.o did not change much until glibc-2.35. The upstream change https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=035c012e32c11e84d64905efaf55e74f704d3668 made it visible.
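The failure mode can be sketched as a toy model of link-time symbol resolution. This is only an illustration: the symbol sets below are assumptions inferred from the linker errors above (plus the usual `main` and `__libc_start_main` entry points), not something parsed from the real `crt1.o` or `libc.so`, and the names `OLD_CRT1_NEEDS`, `OLD_LIBC_DEFINES`, `NEW_LIBC_DEFINES` and `unresolved` are hypothetical.

```python
# Toy model of static symbol resolution at link time.
# The symbol sets are illustrative assumptions, not read from real objects.

# Undefined symbols referenced by the old (bootstrap, glibc-2.27) crt1.o.
# The two __libc_csu_* names come from the linker errors in this issue.
OLD_CRT1_NEEDS = {"main", "__libc_start_main", "__libc_csu_init", "__libc_csu_fini"}

# Symbols defined by the program being linked.
PROGRAM_DEFINES = {"main"}

# Assumed symbols each libc makes available to the linker.
OLD_LIBC_DEFINES = {"__libc_start_main", "__libc_csu_init", "__libc_csu_fini"}
NEW_LIBC_DEFINES = {"__libc_start_main"}


def unresolved(needed, *providers):
    """Return the sorted list of needed symbols that no provider defines."""
    provided = set().union(*providers)
    return sorted(needed - provided)


if __name__ == "__main__":
    # Old crt1.o with old libc: everything resolves.
    print(unresolved(OLD_CRT1_NEEDS, PROGRAM_DEFINES, OLD_LIBC_DEFINES))
    # Old crt1.o with new libc: the __libc_csu_* references stay undefined.
    print(unresolved(OLD_CRT1_NEEDS, PROGRAM_DEFINES, NEW_LIBC_DEFINES))
```

Under these assumptions the mixed case leaves exactly the two symbols from the error message unresolved, while a matched pair links cleanly.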

@trofi trofi added the 0.kind: bug Something is broken label Feb 3, 2022
@trofi
Contributor Author

trofi commented Feb 3, 2022

I think it happens mechanically because the -B and -L options are passed in mismatched order:

  • -B goes "old libc", then "new libc" (affects crt1.o lookup)
  • -L goes "new libc", then "old libc" (affects libc.so lookup)

expand-response-params> extra flags before to /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/bin/gcc:
expand-response-params>   -O2
expand-response-params>   -D_FORTIFY_SOURCE=2
expand-response-params>   -fstack-protector-strong
expand-response-params>   --param
expand-response-params>   ssp-buffer-size=4
expand-response-params>   -fno-strict-overflow
expand-response-params>   -Wformat
expand-response-params>   -Wformat-security
expand-response-params>   -Werror=format-security
expand-response-params>   -fPIC
expand-response-params>   -Wl\,-dynamic-linker=/nix/store/bf7l4af8rrfy1s17af4j2az0khlp5rw0-glibc-2.35/lib/ld-linux-x86-64.so.2
expand-response-params> original flags to /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/bin/gcc:
expand-response-params>   -std=c99
expand-response-params>   -O3
expand-response-params>   -o
expand-response-params>   expand-response-params
expand-response-params>   expand-response-params.c
expand-response-params>   -Wl\,--verbose
expand-response-params> extra flags after to /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/bin/gcc:
expand-response-params>   -B/nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib
expand-response-params>   -B/nix/store/bf7l4af8rrfy1s17af4j2az0khlp5rw0-glibc-2.35/lib/
expand-response-params>   -idirafter
expand-response-params>   /nix/store/3cmdhd36l2cqm484l381hz8z0nmq35fy-glibc-2.35-dev/include
expand-response-params>   -idirafter
expand-response-params>   /nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib/gcc/x86_64-unknown-linux-gnu/8.3.0/include-fixed
expand-response-params>   -B/nix/store/pq1sjlnlg3jx7jcnlyjmazhm3in5nh5f-bootstrap-stage3-gcc-wrapper-/bin/
expand-response-params>   -frandom-seed=w18da4hqv9
expand-response-params>   -Wl\,-rpath
expand-response-params>   -Wl\,/nix/store/w18da4hqv9nlcxn0pyvmx58ilgdy196p-expand-response-params/lib64
expand-response-params>   -Wl\,-rpath
expand-response-params>   -Wl\,/nix/store/w18da4hqv9nlcxn0pyvmx58ilgdy196p-expand-response-params/lib
expand-response-params>   -L/nix/store/bf7l4af8rrfy1s17af4j2az0khlp5rw0-glibc-2.35/lib
expand-response-params>   -L/nix/store/i3ibpx67yncp4w4mpkf5pwvjjsd0aqln-bootstrap-tools/lib
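A minimal sketch of why the two orders disagree, assuming that both the `-B` lookup (for `crt1.o`) and the `-L` lookup (for `libc.so`) take the first directory that contains the file. The shortened store paths and the directory contents are assumptions for illustration; `first_match` is a hypothetical helper, not part of any wrapper script.

```python
# First-match lookup over an ordered list of search directories.
# Paths are shortened and directory contents are assumed for illustration.

BOOTSTRAP = "/...-bootstrap-tools/lib"
NEW_GLIBC = "/...-glibc-2.35/lib"

# Assumed contents: bootstrap-tools conflates libc files and compiler libs.
CONTENTS = {
    BOOTSTRAP: {"crt1.o", "libc.so", "libgcc_s.so"},
    NEW_GLIBC: {"crt1.o", "libc.so"},
}


def first_match(search_dirs, filename, contents=CONTENTS):
    """Return the first directory in search_dirs that contains filename."""
    for directory in search_dirs:
        if filename in contents.get(directory, set()):
            return directory
    return None


# Orders as they appear in the debug output above.
B_ORDER = [BOOTSTRAP, NEW_GLIBC]  # -B: old libc first
L_ORDER = [NEW_GLIBC, BOOTSTRAP]  # -L: new libc first

if __name__ == "__main__":
    print("crt1.o  from", first_match(B_ORDER, "crt1.o"))
    print("libc.so from", first_match(L_ORDER, "libc.so"))
```

With these inputs `crt1.o` resolves to the bootstrap directory and `libc.so` to the glibc-2.35 directory, which matches the `attempt to open ... succeeded` lines in the first comment.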

@trofi
Contributor Author

trofi commented Feb 3, 2022

Proposed a possible fix: #158047 (breaks more than it fixes)

@veprbl veprbl added the 6.topic: stdenv Standard environment label Feb 3, 2022
@trofi
Contributor Author

trofi commented Feb 11, 2022

How I understand the breakage:

  1. With respect to libraries, stdenv consists of three components: libc-so+crt, cc-libs+crt and gmp-mpc-mpfr-libs-so (plus the search paths to them).
  2. The initial stdenv uses prebuilt bootstrapTools for all 3 components.
  3. There are a few intermediate stdenv states during bootstrap to gradually switch from bootstrapTools prebuilts to nixpkgs-provided packages:
    ({}: {
      __raw = true;
      gcc-unwrapped = null;
      binutils = null;
      coreutils = null;
      gnugrep = null;
    })
    # Build a dummy stdenv with no GCC or working fetchurl. This is
    # because we need a stdenv to build the GCC wrapper and fetchurl.
    (prevStage: stageFun prevStage {
      name = "bootstrap-stage0";
      overrides = self: super: {
        # We thread stage0's stdenv through under this name so downstream stages
        # can use it for wrapping gcc too. This way, downstream stages don't need
        # to refer to this stage directly, which violates the principle that each
        # stage should only access the stage that came before it.
        ccWrapperStdenv = self.stdenv;
        # The Glibc include directory cannot have the same prefix as the
        # GCC include directory, since GCC gets confused otherwise (it
        # will search the Glibc headers before the GCC headers). So
        # create a dummy Glibc here, which will be used in the stdenv of
        # stage1.
        ${localSystem.libc} = self.stdenv.mkDerivation {
          pname = "bootstrap-stage0-${localSystem.libc}";
          version = "bootstrap";
          buildCommand = ''
            mkdir -p $out
            ln -s ${bootstrapTools}/lib $out/lib
          '' + lib.optionalString (localSystem.libc == "glibc") ''
            ln -s ${bootstrapTools}/include-glibc $out/include
          '' + lib.optionalString (localSystem.libc == "musl") ''
            ln -s ${bootstrapTools}/include-libc $out/include
          '';
        };
        gcc-unwrapped = bootstrapTools;
        binutils = import ../../build-support/bintools-wrapper {
          name = "bootstrap-stage0-binutils-wrapper";
          nativeTools = false;
          nativeLibc = false;
          buildPackages = { };
          libc = getLibc self;
          inherit lib;
          inherit (self) stdenvNoCC coreutils gnugrep;
          bintools = bootstrapTools;
        };
        coreutils = bootstrapTools;
        gnugrep = bootstrapTools;
      };
    })
    # Create the first "real" standard environment. This one consists
    # of bootstrap tools only, and a minimal Glibc to keep the GCC
    # configure script happy.
    #
    # For clarity, we only use the previous stage when specifying these
    # stages. So stageN should only ever have references for stage{N-1}.
    #
    # If we ever need to use a package from more than one stage back, we
    # simply re-export those packages in the middle stage(s) using the
    # overrides attribute and the inherit syntax.
    (prevStage: stageFun prevStage {
      name = "bootstrap-stage1";
      # Rebuild binutils to use from stage2 onwards.
      overrides = self: super: {
        binutils-unwrapped = super.binutils-unwrapped.override {
          gold = false;
        };
        inherit (prevStage)
          ccWrapperStdenv
          gcc-unwrapped coreutils gnugrep;
        ${localSystem.libc} = getLibc prevStage;
        # A threaded perl build needs glibc/libpthread_nonshared.a,
        # which is not included in bootstrapTools, so disable threading.
        # This is not an issue for the final stdenv, because this perl
        # won't be included in the final stdenv and won't be exported to
        # top-level pkgs as an override either.
        perl = super.perl.override { enableThreading = false; };
      };
    })
    # 2nd stdenv that contains our own rebuilt binutils and is used for
    # compiling our own Glibc.
    (prevStage: stageFun prevStage {
      name = "bootstrap-stage2";
      overrides = self: super: {
        inherit (prevStage)
          ccWrapperStdenv
          gcc-unwrapped coreutils gnugrep
          perl gnum4 bison;
        dejagnu = super.dejagnu.overrideAttrs (a: { doCheck = false; } );
        # We need libidn2 and its dependency libunistring as glibc dependency.
        # To avoid the cycle, we build against bootstrap libc, nuke references,
        # and use the result as input for our final glibc. We also pass this pair
        # through, so the final package-set uses exactly the same builds.
        libunistring = super.libunistring.overrideAttrs (attrs: {
          postFixup = attrs.postFixup or "" + ''
            ${self.nukeReferences}/bin/nuke-refs "$out"/lib/lib*.so.*.*
          '';
          # Apparently iconv won't work with bootstrap glibc, but it will be used
          # with glibc built later where we keep *this* build of libunistring,
          # so we need to trick it into supporting libiconv.
          am_cv_func_iconv_works = "yes";
        });
        libidn2 = super.libidn2.overrideAttrs (attrs: {
          postFixup = attrs.postFixup or "" + ''
            ${self.nukeReferences}/bin/nuke-refs -e '${lib.getLib self.libunistring}' \
              "$out"/lib/lib*.so.*.*
          '';
        });
        # This also contains the full, dynamically linked, final Glibc.
        binutils = prevStage.binutils.override {
          # Rewrap the binutils with the new glibc, so both the next
          # stage's wrappers use it.
          libc = getLibc self;
          # Unfortunately, when building gcc in the next stage, its LTO plugin
          # would use the final libc but `ld` would use the bootstrap one,
          # and that can fail to load. Therefore we upgrade `ld` to use newer libc;
          # apparently the interpreter needs to match libc, too.
          bintools = self.stdenvNoCC.mkDerivation {
            inherit (prevStage.bintools.bintools) name;
            dontUnpack = true;
            dontBuild = true;
            # We wouldn't need to *copy* all, but it's easier and the result is temporary anyway.
            installPhase = ''
              mkdir -p "$out"/bin
              cp -a '${prevStage.bintools.bintools}'/bin/* "$out"/bin/
              chmod +w "$out"/bin/ld.bfd
              patchelf --set-interpreter '${getLibc self}'/lib/ld*.so.? \
                --set-rpath "${getLibc self}/lib:$(patchelf --print-rpath "$out"/bin/ld.bfd)" \
                "$out"/bin/ld.bfd
            '';
          };
        };
      };
      # `libtool` comes with obsolete config.sub/config.guess that don't recognize Risc-V.
      extraNativeBuildInputs =
        lib.optional (localSystem.isRiscV) prevStage.updateAutotoolsGnuConfigScriptsHook;
    })
    # Construct a third stdenv identical to the 2nd, except that this
    # one uses the rebuilt Glibc from stage2. It still uses the recent
    # binutils and rest of the bootstrap tools, including GCC.
    (prevStage: stageFun prevStage {
      name = "bootstrap-stage3";
      overrides = self: super: rec {
        inherit (prevStage)
          ccWrapperStdenv
          binutils coreutils gnugrep
          perl patchelf linuxHeaders gnum4 bison libidn2 libunistring;
        ${localSystem.libc} = getLibc prevStage;
        # Link GCC statically against GMP etc. This makes sense because
        # these builds of the libraries are only used by GCC, so it
        # reduces the size of the stdenv closure.
        gmp = super.gmp.override { stdenv = self.makeStaticLibraries self.stdenv; };
        mpfr = super.mpfr.override { stdenv = self.makeStaticLibraries self.stdenv; };
        libmpc = super.libmpc.override { stdenv = self.makeStaticLibraries self.stdenv; };
        isl_0_20 = super.isl_0_20.override { stdenv = self.makeStaticLibraries self.stdenv; };
        gcc-unwrapped = super.gcc-unwrapped.override {
          isl = isl_0_20;
          # Use a deterministically built compiler
          # see https://github.com/NixOS/nixpkgs/issues/108475 for context
          reproducibleBuild = true;
          profiledCompiler = false;
        };
      };
      extraNativeBuildInputs = [ prevStage.patchelf ] ++
        # Many tarballs come with obsolete config.sub/config.guess that don't recognize aarch64.
        lib.optional (!localSystem.isx86 || localSystem.libc == "musl")
        prevStage.updateAutotoolsGnuConfigScriptsHook;
    })
    # Construct a fourth stdenv that uses the new GCC. But coreutils is
    # still from the bootstrap tools.
    (prevStage: stageFun prevStage {
      name = "bootstrap-stage4";
      overrides = self: super: {
        # Zlib has to be inherited and not rebuilt in this stage,
        # because gcc (since JAR support) already depends on zlib, and
        # then if we already have a zlib we want to use that for the
        # other purposes (binutils and top-level pkgs) too.
        inherit (prevStage) gettext gnum4 bison gmp perl texinfo zlib linuxHeaders libidn2 libunistring;
        ${localSystem.libc} = getLibc prevStage;
        binutils = super.binutils.override {
          # Don't use stdenv's shell but our own
          shell = self.bash + "/bin/bash";
          # Build expand-response-params with last stage like below
          buildPackages = {
            inherit (prevStage) stdenv;
          };
        };
        gcc = lib.makeOverridable (import ../../build-support/cc-wrapper) {
          nativeTools = false;
          nativeLibc = false;
          isGNU = true;
          buildPackages = {
            inherit (prevStage) stdenv;
          };
          cc = prevStage.gcc-unwrapped;
          bintools = self.binutils;
          libc = getLibc self;
          inherit lib;
          inherit (self) stdenvNoCC coreutils gnugrep;
          shell = self.bash + "/bin/bash";
        };
      };
      extraNativeBuildInputs = [ prevStage.patchelf prevStage.xz ] ++
        # Many tarballs come with obsolete config.sub/config.guess that don't recognize aarch64.
        lib.optional (!localSystem.isx86 || localSystem.libc == "musl")
        prevStage.updateAutotoolsGnuConfigScriptsHook;
    })
    # Construct the final stdenv. It uses the Glibc and GCC, and adds
    # in a new binutils that doesn't depend on bootstrap-tools, as well
    # as dynamically linked versions of all other tools.
    #
    # When updating stdenvLinux, make sure that the result has no
    # dependency (`nix-store -qR') on bootstrapTools or the first
    # binutils built.
    (prevStage: {
      inherit config overlays;
      stdenv = import ../generic rec {
        name = "stdenv-linux";
        buildPlatform = localSystem;
        hostPlatform = localSystem;
        targetPlatform = localSystem;
        inherit config;
        preHook = commonPreHook;
        initialPath =
          ((import ../common-path.nix) {pkgs = prevStage;});
        extraNativeBuildInputs = [ prevStage.patchelf ] ++
          # Many tarballs come with obsolete config.sub/config.guess that don't recognize aarch64.
          lib.optional (!localSystem.isx86 || localSystem.libc == "musl")
          prevStage.updateAutotoolsGnuConfigScriptsHook;
        cc = prevStage.gcc;
        shell = cc.shell;
        inherit (prevStage.stdenv) fetchurlBoot;
        extraAttrs = {
          # TODO: remove this!
          inherit (prevStage) glibc;
          inherit bootstrapTools;
          shellPackage = prevStage.bash;
        };
        # Mainly avoid reference to bootstrap tools
        allowedRequisites = with prevStage; with lib;
          # Simple executable tools
          concatMap (p: [ (getBin p) (getLib p) ]) [
            gzip bzip2 xz bash binutils.bintools coreutils diffutils findutils
            gawk gnumake gnused gnutar gnugrep gnupatch patchelf ed
          ]
          # Library dependencies
          ++ map getLib (
            [ attr acl zlib pcre libidn2 libunistring ]
            ++ lib.optional (gawk.libsigsegv != null) gawk.libsigsegv
          )
          # More complicated cases
          ++ (map (x: getOutput x (getLibc prevStage)) [ "out" "dev" "bin" ] )
          ++ [ /*propagated from .dev*/ linuxHeaders
            binutils gcc gcc.cc gcc.cc.lib gcc.expand-response-params
          ]
          ++ lib.optionals (!localSystem.isx86 || localSystem.libc == "musl")
            [ prevStage.updateAutotoolsGnuConfigScriptsHook prevStage.gnu-config ];
        overrides = self: super: {
          inherit (prevStage)
            gzip bzip2 xz bash coreutils diffutils findutils gawk
            gnumake gnused gnutar gnugrep gnupatch patchelf
  4. Each subsequent stdenv stage tries to change only one of libc-so/cc-libs/gmp-mpc-mpfr-libs-so by tweaking -B/-L/-rpath flags.

The problem is that the initial bootstrapTools archive has both libc-so+crt and cc-libs+crt conflated into ${bootstrapTools}/lib. This makes it hard to swap out just libc-so+crt if ${bootstrapTools}/lib is already first in the -L/-B search path: it always pulls in both libc-so+crt and cc-libs+crt.

My understanding of the bootstrap sequence:

  • stage0: use libc-so+crt, cc-libs+crt, other-libs-so from bootstrap
  • stage2: build libc-so+crt
  • stage3: build gmp-mpc-mpfr-libs-so
  • stage4: build cc-libs+crt

I think if we were to unconflate all 3 paths early by splitting bootstrapTools into 3 sets of directories it would be easier to reason about the swapping effect of the components.

Otherwise we can try to fix and maintain -L and -B order to always follow:

  • -L/-B libc-so+crt
  • -L/-B gmp-mpc-mpfr-libs-so
  • -L/-B cc-libs+crt

It sounds fragile to maintain across all of nix-support/libc-cflags, nix-support/libc-crt1-cflags and nix-support/cc-cflags-before.
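One way to guard against such drift is to check a final compiler command line for consistent ordering. The sketch below is an assumption-laden illustration, not an existing nixpkgs check: `dirs_for` and `order_is_consistent` are hypothetical helpers, the paths are shortened, and only directories that appear under both `-B` and `-L` are compared.

```python
# Check that directories shared by -B and -L flags appear in the same order.
# Helper names and shortened paths are hypothetical, for illustration only.


def dirs_for(flags, prefix):
    """Collect directories from flags like '-B/dir' or '-L/dir', in order."""
    return [f[len(prefix):].rstrip("/") for f in flags if f.startswith(prefix)]


def order_is_consistent(flags):
    """True if directories common to -B and -L occur in the same relative order."""
    b_dirs = dirs_for(flags, "-B")
    l_dirs = dirs_for(flags, "-L")
    common = set(b_dirs) & set(l_dirs)
    return [d for d in b_dirs if d in common] == [d for d in l_dirs if d in common]


# Flag order modelled on the debug output in this issue (paths shortened).
BROKEN = [
    "-B/...-bootstrap-tools/lib",
    "-B/...-glibc-2.35/lib/",
    "-B/...-bootstrap-stage3-gcc-wrapper-/bin/",
    "-L/...-glibc-2.35/lib",
    "-L/...-bootstrap-tools/lib",
]

# Same flags with the two libc-related -B entries swapped.
FIXED = [
    "-B/...-glibc-2.35/lib/",
    "-B/...-bootstrap-tools/lib",
    "-B/...-bootstrap-stage3-gcc-wrapper-/bin/",
    "-L/...-glibc-2.35/lib",
    "-L/...-bootstrap-tools/lib",
]

if __name__ == "__main__":
    print("broken order consistent:", order_is_consistent(BROKEN))
    print("fixed order consistent: ", order_is_consistent(FIXED))
```

Trailing slashes are stripped before comparing, since the log shows the same directory written both with and without one.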

@trofi
Contributor Author

trofi commented Feb 11, 2022

Fixed ordering with proposed #158047. It does not fix glibc-2.35 yet, but I think it's an improvement.

@trofi
Contributor Author

trofi commented Feb 11, 2022

> Fixed ordering with proposed #158047. It does not fix glibc-2.35 yet, but I think it's an improvement.

Found a minor bug in my glibc-2.35 update. With v2 of that update, the fix in #158047 is enough to get nixpkgs bootstrapped.

trofi added a commit to trofi/nixpkgs that referenced this issue Feb 20, 2022
In NixOS#158042 I noticed order
mismatch as a bootstrap build failure when building x86_64-linux
against glibc-2.35 in nixpkgs (bootstrap libs has glibc-2.27):

    expand-response-params> ld: /nix/store/p4s4jf7aq6v6z9iazll1aiqwb34aqxq9-bootstrap-tools/lib/crt1.o: in function `_start':
    expand-response-params> /build/glibc-2.27/csu/../sysdeps/x86_64/start.S:101: undefined reference to `__libc_csu_fini'
    expand-response-params> ld: /build/glibc-2.27/csu/../sysdeps/x86_64/start.S:102: undefined reference to `__libc_csu_init'
    expand-response-params> collect2: error: ld returned 1 exit status

Here crt1.o from glibc-2.27 links against libc.so.6 from glibc-2.35.

This happens because ordering of `-L` (influences `libc.so` lookup) and
`-B` (influences `crt1.o` lookup) flags differs:

    expand-response-params>   -B/...-bootstrap-tools/lib
    expand-response-params>   -B/...-glibc-2.35/lib/
    ...
    expand-response-params>   -L/...-glibc-2.35/lib
    expand-response-params>   -L/...-bootstrap-tools/lib

The change makes consistent ordering of `-L`/`-B` and allows getting to
stage4 for `glibc-2.35` target.

gentoo-bot pushed a commit to gentoo/prefix that referenced this issue Sep 3, 2022
slyfox explains it well in the linked bugs, but
the gist is that we mostly got lucky for a while.

We would mix system crt*.o with just-built libc_nonshared.a/libc.so
which led to issues like:
```
configure:4383: x86_64-pc-linux-gnu-gcc -O2 -pipe -O2 -pipe  -L/home/share/gentoo/usr/lib64 -Wl,--dynamic-linker=/home/share/gentoo/lib64/ld-linux-x86-64.so.2 conftest.c  >&5
/home/share/gentoo/tmp/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: /lib/../lib64/Scrt1.o: in function `_start':
(.text+0x16): undefined reference to `__libc_csu_fini'
/home/share/gentoo/tmp/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: (.text+0x1d): undefined reference to `__libc_csu_init'
```

We need to force GCC to use the Prefix version
of glibc we just built.

Bug: NixOS/nixpkgs#158042
Closes: https://bugs.gentoo.org/824482
Thanks-to: Bart Oldeman <[email protected]>
Thanks-to: Sergei Trofimovich <[email protected]>
Signed-off-by: Sam James <[email protected]>

@trofi trofi closed this as completed Dec 27, 2023