GHC-8.6.5 bootstrap binary installation segfault on musl #118731

Closed
nomeata opened this issue Apr 7, 2021 · 9 comments

@nomeata
Contributor

nomeata commented Apr 7, 2021

Issue description

On today's master (229aff8), I observe this:

$ nix-build -A pkgsMusl.buildPackages.haskell.compiler.ghc865Binary setfault
unpacking sources
unpacking source archive /nix/store/43n0mrmxkv86baxzisc986x36c2v3sjy-ghc-8.6.5-x86_64-fedora27-linux.tar.xz
source root is ghc-8.6.5
patching script interpreter paths in ghc-8.6.5/utils/
ghc-8.6.5/utils/ghc-cabal/dist-install/build/tmp/ghc-cabal-bindist: interpreter directive changed from "#!/bin/sh" to "/nix/store/8z553rdkg2h2892snrg6fakw354p3bxn-bash-4.4-p23/bin/sh"
…
Patchelfing /nix/store/iz8azixwl6vzzd87q1p38vlsm3hb21aa-ghc-8.6.5-binary/lib/ghc-8.6.5/bin/hp2ps
Patchelfing /nix/store/iz8azixwl6vzzd87q1p38vlsm3hb21aa-ghc-8.6.5-binary/lib/ghc-8.6.5/bin/ghc-iserv
Patchelfing /nix/store/iz8azixwl6vzzd87q1p38vlsm3hb21aa-ghc-8.6.5-binary/lib/ghc-8.6.5/bin/runghc
Patchelfing /nix/store/iz8azixwl6vzzd87q1p38vlsm3hb21aa-ghc-8.6.5-binary/lib/ghc-8.6.5/ghc-boot-th-8.6.5/libHSghc-boot-th-8.6.5-ghc8.6.5.so
running install tests
/nix/store/6my73ym207a4ds2szcl9wnr1gby10q56-stdenv-linux/setup: line 1318:  7439 Segmentation fault      (core dumped) $out/bin/ghc --make main.hs
builder for '/nix/store/00x82zgp94rf4682gzq63d2nlldzvfmb-ghc-8.6.5-binary.drv' failed with exit code 1

This smelled a bit like the issue fixed in #103183, but, well, that fix is already included in master.

@cdepillabout
Member

@nomeata Thanks for reporting this.

As you're probably aware, I don't think any of the Haskell maintainers here in Nixpkgs really check up to make sure the musl stuff is always working.

I'd be happy to merge a fix for this if you (or anyone else) can figure out a solution to this problem.

@nh2
Contributor

nh2 commented Jul 2, 2021

I have bisected this as it is a blocker for static-haskell-nix, see #43795 (comment).

Command:

NIX_PATH=nixpkgs=. nix-build --no-link -A pkgsMusl.haskell.compiler.ghc865Binary

There's another issue that hides the segfault in some cases (resulting in recompile with -fPIE instead), which I have also bisected and will soon do a writeup for; to work around that for this bisection, you need to apply this patch:

--- a/pkgs/development/compilers/ghc/8.6.5-binary.nix
+++ b/pkgs/development/compilers/ghc/8.6.5-binary.nix
@@ -150,6 +150,8 @@ stdenv.mkDerivation rec {
     done
   '';
 
+  hardeningDisable = builtins.trace "ghc targetPlatform.isMusl = ${toString stdenv.targetPlatform.isMusl}" lib.optional stdenv.targetPlatform.isMusl "pie";
+
   doInstallCheck = true;
   installCheckPhase = ''
     unset ${libEnvVar}

With that, I found the bad commit: 5e2311d, a musl upgrade from PR #117375:

5e2311d2fb2e375a304a9d6d38b6fa445f158a75 is the first bad commit
commit 5e2311d2fb2e375a304a9d6d38b6fa445f158a75
Author: TredwellGit <[email protected]>
Date:   Tue Mar 23 15:45:23 2021 +0000

    musl: 1.2.1 -> 1.2.2

    https://git.musl-libc.org/cgit/musl/tree/WHATSNEW?h=v1.2.2#n2242

 pkgs/os-specific/linux/musl/default.nix | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

Bisection log:

# git bisect log
git bisect start
# good: [6245758fd522c9dd14296a06f7a3ac25236e9222] Merge pull request #111257 from r-ryantm/auto-update/haproxy
git bisect good 6245758fd522c9dd14296a06f7a3ac25236e9222
# bad: [830ef6422f643d5c639fd79bca834c726787ec51] haskell-generic-builder: disable static PIE
git bisect bad 830ef6422f643d5c639fd79bca834c726787ec51
# bad: [1433dad051b601261d39de1bbb63dc030018efd7] Merge pull request #117797 from r-ryantm/auto-update/lazydocker
git bisect bad 1433dad051b601261d39de1bbb63dc030018efd7
# good: [c456a2512f7a7558cbe25328a423762033822cc0] Merge master into staging-next
git bisect good c456a2512f7a7558cbe25328a423762033822cc0
# good: [8ecf143cb0cfe93cb50337865f5dbceceeee0d0a] Merge pull request #113222 from Pacman99/fix-notification-center
git bisect good 8ecf143cb0cfe93cb50337865f5dbceceeee0d0a
# good: [8a945c941e0b04919edb0b1b2673d67390fdf566] Merge pull request #116994 from AndersonTorres/new-yabasic
git bisect good 8a945c941e0b04919edb0b1b2673d67390fdf566
# good: [70b387da701255c1f9a166ca7f97231cbcb087e0] Merge pull request #117340 from fabaff/bump-metasploit
git bisect good 70b387da701255c1f9a166ca7f97231cbcb087e0
# bad: [350f9bd822e1c5017482e6364dbafda36ff4157a] Merge pull request #117570 from FRidh/python2alias
git bisect bad 350f9bd822e1c5017482e6364dbafda36ff4157a
# bad: [c1cd574165cf254cc230d703802bafe7eb22d727] Merge pull request #117498 from rmcgibbo/cherrypy
git bisect bad c1cd574165cf254cc230d703802bafe7eb22d727
# bad: [070bfc96b7446e7d2653a1913b9bc4082d529638] python38Packages.xdis: 5.0.5 -> 5.0.8 (#117364)
git bisect bad 070bfc96b7446e7d2653a1913b9bc4082d529638
# bad: [d969cf2f42e3eb300a26d2bfe6bf4fb908230fd2] Merge pull request #117387 from AndersonTorres/new-zziplib
git bisect bad d969cf2f42e3eb300a26d2bfe6bf4fb908230fd2
# good: [5e5ae827f586175b2bb5954a531f84dc3e0856c0] Merge pull request #117367 from rmcgibbo/sklearn-deap
git bisect good 5e5ae827f586175b2bb5954a531f84dc3e0856c0
# bad: [4d709f381abaefa1cc5616164b1b72178c2d1a69] Merge pull request #117048 from AndersonTorres/new-mksh
git bisect bad 4d709f381abaefa1cc5616164b1b72178c2d1a69
# good: [7e87c10a9891f8454e8b2c5df54b52215e399135] notcurses: 2.2.2 -> 2.2.3
git bisect good 7e87c10a9891f8454e8b2c5df54b52215e399135
# bad: [dfd5d237d95782f6dbde223751ccc01439fd68aa] links2: 2.21 -> 2.22
git bisect bad dfd5d237d95782f6dbde223751ccc01439fd68aa
# bad: [77f3022296b33624fce6856901d83c55538b0e30] Merge pull request #117375 from TredwellGit/musl
git bisect bad 77f3022296b33624fce6856901d83c55538b0e30
# bad: [5e2311d2fb2e375a304a9d6d38b6fa445f158a75] musl: 1.2.1 -> 1.2.2
git bisect bad 5e2311d2fb2e375a304a9d6d38b6fa445f158a75
# first bad commit: [5e2311d2fb2e375a304a9d6d38b6fa445f158a75] musl: 1.2.1 -> 1.2.2

@sternenseemann
Member

I guess we should report this upstream, probably to musl; it looks like a regression on their part. The other possibility is that GHC has unfair expectations.

@nh2
Contributor

nh2 commented Jul 2, 2021

I am now bisecting the commits between musl 1.2.1 and 1.2.2, so that we can ask a more informed question upstream.

@nh2
Contributor

nh2 commented Jul 3, 2021

I have finished bisecting musl. It suggests that this is the commit that introduces the segfault:

http://git.musl-libc.org/cgit/musl/commit/?id=57f6e85c9de417fef5eece2a5b00c1104321f543

remove redundant pthread struct members repeated for layout purposes

As written in the bisection notes below, the problematic commit is sandwiched between a commit that introduces a GCC compile failure, and its fix. This made bisecting on the upstream git history impossible. So I reordered the upstream history to squash the GCC compile fix into the compile-breaking commit, in my branch ghc-bin-segfault-bisect-musl-reordered-history.

With that, I did an automatic git bisect run with (cd ../../nixpkgs && nix-build --no-link -A pkgsMusl.haskell.compiler.ghc865Binary) on the musl repo with my branch, where in ../../nixpkgs I had the musl package derivation overridden to have src = /path/to/musl/checkout; so that I could build ghc with nix.
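For anyone repeating this, it helps that `git bisect run` treats the exit code of each step as a verdict: 0 means good, 125 means skip, and any other code from 1 to 127 means bad. A minimal sketch of a wrapper that separates the segfault from unrelated build failures follows. The function names `classify_build` and `bisect_step` are hypothetical, and it assumes the segfault shows up in the build log as the text "Segmentation fault", as in the log at the top of this issue. This is not the exact script used for the bisection described here.

```shell
# Hypothetical wrapper for `git bisect run`; a sketch, not the script used in this thread.
# git bisect run interprets the exit code of each step as:
#   0 = good, 125 = skip (cannot be tested), any other code in 1..127 = bad.

# classify_build <exit-status> <log-file>: print the exit code to hand to git bisect.
classify_build() {
  if [ "$1" -eq 0 ]; then
    echo 0      # build and install check passed: good
  elif grep -q 'Segmentation fault' "$2"; then
    echo 1      # the failure being hunted: bad
  else
    echo 125    # some other failure, e.g. GCC not compiling: skip
  fi
}

# bisect_step: build GHC against the currently checked-out musl and classify the result.
# Assumes the nixpkgs checkout at ../../nixpkgs has musl's src pointed at this repo.
bisect_step() {
  log="$(mktemp)"
  (cd ../../nixpkgs && nix-build --no-link -A pkgsMusl.haskell.compiler.ghc865Binary) >"$log" 2>&1
  rc=$?
  return "$(classify_build "$rc" "$log")"
}
```

Saved as a script that ends by calling `bisect_step` and exiting with its status, this could be handed to `git bisect run`, so that commits where GCC fails to build are skipped instead of being marked bad.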

I still don't know why that commit would cause the segfault.

My bisection notes

  • Bisecting now on musl with commits between v1.2.1 and v1.2.2:
    • Command: git bisect run with (cd ../../nixpkgs && nix-build --no-link -A pkgsMusl.haskell.compiler.ghc865Binary)
    • Top terminal: 1.2.1 -- good
    • Result:
      4d5786544bb52c62fc1ae84d91684ef2268afa05 is the first bad commit
      commit 4d5786544bb52c62fc1ae84d91684ef2268afa05
      Author: Rich Felker <[email protected]>
      Date:   Mon Aug 24 12:29:30 2020 -0400
      
          add tcgetwinsize and tcsetwinsize functions, move struct winsize
      
          these have been adopted for future issue of POSIX as the outcome of
          Austin Group issue 1151, and are simply functions performing the roles
          of the historical ioctls. since struct winsize is being standardized
          along with them, its definition is moved to the appropriate header.
      
          there is some chance this will break source files that expect struct
          winsize to be defined by sys/ioctl.h without including termios.h. if
          this happens, further changes will be needed to have sys/ioctl.h
          expose it too.
      
      # git bisect log
      git bisect start
      # bad: [85e0e3519655220688e757b9d5bfd314923548bd] release 1.2.2
      git bisect bad 85e0e3519655220688e757b9d5bfd314923548bd
      # good: [73cc775bee53300c7cf759f37580220b18ac13d3] release 1.2.1
      git bisect good 73cc775bee53300c7cf759f37580220b18ac13d3
      # bad: [bd153422f28634bb6e53f13f80beb8289d405267] implement _Fork and refactor fork using it
      git bisect bad bd153422f28634bb6e53f13f80beb8289d405267
      # bad: [262003ad9d8894c03fa4b033140e1e14e4c24c4d] fix missing newline in herror output
      git bisect bad 262003ad9d8894c03fa4b033140e1e14e4c24c4d
      # good: [9d4b25b4738dbabf628055601d96ba0609c2b4a8] fix MUSL_LOCPATH search
      git bisect good 9d4b25b4738dbabf628055601d96ba0609c2b4a8 # done that
      # bad: [e7f808e3595ad3111edec57270bdc088f64a418b] configure: add further -Werror=... options to detected CFLAGS
      git bisect bad e7f808e3595ad3111edec57270bdc088f64a418b
      # bad: [19f8642494b7d27b2ceed5c14d4a0b27cb749afe] report res_query failures, including nxdomain/nodata, via h_errno
      git bisect bad 19f8642494b7d27b2ceed5c14d4a0b27cb749afe
      # bad: [9d0b8b92a508c328e7eac774847f001f80dfb5ff] make h_errno thread-local
      git bisect bad 9d0b8b92a508c328e7eac774847f001f80dfb5ff
      # bad: [4d5786544bb52c62fc1ae84d91684ef2268afa05] add tcgetwinsize and tcsetwinsize functions, move struct winsize
      git bisect bad 4d5786544bb52c62fc1ae84d91684ef2268afa05
      # first bad commit: [4d5786544bb52c62fc1ae84d91684ef2268afa05] add tcgetwinsize and tcsetwinsize functions, move struct winsize
      But there are gcc compile failures in between, so this is not a correct bisection.
      The only thing that the above bisection shows is that the commit add tcgetwinsize breaks GCC.
      I suspect it is fixed by this followup commit:
      commit 1ccc804e1345c6e59294f561ac43c3e55ccea1e4
      Author: Rich Felker <[email protected]>
      Date:   Sun Aug 30 16:47:40 2020 -0400
      
          fix regression with applications that expect struct winsize in ioctl.h
          
          putting the (simple) definition in alltypes.h seems like the best
          solution here. making sys/ioctl.h implicitly include termios.h is
          probably excess namespace pollution.
      
      # git bisect log
      git bisect start
      # bad: [85e0e3519655220688e757b9d5bfd314923548bd] release 1.2.2
      git bisect bad 85e0e3519655220688e757b9d5bfd314923548bd
      # good: [73cc775bee53300c7cf759f37580220b18ac13d3] release 1.2.1
      git bisect good 73cc775bee53300c7cf759f37580220b18ac13d3
      # bad: [bd153422f28634bb6e53f13f80beb8289d405267] implement _Fork and refactor fork using it
      git bisect bad bd153422f28634bb6e53f13f80beb8289d405267
      # bad: [262003ad9d8894c03fa4b033140e1e14e4c24c4d] fix missing newline in herror output
      git bisect bad 262003ad9d8894c03fa4b033140e1e14e4c24c4d
      # good: [9d4b25b4738dbabf628055601d96ba0609c2b4a8] fix MUSL_LOCPATH search
      git bisect good 9d4b25b4738dbabf628055601d96ba0609c2b4a8
      # skip: [e7f808e3595ad3111edec57270bdc088f64a418b] configure: add further -Werror=... options to detected CFLAGS
      git bisect skip e7f808e3595ad3111edec57270bdc088f64a418b
      # ^ due to:
      # ../../gcc-10.2.0/gcc/diagnostic.c: In function ‘int get_terminal_width()’:
      # ../../gcc-10.2.0/gcc/diagnostic.c:123:18: error: aggregate ‘get_terminal_width()::winsize w’ has incomplete type and cannot be defined
      #    struct winsize w;
      #                   ^
      # skip: [19f8642494b7d27b2ceed5c14d4a0b27cb749afe] report res_query failures, including nxdomain/nodata, via h_errno
      git bisect skip 19f8642494b7d27b2ceed5c14d4a0b27cb749afe
      # ^ due to: same gcc build failure
      # skip: [0a312d34b98940f6543b4ae07077d1d59d0afe5b] configure: use additive warnings instead of subtracting from -Wall
      git bisect skip 0a312d34b98940f6543b4ae07077d1d59d0afe5b
      # ^ due to: same gcc build failure
      # bad: [cab0a8fb8d9a1095de3e4c2227bfc37fae93f781] clean up overinclusion in files using TIOCGWINSZ
      git bisect bad cab0a8fb8d9a1095de3e4c2227bfc37fae93f781
      # skip: [3a5b9ae7cf656648c80fe155a5239d9b4fb4c485] deduplicate __pthread_self thread pointer adjustment out of each arch
      git bisect skip 3a5b9ae7cf656648c80fe155a5239d9b4fb4c485
      # ^ due to: same gcc build failure
      git bisect reset
      At this point I'm aborting this bisect because I've checked manually that every commit in the remaining range will have the GCC compile problem.
      Instead, change musl's history to fix GCC, and bisect on that.
      Narrower range for addressing the musl->gcc compile failure by reordering the musl history, squashing the fix for struct winsize into the change commit:
      bad:  262003ad9d8894c03fa4b033140e1e14e4c24c4d
      good: 9d4b25b4738dbabf628055601d96ba0609c2b4a8
      
      In dir ~/src/musl-bisect-1:
      git checkout -b ghc-bin-segfault-bisect-musl-reordered-history 262003ad9d8894c03fa4b033140e1e14e4c24c4d
      git rebase -i 9d4b25b4738dbabf628055601d96ba0609c2b4a8
      # pick 4d578654 add tcgetwinsize and tcsetwinsize functions, move struct winsize
      # squash 1ccc804e fix regression with applications that expect struct winsize in ioctl.h
      git checkout aa129861 # `add tcgetwinsize ...` with squashed `fix regression ... struct winsize ...`
      I pushed that branch here for others to reproduce: https://github.com/nh2/musl/commits/ghc-bin-segfault-bisect-musl-reordered-history
      • Check that the above checkout commit aa129861 does not segfault.
        • Yes, does not segfault.
      • Check that ghc-bin-segfault-bisect-musl-reordered-history (commit d1cb169b) segfaults.
        • Yes, does segfault.
          Bisect that range:
      01c7920f05b62eb41d1acc325e5ba326c435da4c is the first bad commit
      commit 01c7920f05b62eb41d1acc325e5ba326c435da4c
      Author: Rich Felker <[email protected]>
      Date:   Mon Aug 24 22:45:51 2020 -0400
      
          remove redundant pthread struct members repeated for layout purposes
      
          dtv_copy, canary2, and canary_at_end existed solely to match multiple
          ABI and asm-accessed layouts simultaneously. now that pthread_arch.h
          can be included before struct __pthread is defined, the struct layout
          can depend on macros defined by pthread_arch.h.
      
      # git bisect log
      git bisect start
      # good: [aa129861dc72514cf19041408e5c3d415036b7f5] add tcgetwinsize and tcsetwinsize functions, move struct winsize
      git bisect good aa129861dc72514cf19041408e5c3d415036b7f5
      # bad: [d1cb169bba96de30d0748ec3380cdff42269fe04] fix missing newline in herror output
      git bisect bad d1cb169bba96de30d0748ec3380cdff42269fe04
      # bad: [93574eb5dd2a1f1f24f4ca04be481446ad3505dd] configure: add further -Werror=... options to detected CFLAGS
      git bisect bad 93574eb5dd2a1f1f24f4ca04be481446ad3505dd
      # good: [ea72959b0cb8cab5e7cf6b850fcfc79d86ca29db] deduplicate TP_ADJ logic out of each arch, replace with TP_OFFSET
      git bisect good ea72959b0cb8cab5e7cf6b850fcfc79d86ca29db
      # bad: [01c7920f05b62eb41d1acc325e5ba326c435da4c] remove redundant pthread struct members repeated for layout purposes
      git bisect bad 01c7920f05b62eb41d1acc325e5ba326c435da4c
      # good: [ef1591337091e0d4391ad7144a5c9fbb0f0cfde2] deduplicate __pthread_self thread pointer adjustment out of each arch
      git bisect good ef1591337091e0d4391ad7144a5c9fbb0f0cfde2
      # first bad commit: [01c7920f05b62eb41d1acc325e5ba326c435da4c] remove redundant pthread struct members repeated for layout purposes
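The history reordering in the notes above (folding the `struct winsize` fix into the commit that broke GCC) can be tried out on a throwaway repository first. The sketch below uses `git cherry-pick` instead of the interactive rebase from the notes, since that is easier to script. All file names, commit messages, and the `commit` helper are made up for illustration, and nothing here touches the real musl history.

```shell
# Sketch: fold a later fix into the commit that broke the build, in a scratch repo.
# File names and commit messages are invented for illustration.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.email "bisect@example.invalid"
git config user.name "bisect sketch"

commit() {  # commit <file> <content> <message>
  printf '%s\n' "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt  base   "base (known good)"
base="$(git rev-parse HEAD)"
commit build.txt broken "breaking change"
breaking="$(git rev-parse HEAD)"
commit other.txt other  "unrelated change"
other="$(git rev-parse HEAD)"
commit build.txt fixed  "fix for breaking change"
fix="$(git rev-parse HEAD)"

# Rebuild the history on a new branch: breaking change and fix as one commit,
# followed by the commit that originally sat between them.
git checkout -q -b reordered "$base"
git cherry-pick "$breaking" >/dev/null
git cherry-pick --no-commit "$fix"     # stage the fix without committing it
git commit -q --amend --no-edit        # fold it into the breaking commit
git cherry-pick "$other" >/dev/null
```

After this, every commit on `reordered` builds, so a bisect over that branch only ever sees the failure being hunted.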

nh2 added a commit to nh2/nixpkgs that referenced this issue Jul 5, 2021
…compiler.

This addresses the fact that `ghc865Binary` segfaults on musl
(see NixOS#118731) because of the glibc+musl mix used in there.

With the previous commits, `ghc8102Binary` was changed to use
the musl-based bindist from GHC HQ instead, which works.

With this change, all nix Haskell compilers build on musl:

    NIX_PATH=nixpkgs=. nix-build --no-link --expr 'with import <nixpkgs> {}; { inherit (pkgsMusl.haskell.compiler) ghc884 ghc8104 ghc901 ghcHEAD; }'
@nh2
Contributor

nh2 commented Jul 5, 2021

PR at #129289.

nh2 added a commit to nh2/nixpkgs that referenced this issue Jul 5, 2021
…OS#118731 NixOS#129247.

This commit replaces the musl + glibc hackery in the GHC bindist
compiler by using the new musl based bindist that GHC HQ provides
(built on Alpine).
We could alternatively also use a nix-built musl bootstrap compiler,
but it seems nicer to use the GHC HQ one for now.

This fixes the compiler built by
`pkgsMusl.haskell.compiler.ghc8102Binary` segfaulting (NixOS#118731)
since the commit

    5e2311d - musl: 1.2.1 -> 1.2.2

concretely, musl commit

    01c7920f - remove redundant pthread struct members repeated for layout purposes

which I suspect breaks some glibc/musl ABI compatibility that may have
existed accidentally until then.

The added

    lib.optional stdenv.targetPlatform.isMusl "pie";

also fixes that the packaged bindist compiler cannot create a binary
in its `installCheck` phase (and overall); see the detailed explanation
in NixOS#129247.
sternenseemann pushed a commit to sternenseemann/nixpkgs that referenced this issue Jul 24, 2021
@srid
Contributor

srid commented Jul 28, 2021

Trying nixpkgs post this commit, I get

ln: failed to access 'libpython3.9.so.1.0': No such file or directory

GHC 8.10 uses Python as a library, which fails to build (libpython apparently is being built statically). More context here: srid/neuron#626 (comment)

Not sure what I'm missing ... is there anything in particular to do here?

@sternenseemann
Member

sternenseemann commented Jul 28, 2021

This is known, but unrelated to our GHC derivation, see #131557. It fails while building python3, not GHC.

One workaround could be to build with an overlay like this for now:

self: super: {
  python3 = super.python3.override { enableLTO = false; };
}

@srid
Contributor

srid commented Jul 28, 2021

@sternenseemann Thanks; that workaround did it.
