Nix installer failed on MacOS #7702

Open
1 of 3 tasks
Alizter opened this issue Jan 28, 2023 · 4 comments

Comments

Alizter commented Jan 28, 2023

Platform

I am getting CI failures on multiple projects that use Nix on macOS.

  • Linux:
  • macOS
  • WSL

Additional information

Output

  ~~> Setting up the default profile
  installing 'nix-2.13.2'
  building '/nix/store/kbr12hx48bb2l9sh75kx9xh91b2kvyiy-user-environment.drv'...
  installing 'nss-cacert-3.83'
  error: unexpected EOF reading a line
  (use '--show-trace' to show detailed location information)
  
  ---- oh no! --------------------------------------------------------------------
  Oh no, something went wrong. If you can take all the output and open
  an issue, we'd love to fix the problem so nobody else has this issue.
  
  :(
  
  We'd love to help if you need it.
  
  You can open an issue at
  https://github.com/NixOS/nix/issues/new?labels=installer&template=installer.md
  
  Or get in touch with the community: https://nixos.org/community

Priorities

Add 👍 to issues you find important.

Alizter changed the title from "Nix installer failed with fish on MacOS" to "Nix installer failed on MacOS" Jan 28, 2023
abathur (Member) commented Jan 28, 2023

Do the jobs complete if you re-run them?

Alizter (Author) commented Jan 28, 2023

@abathur I think so, but I thought it was worth reporting anyway.

abathur (Member) commented Jan 28, 2023

If so, you're probably running into:

Alizter (Author) commented Jan 28, 2023

@abathur yes, in that case you may close this. Thanks!

edolstra added a commit to edolstra/nix that referenced this issue Mar 15, 2023
Hopefully this fixes "unexpected EOF" failures on macOS
(NixOS#3137, NixOS#3605, NixOS#7242, NixOS#7702).

The problem appears to be that under some circumstances, macOS
discards the output written to the slave side of the
pseudoterminal. Hence the parent never sees the "sandbox initialized"
message from the child, even though it succeeded. The conditions are:

* The child finishes very quickly. That's why this bug is likely to
  trigger in nix-env tests, since that uses a builtin builder. Adding
  a short sleep before the child exits makes the problem go away.

* The parent has closed its duplicate of the slave file
  descriptor. This shouldn't matter, since the child has a duplicate
  as well, but it does. E.g. moving the close to the bottom of
  startBuilder() makes the problem go away. However, that's not a
  solution because it would make Nix hang if the child dies before
  sending the "sandbox initialized" message.

* The system is under high load. E.g. "make installcheck -j16" makes
  the issue pretty reproducible, while it's very rare under "make
  installcheck -j1".

As a fix/workaround, we now open the pseudoterminal slave in the
child, rather than the parent. This removes the second condition
(i.e. the parent no longer needs to close the slave fd) and I haven't
been able to reproduce the "unexpected EOF" with this.
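
To illustrate why lost child output surfaces as "unexpected EOF reading a line", here is a rough Python sketch of a parent waiting for a readiness line from a child. It is not Nix's code: it uses a plain pipe instead of a pseudoterminal, it simulates the discarded output by having the child write nothing, and the names READY_LINE, read_line and handshake are hypothetical.

```python
import os

# Hypothetical marker; the exact bytes Nix's child sends are not shown in this thread.
READY_LINE = "sandbox initialized"


def read_line(fd: int) -> str:
    """Read from fd up to (not including) a newline.

    Raises EOFError if the stream ends before a newline arrives.
    """
    buf = bytearray()
    while True:
        chunk = os.read(fd, 1)
        if not chunk:
            # Every writer has closed its end: no line can arrive anymore.
            raise EOFError("unexpected EOF reading a line")
        if chunk == b"\n":
            return buf.decode()
        buf += chunk


def handshake(child_writes_marker: bool) -> str:
    """Fork a child and wait for its readiness line over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: optionally announce readiness, then exit right away.
        os.close(read_fd)
        if child_writes_marker:
            os.write(write_fd, (READY_LINE + "\n").encode())
        os._exit(0)
    # Parent: close its copy of the write end so that EOF is observable
    # once the child exits.
    os.close(write_fd)
    try:
        return read_line(read_fd)
    finally:
        os.close(read_fd)
        os.waitpid(pid, 0)
```

Calling handshake(True) returns the marker line, while handshake(False) raises EOFError with the same text as the installer log. In the bug described above the child did write its message, but macOS appears to have discarded it, so the parent would have ended up in the same EOF branch.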
Ericson2314 pushed a commit to Ericson2314/nix that referenced this issue Oct 31, 2023
Hopefully this fixes "unexpected EOF" failures on macOS
(NixOS#3137, NixOS#3605, NixOS#7242, NixOS#7702).

(cherry picked from commit c536e00)