Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NAR tests on Linux+ZFS+normalize #11609

Merged
merged 2 commits into from
Oct 7, 2024
Merged

Conversation

yannham
Copy link
Contributor

@yannham yannham commented Sep 27, 2024

Closes #11595.

Motivation & context

A test added recently checks that when trying to deserialize a NAR with two files that Unicode-normalize to the same result either succeeds on Linux, or fails with an "already exists" error on Darwin. However, failing with an "already exists" error can in fact also happen on Linux, when using ZFS with the proper utf8 and Unicode normalization options set.

This commit fixes the issue by not assuming the behavior from the current system, but just by blindly checking that either one of the two aforementioned possibilities happen, whether on Darwin or on Linux.

This has been tested on a Linux+ZFS+normalize runner here: https://github.com/tweag/nickel/actions/runs/11073351502/job/30769680803?pr=2037

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label Sep 27, 2024
Copy link
Contributor

@Gerg-L Gerg-L left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works on my machine with normalization set to formD lgtm

@puckipedia
Copy link
Contributor

There's a bit of an issue with this actually; UTF-8 normalisation means that nix builds done on this system may inherently end up with a different NAR hash than those done on a system without normalisation. This may also present as nar files from cache.nixos not having the same nar hash.

@tomberek
Copy link
Contributor

There's a bit of an issue with this actually; UTF-8 normalisation means that nix builds done on this system may inherently end up with a different NAR hash than those done on a system without normalisation. This may also present as nar files from cache.nixos not having the same nar hash.

Is that an issue with this test? or with NAR deserialization?

@yannham
Copy link
Contributor Author

yannham commented Sep 30, 2024

The thing is, the test as it was originally added (before this PR) precludes the installation and usage of Nix on a Linux system with ZFS+normalize (excluding work-arounds like disabling checks). This didn't use to be the case.

What is the consequence the NAR hashes mismatch? Unless the answer is that the mismatch can lead to correctness issues and is severe enough to imply that Nix shouldn't be used on such a configuration, at first sight this is a separate issue from this PR that just tries to restore a previously working combination.

Out of curiosity, why isn't this a problem for Darwin? I suppose because all Darwin have the same behavior with respect to unicode normalization while Linux can have both?

@Mic92
Copy link
Member

Mic92 commented Sep 30, 2024

Search for the use-case-hack nix option, that is used as a mitigation. I would also say that fixing this test is independent from having this option enabled or using these machines as a builder for a binary cache.

@roberth
Copy link
Member

roberth commented Sep 30, 2024

To summarize the underlying issue:

  • You may need a similar hack to the case hack in order to use such a store transparently as a vessel for deployments to stores that do not have this normalization.
  • This is also an insidious build impurity (and those are a fact of life)

For now, let's make the tests pass.

We could gather related ideas in

# `utf8only` enabled and `normalization` set to anything else than `none`).
# The deserialization should succeed on most Linux.
#
# We test that it either succeeds or fails with "already exists" error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we determine the properties of the $TEST_ROOT's file system by creating these semi-duplicate files using other means? That way can at least still test that the Nix behavior matches another program's behavior, and for example catch a bad change that causes Nix to normalize on its own accord on a file system where it shouldn't.

Something like

touch $TEST_ROOT/poll-unicode-â
touch $TEST_ROOT/poll-unicode-â
n=$(wc -l $(ls $TEST_ROOT | grep poll-unicode-))
normalizing=$((n == 1))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine the second touch would error out, in the same way as nix-store --restore. But we can still test that touch and nix-store --restore have the same result, at least 👍

@tomberek tomberek added the idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. label Sep 30, 2024
@yannham yannham force-pushed the fix/nar-test-zfs branch 3 times, most recently from 10c8ea8 to 43bafd9 Compare September 30, 2024 14:24
A test added recently checks that when trying to deserialize a NAR with
two files that Unicode-normalize to the same result either succeeds on
Linux, or fails with an "already exists" error on Darwin. However,
failing with an "already exists" error can in fact also happen on Linux,
when using ZFS with the proper utf8 and Unicode normalization options
set.

This commit fixes the issue by not assuming the behavior from the
current system, but just by blindly checking that either one of the two
aforementioned possibilities happen, whether on Darwin or on Linux.

Additionally, we check that the Unicode normalization behaviour of
nix-store is the same as the host file system.
@yannham
Copy link
Contributor Author

yannham commented Sep 30, 2024

I implemented @roberth 's suggestion in the latest force push, tested again on a Linux+ZFS+utf8 normalization runner. I also changed the tests of the return code in the if-then-else from [[ $code == n ]] to (( code == n )), which I believe is more appropriate.

@yannham yannham requested a review from roberth October 2, 2024 14:11
@yannham
Copy link
Contributor Author

yannham commented Oct 7, 2024

Dear maintainers, is there anything else to be done here? It's probably not a very important change but I fear that if it doesn't move forward it's going to be forgotten for ever 😅 pinging @roberth once again (sorry if inappropriate)

Copy link
Contributor

@Gerg-L Gerg-L left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works on my system with formD

tests/functional/nars.sh Outdated Show resolved Hide resolved
Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have written the conditional on touchFilesCount instead, but it'd be equivalent. 👍

@roberth roberth enabled auto-merge October 7, 2024 12:54
@roberth roberth merged commit 26c3fc1 into NixOS:master Oct 7, 2024
11 checks passed
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-10-07-nix-team-meeting-minutes-184/54046/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The nar.sh checks fail for some Linux machines
8 participants