Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize discard-references #8322

Merged
merged 1 commit into from
Aug 7, 2023

Conversation

thufschmitt
Copy link
Member

Motivation

It has been there for a few releases now (landed in 2.14.0), doesn't
seem to cause any major issue and is wanted in a few places
(#7087 (comment)).

Context

The xp feature got introduced in #7087

Checklist for maintainers

Maintainers: tick if completed or explain if not relevant

  • agreed on idea
  • agreed on implementation strategy
  • tests, as appropriate
    • functional tests - tests/**.sh
    • unit tests - src/*/tests
    • integration tests - tests/nixos/*
  • documentation in the manual
  • documentation in the internal API docs
  • code and comments are self-explanatory
  • commit message explains why the change was made
  • new feature or incompatible change: updated release notes

Priorities

Add 👍 to pull requests you find important.

@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label May 12, 2023
@thufschmitt thufschmitt mentioned this pull request May 12, 2023
2 tasks
@edolstra
Copy link
Member

I'm not in favor of stabilizing this. The objections I gave in #7087 (comment) and #7087 (comment) still exist, namely that this is a potentially dangerous loosening of Nix's closure invariant. Having it as an experimental feature that people have to opt into makes it clear they're doing something potentially dangerous, and avoids people starting to abuse this (e.g. in Nixpkgs to have optional dependencies or whatever).

@alyssais
Copy link
Member

What solution do you suggest to the problem of disk images with spurious references? Spectrum builds are composed of several layers of disk images (e.g. VM images are built, and then put into a root filesystem image), and when I substitute one of those, Nix will separately, pointlessly, download all the paths included in the image before downloading the image itself, which contains its own copies of all those paths.

@ghost
Copy link

ghost commented May 12, 2023

I don't think this should be stabilized.

The Nix store database's reference graph is just a cache which can be discarded and reconstructed from the store itself. Stabilizing this feature destroys that property.

What solution do you suggest to the problem of disk images with spurious references?

Obfuscate your disk image. Compression, rot13, morse code, an invertible version of nuke-refs -- whatever you like.

Something somewhere has to bear the burden of remembering that a particular outpath is special and exempt from reference-scanning. Stabilizing this feature shifts that burden from the special outpath to Nix itself. This is a lossy conversion: the "specialness bit" is not recoverable from the outpath.

@ghost
Copy link

ghost commented May 12, 2023

Nix will separately, pointlessly, download all the paths included in the image before downloading the image itself,

When one outpath is just a concatenation or tarring up of a bunch of other outpaths, I use preferLocalBuild and allowSubstitutes=false to prevent redundant downloads. This does require careful architecting, but I find that it pays off in the long run.

@thufschmitt
Copy link
Member Author

The Nix store database's reference graph is just a cache which can be discarded and reconstructed from the store itself. Stabilizing this feature destroys that property.

No, it isn't. The reference scanning only detects whatever is in the closure of the derivation so you can embed (nearly) arbitrary references that won't be picked by Nix by hardcoding them properly. For example

$ nix build nixpkgs#hello

and

$ nix store add-path $(nix build --no-link --print-out-paths nixpkgs#hello)

have the exact same content, but a different set of references.

You can even implement a (very inefficient) pure Nix version of discardRefs with a bit of IFD: https://gist.github.com/thufschmitt/b8936f2184c68f4b7c3ea0fe294b64e2

Having it as an experimental feature that people have to opt into makes it clear they're doing something potentially dangerous, and avoids people starting to abuse this

Well, that's not what experimental features are made for, is it? They are meant as a temporary way for trying out a feature, not a permanent gatekeeping measure

@ghost
Copy link

ghost commented May 15, 2023

The reference scanning only detects whatever is in the closure of the derivation

That distinction mattered back when Nix lacked a sandbox... In an alternate (better?) universe where Nix had a sandbox from the very beginning we might never have needed string contexts.

$ nix store add-path $(nix build --no-link --print-out-paths nixpkgs#hello)

I wouldn't object to changes which break this pathological example, and in fact I happen to have one in mind.

Well, that's not what experimental features are made for, is it?

If you'd like to rename experimental to unstable I would 👍 that.

@alyssais
Copy link
Member

alyssais commented May 15, 2023 via email

@thufschmitt
Copy link
Member Author

I wouldn't object to changes which break this pathological example, and in fact I happen to have one in mind.

That one is a bit contrived indeed, but just nixpkgs already contains several hundreds of literal store paths in its source:

$ grep -R -E '/nix/store/[a-z0-9]{32}-' $(nix eval --raw nixpkgs#path) | wc -l
314

That being said, what's the change you have in mind?

That distinction mattered back when Nix lacked a sandbox... In an alternate (better?) universe where Nix had a sandbox from the very beginning we might never have needed string contexts.

I'm not sure what you mean by that. What's the link with string contexts?

@7c6f434c
Copy link
Member

$ grep -R -E '/nix/store/[a-z0-9]{32}-' $(nix eval --raw nixpkgs#path) | wc -l
314

So, let's breakdown this 314.

nixpkgs/nixos/modules/installer/tools/nix-fallback-paths.nix

(used in nixos-rebuild to fetch from cache by path)

nixpkgs/pkgs/development/python-modules/blspy/dont_fetch_dependencies.patch patching literal paths already referenced as fixed-output source.

magic with various representations of the same store path for the empty file in nixpkgs/nixos/tests/systemd-escaping.nix

JSON files about fetching which mention the store path used only for updating not for evaluation.

Various kinds of documentation.

eeee, 0000, xxxx, … fake paths

Overall: effectively, there is one true instance, two more questionable instances, and 311 cases that are not actually used in evaluation. I could have missed one more semi-true instance?

So the impact of changing the behaviour here is absolutely manageable

@thufschmitt
Copy link
Member Author

Overall: effectively, there is one true instance, two more questionable instances, and 311 cases that are not actually used in evaluation. I could have missed one more semi-true instance?

Maybe I misunderstood what you meant here, but them being used in evaluation or not doesn't change anything. My point was simply that if we were to refscan nixpkgs assuming that everything that looks like a store path in it is indeed a reference, then we're in trouble. And that the literal store path references in Nix store paths aren't the source of truth for a good reason.

@7c6f434c
Copy link
Member

My point was simply that if we were to refscan nixpkgs assuming that everything that looks like a store path in it is indeed a reference, then we're in trouble.

If we scan store paths searching for things identical to other present store paths, we will actually do pretty OK.

@thufschmitt
Copy link
Member Author

But that wouldn't be deterministic, right? something like writeText "foo" "${builtins.unsafeDiscardStringContext hello.outPath}" would be different depending on whether hello is in store or not

@7c6f434c
Copy link
Member

7c6f434c commented May 17, 2023 via email

@ghost
Copy link

ghost commented Jun 8, 2023

you'd have to make sure you never used a substituter that was configured to discard references.

It's part of the Nix trust model that you have to trust your substituters to faithfully realize the derivations you send to them, including much worse things like not injecting malware into outpaths.

Do any substituters existing today corrupt the references in the way you describe? I would consider this a serious bug in the substituter!

Something somewhere has to bear the burden of remembering that a particular outpath is special and exempt from reference-scanning. Stabilizing this feature shifts that burden from the special outpath to Nix itself. This is a lossy conversion: the "specialness bit" is not recoverable from the outpath.

Here's another possibility: represent the "ignore references" bit explicitly in the outpath. It's ugly, but I suppose a magic file in nix-support would work.

The trickiest part about this approach is keeping all the different parts in sync, since the "discard references" decision comes from the derivation but is also represented in the outpath. IMHO the best solution would be to eliminate unsafeDiscardReferences and instead let the builder decide whether or not references should be discarded by creating or not creating the magic file:

postFixup = ''
  touch $out/nix-support/discard-references-${builtins.hashString "sha256" "discard-references"}
'';

Then there is no synchronization problem, since the derivation no longer has a unsafeDiscardReferences bit.

@ghost
Copy link

ghost commented Jun 8, 2023

I find it easiest to understand Nix's worldview using the analogy to the Boehm-Demers-Weiser Garbage Collector from section 3.4 of @edolstra's thesis.

Using that analogy, discard-references corresponds to GC_MALLOC_ATOMIC, also known as the set of "scan-blacklisted pages".

@ncfavier
Copy link
Member

ncfavier commented Jun 8, 2023

IMHO the best solution would be to eliminate unsafeDiscardReferences and instead let the builder decide whether or not references should be discarded by creating or not creating the magic file:

I agree and initially suggested this early in the other thread, but it didn't catch on.

@ghost
Copy link

ghost commented Jun 9, 2023

I agree and initially suggested this early in the other thread, but it didn't catch on.

Hrm, that seems pretty complicated...

What I'm suggesting is much simpler:

  1. a boolean (scan or not) instead of the builder dictating the set of references
  2. put the boolean in the outpath instead of (not in addition to) putting it in the .drv

@ncfavier
Copy link
Member

ncfavier commented Jun 9, 2023

2. instead of (not in addition to)

That is what I meant

@alyssais
Copy link
Member

alyssais commented Jun 9, 2023

2. put the boolean in the outpath instead of (not in addition to) putting it in the .drv

That wouldn't work for file store paths, just directories.

@thufschmitt
Copy link
Member Author

This was discussed in the Nix team meeting last week. We agreed on merging it so I'll fix the conflicts and merge.

Discussion log - @edolstra: Still concerned about optional references like people abusing this to build pluggable derivations that would have undeclared reference to other store paths (and would behave differently depending on the presence of these) - But the fact that it's all-or-nothing (can't selectively discard only a subset of the derivations) makes this much less dangerous in that regard - @edolstra: Let's stabilize this - Agreement

It has been there for a few releases now (landed in 2.14.0), doesn't
seem to cause any major issue and is wanted in a few places
(NixOS#7087 (comment)).
@thufschmitt thufschmitt merged commit 4999f42 into NixOS:master Aug 7, 2023
8 checks passed
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-07-31-nix-team-meeting-minutes-76/31486/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
with-tests Issues related to testing. PRs with tests have some priority
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants