Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdenv/check-meta: Use bespoke type checking #273935

Merged
merged 1 commit into from
Jan 2, 2024

Conversation

adisbladis
Copy link
Member

Description of changes

Aka checkMeta goes brrr.

Using the module system type checking works OK & generates good error messages. The performance of using it however is terrible because of the value merging it does being very allocation heavy.

By implementing a very minimal type checker we can drastically improve the performance when nixpkgs is evaluated with checkMeta = true.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@adisbladis adisbladis requested a review from a user December 13, 2023 09:01
@github-actions github-actions bot added the 6.topic: stdenv Standard environment label Dec 13, 2023
@adisbladis adisbladis force-pushed the lib-meta-typechecks-go-brrr branch 3 times, most recently from 2b879cf to 479f994 Compare December 13, 2023 09:08
@adisbladis
Copy link
Member Author

Performance deltas in percent with checkMeta = true:

{
  "cpuTime": -37.616098264488,
  "envs": {
    "bytes": -60.48647877463932,
    "elements": -58.70869721305459,
    "number": -61.640492652489094
  },
  "gc": {
    "heapSize": -13.504585796123479,
    "totalBytes": -47.230913237101554
  },
  "list": {
    "bytes": -46.429331283519076,
    "concats": -1.955389471033186,
    "elements": -46.429331283519076
  },
  "nrAvoided": -56.492678214267194,
  "nrFunctionCalls": -60.23075973485585,
  "nrLookups": -70.16279883133012,
  "nrOpUpdateValuesCopied": -0.0003359492599912528,
  "nrOpUpdates": -0.009558341420273564,
  "nrPrimOpCalls": -56.55747152917036,
  "nrThunks": -58.26076383934929,
  "sets": {
    "bytes": -25.267461404168813,
    "elements": -20.308650382976552,
    "number": -62.95486381264912
  },
  "sizes": {
    "Attr": 0.0,
    "Bindings": 0.0,
    "Env": 0.0,
    "Value": 0.0
  },
  "symbols": {
    "bytes": 0.0,
    "number": 0.0
  },
  "values": {
    "bytes": -55.43333546966257,
    "number": -55.43333546966257
  }
}
  • before
{
  "cpuTime": 1077.489013671875,
  "envs": {
    "bytes": 49689445008,
    "elements": 2444846098,
    "number": 1883167264
  },
  "gc": {
    "heapSize": 57147367424,
    "totalBytes": 207745890128
  },
  "list": {
    "bytes": 4860777344,
    "concats": 56454789,
    "elements": 607597168
  },
  "nrAvoided": 2000169294,
  "nrFunctionCalls": 1704271589,
  "nrLookups": 1213447900,
  "nrOpUpdateValuesCopied": 1525527992,
  "nrOpUpdates": 71037429,
  "nrPrimOpCalls": 849863362,
  "nrThunks": 2155122280,
  "sets": {
    "bytes": 45198154992,
    "elements": 2496413102,
    "number": 328471585
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 16,
    "Value": 24
  },
  "symbols": {
    "bytes": 2199594,
    "number": 160362
  },
  "values": {
    "bytes": 62812041168,
    "number": 2617168382
  }
}
  • after
{
  "cpuTime": 672.1796875,
  "envs": {
    "bytes": 19634049400,
    "elements": 1009508805,
    "number": 722373685
  },
  "gc": {
    "heapSize": 49429852160,
    "totalBytes": 109625609008
  },
  "list": {
    "bytes": 2603950928,
    "concats": 55350878,
    "elements": 325493866
  },
  "nrAvoided": 870220091,
  "nrFunctionCalls": 677775863,
  "nrLookups": 362058891,
  "nrOpUpdateValuesCopied": 1525522867,
  "nrOpUpdates": 71030639,
  "nrPrimOpCalls": 369202133,
  "nrThunks": 899531578,
  "sets": {
    "bytes": 33777728624,
    "elements": 1989425293,
    "number": 121682746
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 16,
    "Value": 24
  },
  "symbols": {
    "bytes": 2199594,
    "number": 160362
  },
  "values": {
    "bytes": 27993231672,
    "number": 1166384653
  }
}

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux labels Dec 13, 2023
@roberth
Copy link
Member

roberth commented Dec 13, 2023

Good thing these checks were off by default.


As mentioned in discussions with @piegamesde in earlier threads, I'm open to the idea of adding such checks to the module system types. Conceptually they are checks on the output of the type, and normally they don't have to run because merge has the responsibility to return a correct instance of the output type.
cc @infinisil wdyt?

@roberth
Copy link
Member

roberth commented Dec 13, 2023

Good thing these checks were off by default.

I suspect we'll want to keep them off by default, but what's the performance loss of enabling checks, with this PR?

@infinisil
Copy link
Member

As mentioned in discussions with @piegamesde in earlier threads, I'm open to the idea of adding such checks to the module system types. Conceptually they are checks on the output of the type, and normally they don't have to run because merge has the responsibility to return a correct instance of the output type.
cc @infinisil wdyt?

I think that could work. The types of the module system don't seem to be specific to the module system, so it could make sense to make them usable outside of it. It should be properly verified in some way though. Assuming we have some type.deepCheck value that implements that, it should return the same result as lib.modules.mergeDefinitions [ ... ] type [ { value = value; } ]. Ideally deepCheck should give equally good error messages and share as much code as possible.

@adisbladis
Copy link
Member Author

Good thing these checks were off by default.

I suspect we'll want to keep them off by default, but what's the performance loss of enabling checks, with this PR?

I ran the numbers on the overhead of checkMeta = true after this PR:

Performance deltas in percent comparing checkMeta = true to checkMeta = false.

{
  "cpuTime": -8.944505920796743,
  "envs": {
    "bytes": -7.666594079213567,
    "elements": -6.213269713964095,
    "number": -8.684306618052958
  },
  "gc": {
    "heapSize": -7.122003986862282,
    "totalBytes": -3.077962182592728
  },
  "list": {
    "bytes": -0.9441081113695446,
    "concats": -0.05202889011270884,
    "elements": -0.9441081113695446
  },
  "nrAvoided": -7.925363737034942,
  "nrFunctionCalls": -9.244428476735564,
  "nrLookups": -4.37264879220136,
  "nrOpUpdateValuesCopied": -0.0593690080884528,
  "nrOpUpdates": -0.030386631784779183,
  "nrPrimOpCalls": -17.34110259872277,
  "nrThunks": -3.972594962074794,
  "sets": {
    "bytes": -0.13406614190063237,
    "elements": -0.128016009634365,
    "number": -0.2329881140299932
  },
  "sizes": {
    "Attr": 0.0,
    "Bindings": 0.0,
    "Env": 0.0,
    "Value": 0.0
  },
  "symbols": {
    "bytes": -0.0006364507143814535,
    "number": -0.0012471004913550132
  },
  "values": {
    "bytes": -3.3347500916452617,
    "number": -3.3347500916452475
  }
}
  • checkMeta = false
{
  "cpuTime": 641.2516479492188,
  "envs": {
    "bytes": 18012960072,
    "elements": 941940635,
    "number": 654839687
  },
  "gc": {
    "heapSize": 49446887424,
    "totalBytes": 106000972848
  },
  "list": {
    "bytes": 2542967952,
    "concats": 55338509,
    "elements": 317870994
  },
  "nrAvoided": 800921767,
  "nrFunctionCalls": 610322643,
  "nrLookups": 317217530,
  "nrOpUpdateValuesCopied": 1525159574,
  "nrOpUpdates": 71032681,
  "nrPrimOpCalls": 303927522,
  "nrThunks": 861588432,
  "sets": {
    "bytes": 33745374416,
    "elements": 1987647838,
    "number": 121438063
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 16,
    "Value": 24
  },
  "symbols": {
    "bytes": 2199685,
    "number": 160370
  },
  "values": {
    "bytes": 27011663736,
    "number": 1125485989
  }
}
  • checkMeta = true
{
  "cpuTime": 704.24267578125,
  "envs": {
    "bytes": 19508605680,
    "elements": 1004343186,
    "number": 717116262
  },
  "gc": {
    "heapSize": 53238538240,
    "totalBytes": 109367255616
  },
  "list": {
    "bytes": 2567205144,
    "concats": 55367316,
    "elements": 320900643
  },
  "nrAvoided": 869861451,
  "nrFunctionCalls": 672490551,
  "nrLookups": 331722594,
  "nrOpUpdateValuesCopied": 1526065584,
  "nrOpUpdates": 71054272,
  "nrPrimOpCalls": 367688817,
  "nrThunks": 897231818,
  "sets": {
    "bytes": 33790676272,
    "elements": 1990195607,
    "number": 121721660
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 16,
    "Value": 24
  },
  "symbols": {
    "bytes": 2199699,
    "number": 160372
  },
  "values": {
    "bytes": 27943509960,
    "number": 1164312915
  }
}

@adisbladis
Copy link
Member Author

As mentioned in discussions with @piegamesde in earlier threads, I'm open to the idea of adding such checks to the module system types. Conceptually they are checks on the output of the type, and normally they don't have to run because merge has the responsibility to return a correct instance of the output type.

That aside, I don't want this PR to delve into a discussion around how the module system types could be improved.

@adisbladis
Copy link
Member Author

I've generalized this into a reusable type system: https://github.com/adisbladis/korora.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/korora-a-tiny-fast-type-system-for-nix-in-nix/36900/1

@adisbladis
Copy link
Member Author

I'm a bit miffed at the response (or lack thereof), which I suspect may stem from a small misunderstanding of the impact:

Good thing these checks were off by default.

While these checks are "off by default", OfBorg runs a pass with these checks enabled so this should mean significantly better PR throughput, and much less OfBorg wait times.

@roberth
Copy link
Member

roberth commented Dec 18, 2023

I'm a bit miffed at the response (or lack thereof) [...]

Good thing these checks were off by default.

That's not at all what I intended. I really just made the observation, from my perspective.

which I suspect may stem from a small misunderstanding of the impact:

I rarely feel blocked by ofborg, because I tend to have a lot of work in flight anyway. That's a probably a bias of mine, but that doesn't even mean that I don't like this PR or anything. I'm really glad that you've found this!

Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the PR ok from a library perspective.
I'd prefer to just resolve the tech debt right away, but I can accept it this way.
For the rest of the review, I have to defer to @piegamesde and/or others.

pkgs/stdenv/generic/meta-types.nix Outdated Show resolved Hide resolved
@piegamesde
Copy link
Member

I am okay with the changes. While I generally am also for reducing tech debt where possible, I acknowledge that this touches on a lot of parts of Nixpkgs where I honestly don't think it's easily possible.

One small question though: union takes a list of types, yet like either it is mostly instantiated with exactly two types. What would be the performance impact of adding a special case there too? (IIRC the big module system has either for two types and oneOf for more. I could live with union and union2 or union and either, or even just union where the special casing happens internally and transparently.)

@infinisil
Copy link
Member

While these checks are "off by default", OfBorg runs a pass with these checks enabled so this should mean significantly better PR throughput, and much less OfBorg wait times.

Oh thanks for pointing this out, this definitely isn't obvious!

Considering that, I'm generally okay with a PR like this. I do think this new type system should be indicated to be very internal though. We don't want third-parties to import the file and start depending on it.

Copy link
Member

@infinisil infinisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor stuff, overall it looks good to me!

pkgs/stdenv/generic/check-meta.nix Outdated Show resolved Hide resolved
pkgs/stdenv/generic/check-meta.nix Show resolved Hide resolved
pkgs/stdenv/generic/check-meta.nix Show resolved Hide resolved
pkgs/stdenv/generic/meta-types.nix Outdated Show resolved Hide resolved
Aka `checkMeta` goes brrr.

Using the module system type checking works OK & generates good error messages.
The performance of using it however is terrible because of the value merging it does being very allocation heavy.

By implementing a very minimal type checker we can drastically improve the performance when nixpkgs is evaluated with `checkMeta = true`.
@adisbladis
Copy link
Member Author

One small question though: union takes a list of types, yet like either it is mostly instantiated with exactly two types. What would be the performance impact of adding a special case there too? (IIRC the big module system has either for two types and oneOf for more. I could live with union and union2 or union and either, or even just union where the special casing happens internally and transparently.)

I haven't measured it, but I think it should have ~0 performance impact:
The list of the union is allocated when the type is created and then re-used.
The only difference then would be whether the "compiled" type would execute as

t1: t2: v: t1.verify v || t2.verify v

or

t1: t2: let
  union' = [ t1 t2 ];
in v: any (t: t.verify v) union'

@adisbladis
Copy link
Member Author

I'll self-merge this soon unless someone else feels obliged.

@roberth roberth merged commit af62e8e into NixOS:master Jan 2, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: stdenv Standard environment 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants