Add a nixpkgs_packageset rule #49

Open
thufschmitt opened this issue Nov 27, 2018 · 11 comments
Labels
P3 minor: not priorized type: feature request

Comments

@thufschmitt
Contributor

Since tweag/rules_haskell#442, the evaluation of the workspace of a rules_haskell-enabled repository takes an insane amount of time because it requires nix-build-ing every transitive haskell dependency, which easily means more than 200 nix-build calls.
We could speed things up a lot by first nix-instantiate-ing a big JSON file mapping each Haskell package to its .drv, and then realizing these pre-instantiated derivations only as needed. This would avoid the cost of evaluating nixpkgs 200 times (which is by far the bottleneck when everything is already built).

This trick can probably be used for more than just the haskellPackages case, so I propose we add a nixpkgs_packages_set rule to rules_nixpkgs which would be used like:

nixpkgs_packages_set(
  name = "haskellPackages",
  repository = "@nixpkgs",
  base_attribute_path = "haskellPackages",
)

This would generate @haskellPackages-base, @haskellPackages-streaming,
@haskellPackages-foobar, … (and possibly one @haskellPackages with aliases
in it for all of these).

Internally this rule would

  1. nix-instantiate a json file of the form

    {
      "base": "/nix/store/…-base.drv",
      "streaming": "/nix/store/…-streaming.drv",
      "foobar": "/nix/store/…-foobar.drv",
      
    }
  2. Load this into a starlark dict one way or another

  3. For each element of this dict, generate a call to a rule which would nix-store --realize the corresponding derivation
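
Steps 2 and 3 above can be sketched roughly as follows. This is plain Python rather than Starlark (the dict handling carries over, but a real repository rule would go through Bazel's own APIs), the helper names `repo_name`, `realise_command` and `plan` are hypothetical, and the store paths are placeholders rather than real derivations.

```python
import json

def load_drv_map(json_text):
    """Parse the instantiated JSON into a {package: drv_path} dict."""
    return json.loads(json_text)

def repo_name(set_name, package):
    """Name of the generated external repository, e.g. haskellPackages-base."""
    return "{}-{}".format(set_name, package)

def realise_command(drv_path):
    """Command line that realises one pre-instantiated derivation."""
    return ["nix-store", "--realise", drv_path]

def plan(set_name, json_text):
    """Map each generated repository name to the command that realises it."""
    drvs = load_drv_map(json_text)
    return {repo_name(set_name, pkg): realise_command(drv)
            for pkg, drv in sorted(drvs.items())}

# Placeholder paths: the real hashes come from nix-instantiate.
EXAMPLE = """
{
  "base": "/nix/store/PLACEHOLDER-base.drv",
  "streaming": "/nix/store/PLACEHOLDER-streaming.drv"
}
"""

if __name__ == "__main__":
    for name, cmd in plan("haskellPackages", EXAMPLE).items():
        print("@" + name, "->", " ".join(cmd))
```

Nothing is built when the plan is computed; each command would only run when the corresponding repository is actually requested.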

All this would require extending the scope of nixpkgs a bit (or doing things in an ad-hoc way; I'm not sure which is the better choice) to add

  • A nix_instantiate rule that imports the result of nix-instantiate
  • A nix_realize rule which would wrap nix-store --realize

Thoughts?

@mboes
Member

mboes commented Nov 27, 2018

I think this is a good idea. Though to be useful the json file you mention should be checked into the source, right? So that even the initial checkout isn't too slow? @zimbatm pointed out to me yesterday that this would be similar to Yarn lock files. And it so happens that @aehlig proposed to do something similar here and here.

@thufschmitt
Contributor Author

I didn't think of checking it in. I think it would be a benefit even without doing that.

Some really quick tests I've done indicate that for haskellPackages, this approach starts being faster than nix-build-ing each one individually with 30 packages and is almost 6x faster with 200 packages (on my machine, nix-build handles roughly 100 packages/min while nix-instantiate takes ~15s and the calls to nix-store --realise are negligible).
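
As a back-of-the-envelope check, the quoted rates can be plugged into a simple linear model. This is only a sketch: it assumes nix-build cost grows linearly at 100 packages/min, that nix-instantiate is a flat 15s, and that realisation costs nothing, and the function names are made up for illustration.

```python
def nix_build_seconds(n_packages, packages_per_minute=100.0):
    """Time for one nix-build call per package at the quoted rate."""
    return n_packages * 60.0 / packages_per_minute

def batched_seconds(n_packages, instantiate_seconds=15.0, realise_seconds_each=0.0):
    """One nix-instantiate call plus (assumed negligible) realise calls."""
    return instantiate_seconds + n_packages * realise_seconds_each

def break_even_packages(packages_per_minute=100.0, instantiate_seconds=15.0):
    """Package count at which both approaches take the same time."""
    return instantiate_seconds * packages_per_minute / 60.0

def speedup(n_packages):
    """How many times faster the batched approach is under this model."""
    return nix_build_seconds(n_packages) / batched_seconds(n_packages)

if __name__ == "__main__":
    print("break-even:", break_even_packages(), "packages")
    print("speedup at 200 packages:", speedup(200))
```

Under these assumptions the break-even lands at 25 packages, close to the roughly 30 reported, while the model gives 8x at 200 packages rather than the measured "almost 6x", so the real runs evidently include overhead that this model leaves out.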

Checking that into the repo would make it almost instantaneous, at the cost of having to handle a generated file.

@zimbatm

zimbatm commented Nov 28, 2018

@regnat how many calls to nix-instantiate are being made? Theoretically you could get all your attributes in one call by concatenating the -A attrpath on a single nix-instantiate.

Pardon my bazel:

nixpkgs_instantiate(
  name = "myPackages",
  repository = "@nixpkgs",
  attributes = [
    "haskellPackages.ghc",
    "hello"
  ]
)

Then I imagine that this rule would have multiple outputs, one for each attribute.

This approach should be faster even on a fresh install with anything > 2 packages.

@Profpatsch
Contributor

Profpatsch commented Nov 28, 2018

Theoretically you could get all your attributes in one call by concatenating the -A attrpath on a single nix-instantiate.

Can we assume the output is stable and is going to stay so over nix releases?

$ nix-instantiate -A hello -A binutils -A ghc
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
/nix/store/258wysx7s44xf90i2ww5h54h8745blym-hello-2.10.drv
/nix/store/s4n633q0lmqm70f22k3chp8kkn4nsql9-binutils-wrapper-2.30.drv
/nix/store/12bkksv14ns4p1xga5vw7wkvpj9kmzvn-ghc-8.4.4.drv

$ nix-instantiate -A ghc -A hello -A binutils
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
/nix/store/12bkksv14ns4p1xga5vw7wkvpj9kmzvn-ghc-8.4.4.drv
/nix/store/258wysx7s44xf90i2ww5h54h8745blym-hello-2.10.drv
/nix/store/s4n633q0lmqm70f22k3chp8kkn4nsql9-binutils-wrapper-2.30.drv

I’d go a step further and make this possible:

foo.nix

with import ./nixpkgs {};
{ 
  # attrset of packages
  pythonPackages = { inherit (pythonPackages) a b c d e; };
  # one derivation
  bundledGhc = haskellPackages.ghcWithPackages (h: with h; [ lens aeson ]);
}

WORKSPACE

nixpkgs_repository_cache(
  name = "foo_cache",
  repository = ":foo.nix",
  cached_attributes = [
    # can reference the attribute set here
    "pythonPackages",
    "bundledGhc",
  ],
)

nixpkgs_package(
  name = "python-a",
  # every .drv from the attribute set is accessible from the cache
  attribute_path = "pythonPackages.a",
)

So users can reference whole attribute sets (recursively?) and it will cache all subderivations.
We can also follow hydra and only recurse into attribute sets marked with recurseIntoAttrs. The code generating the list of all attributes can be written in nix and output with nix-instantiate --eval. If you want I can whip it up.
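
To make the recursion rule concrete, here is a small Python model of it (the real thing would be written in Nix, as suggested). It relies on two facts about nixpkgs: derivations are attribute sets with `type = "derivation"`, and `recurseIntoAttrs` sets `recurseForDerivations = true`. The function names and the placeholder store paths are made up for illustration.

```python
def is_derivation(value):
    """Nix marks derivations with type = "derivation"."""
    return isinstance(value, dict) and value.get("type") == "derivation"

def collect_drv_paths(attrs, prefix=()):
    """Flatten a nested attribute set into {"a.b.c": drv_path}.

    The set passed in is always walked; nested sets are only entered
    when they carry recurseForDerivations = True.
    """
    found = {}
    for name, value in sorted(attrs.items()):
        path = prefix + (name,)
        if is_derivation(value):
            found[".".join(path)] = value["drvPath"]
        elif isinstance(value, dict) and value.get("recurseForDerivations", False):
            found.update(collect_drv_paths(value, path))
    return found

EXAMPLE = {
    "bundledGhc": {"type": "derivation", "drvPath": "/nix/store/PLACEHOLDER-ghc.drv"},
    "pythonPackages": {
        "recurseForDerivations": True,
        "a": {"type": "derivation", "drvPath": "/nix/store/PLACEHOLDER-a.drv"},
    },
    "unmarked": {
        "b": {"type": "derivation", "drvPath": "/nix/store/PLACEHOLDER-b.drv"},
    },
}

if __name__ == "__main__":
    for attr_path, drv in collect_drv_paths(EXAMPLE).items():
        print(attr_path, "->", drv)
```

In this model `unmarked.b` is never visited, which is the point of following Hydra's convention: only sets the user has explicitly opted in get evaluated.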

@zimbatm

zimbatm commented Nov 28, 2018

Sounds great!

I think the nixpkgs_package would have to take a cache name as input to establish the link(?)

You probably know already, but nix-instantiate has a --json output that might come in handy.

--- EDIT ---

Oops forgot to answer:

Can we assume the output is stable and is going to stay so over nix releases?

Since nix doesn't give any other output mapping capability than the ordering I would be the first to complain if that started to break.
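
Relying on the ordering could look something like the sketch below: build one command with repeated -A flags, then pair the printed .drv paths with the requested attributes by position. The function names are hypothetical, and the sample output is the one pasted earlier in this thread (the warning goes to stderr, so only store paths are expected on stdout).

```python
def instantiate_argv(attributes):
    """Single nix-instantiate command line with one -A per attribute."""
    argv = ["nix-instantiate"]
    for attr in attributes:
        argv += ["-A", attr]
    return argv

def pair_by_position(attributes, stdout_text):
    """Map each attribute to the store path printed at the same position."""
    paths = [line.strip() for line in stdout_text.splitlines()
             if line.strip().startswith("/")]
    if len(paths) != len(attributes):
        raise ValueError("expected {} paths, got {}".format(len(attributes), len(paths)))
    return dict(zip(attributes, paths))

SAMPLE_STDOUT = """\
/nix/store/258wysx7s44xf90i2ww5h54h8745blym-hello-2.10.drv
/nix/store/s4n633q0lmqm70f22k3chp8kkn4nsql9-binutils-wrapper-2.30.drv
/nix/store/12bkksv14ns4p1xga5vw7wkvpj9kmzvn-ghc-8.4.4.drv
"""

if __name__ == "__main__":
    attrs = ["hello", "binutils", "ghc"]
    print(" ".join(instantiate_argv(attrs)))
    for attr, drv in pair_by_position(attrs, SAMPLE_STDOUT).items():
        print(attr, "->", drv)
```

The length check is the only safety net here: if an attribute printed no path or more than one, the positional mapping would silently shift without it.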

@thufschmitt
Contributor Author

how many calls to nix-instantiate are being made?

Only one: I run nix-instantiate --json --eval --strict --read-write-mode haskellPackagesToJson.nix, where haskellPackagesToJson.nix contains

with import <nixpkgs> {};

let
  evaluateElement = x:
    let result = builtins.tryEval (x.drvPath or null); in
    if result.success then result.value else null;
in
(builtins.mapAttrs (_: evaluateElement) haskellPackages)
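
On the consuming side, the JSON produced by this expression contains null for every attribute that has no drvPath or fails to evaluate, so those entries need to be dropped before anything is realised. A rough Python sketch, assuming the command line quoted above; `split_results` and `instantiate_all` are hypothetical names, and `run` is injectable so the logic can be exercised without Nix installed:

```python
import json
import subprocess

# Flags as quoted in this comment; the file name is the one used above.
ARGV = [
    "nix-instantiate", "--json", "--eval", "--strict",
    "--read-write-mode", "haskellPackagesToJson.nix",
]

def split_results(json_text):
    """Separate usable {package: drv_path} entries from null ones."""
    mapping = json.loads(json_text)
    usable = {name: drv for name, drv in mapping.items() if drv is not None}
    skipped = sorted(name for name, drv in mapping.items() if drv is None)
    return usable, skipped

def instantiate_all(run=subprocess.check_output):
    """Evaluate the whole package set once and parse the result."""
    return split_results(run(ARGV).decode("utf-8"))

if __name__ == "__main__":
    usable, skipped = instantiate_all()
    print(len(usable), "derivations,", len(skipped), "skipped")
```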

@Profpatsch If I understand correctly, you suggest saving the .drvs for the whole of nixpkgs at once? I've considered that, but since just doing so for the haskellPackages set already takes 15s, I fear that's gonna be too slow to be usable in practice.

@mboes
Member

mboes commented Nov 29, 2018

Conceivably, Hydra could generate that for all of Nixpkgs though. Then we don't even need to check in anything. @zimbatm isn't that what you have set up in some private Hydra instance?

@thufschmitt
Contributor Author

Unless I'm mistaken, the time taken by nix-instantiate is the time needed to evaluate the Nix expression, which isn't something that Hydra can cache.

@Profpatsch
Contributor

Profpatsch commented Nov 29, 2018

If I understand correctly you suggest saving the .drvs for the whole of nixpkgs at once

Nope, that would stumble over packages marked as broken (run nix-instantiate '<nixpkgs>' to see an example), and it would also take way too long, as you said. I suggest users project the packages they need via a Nix expression and then cache that.

@zimbatm

zimbatm commented Dec 2, 2018

@Profpatsch's approach seems the best, as it's independent of any infrastructure requirements. For example, if users want to provide their own overrides, they wouldn't be able to fetch things from the public Hydra anyway.

@thufschmitt thufschmitt mentioned this issue Dec 4, 2018
3 tasks
@thufschmitt
Contributor Author

I have a POC implementation in #50; feel free to take a look and criticize it.


5 participants