NixOS images #786

Closed
adamcstephens opened this issue Sep 5, 2023 · 32 comments · Fixed by #806

Comments

@adamcstephens
Contributor

Hello. I've recently improved the NixOS support for building LXC container images and LXD virtual machine images. I originally investigated using distrobuilder to do so, but it was mostly trying to fit a square peg into a round hole. Nix already provides all the necessary build infrastructure that works with NixOS, and distrobuilder makes way too many assumptions about building a typical Linux distro that do not apply to Nix.

My question here is, would it be possible to consume the images that the Nix CI builds so that they can be served through the linuxcontainers image server? Effectively the builder would only need to retrieve the latest tarballs/images that are already being produced.

It looks like NixOS support was requested way back in #38 and the guidance was to use distrobuilder, but I'm hopeful we can find a compromise that doesn't strictly follow that guidance. Thanks!

@stgraber
Member

stgraber commented Sep 5, 2023

It's not impossible but it's definitely not our preference.

What's the issue with distrobuilder exactly?
There are already quite a few images that are pretty thin repacks performed by distrobuilder: a mostly complete rootfs is downloaded from an existing source and validated, and then only some config files are put in place or, in more extreme cases, just the image metadata files.

@adamcstephens
Contributor Author

NixOS expects and enforces that configuration is produced by the defined system configuration. For example all of /etc/systemd is symlinks to read-only files in the Nix store, /nix/store/..... So even the act of generating configuration should be done through NixOS tooling and not externally. None of the distrobuilder assumptions around package installation, network configuration, systemd overrides, etc, work with how NixOS expects the system to be configured. Distrobuilder also makes many assumptions about the existence of /usr, /sbin, /lib, that do not exist on a NixOS system.
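To make the symlink point concrete, here is a minimal Go sketch that resolves a path and reports whether it ends up under /nix/store. The helper name `inNixStore` and the default path are illustrative assumptions, not part of distrobuilder or nixpkgs; on a non-NixOS host the default path would simply resolve outside the store.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// inNixStore reports whether an already-resolved path lives under /nix/store.
// Hypothetical helper for illustration only.
func inNixStore(resolved string) bool {
	return resolved == "/nix/store" || strings.HasPrefix(resolved, "/nix/store/")
}

func main() {
	// Paths to inspect come from the command line; /etc/systemd is the
	// example named in the comment above.
	paths := os.Args[1:]
	if len(paths) == 0 {
		paths = []string{"/etc/systemd"}
	}
	for _, p := range paths {
		// EvalSymlinks follows every symlink in the path.
		resolved, err := filepath.EvalSymlinks(p)
		if err != nil {
			fmt.Printf("%s: %v\n", p, err)
			continue
		}
		fmt.Printf("%s -> %s (in store: %v)\n", p, resolved, inNixStore(resolved))
	}
}
```

A tool that writes to a path whose resolved target is in the store would be trying to modify read-only content, which is the mismatch described above.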

I asked for some guidance a year ago on discourse but ended up not getting any feedback. One thing I remember (and see in my posts from the time) is that the container images we produce through the NixOS CI aren't even chrootable until this script snippet executes on first boot.

These are just some of the challenges and are due to NixOS being fundamentally incompatible with assumptions that are safe on almost any other distro.

@adamcstephens
Contributor Author

Even if distrobuilder were extended to support building NixOS images, all of the relevant configuration would still need to live in the nixpkgs repository. NixOS configuration is declarative. So, for example, if a service were defined in the configuration used during image building but was not present when users went to modify the NixOS configuration, it would be removed on the next rebuild because it would no longer be part of the new desired configuration.

@stgraber
Member

stgraber commented Sep 5, 2023

Do you have a functional NixOS image that will work through lxc image import?

@adamcstephens
Contributor Author

Yes, I have images building for containers and virtual machines on both x86_64 and aarch64. For example here's the latest x86_64 container. If we want to replace the metadata, that's fine with me.

root: https://hydra.nixos.org/job/nixos/trunk-combined/nixos.lxdContainerImage.x86_64-linux/latest/download-by-type/file/system-tarball
metadata: https://hydra.nixos.org/job/nixos/trunk-combined/nixos.lxdContainerMeta.x86_64-linux/latest/download-by-type/file/system-tarball

@stgraber
Member

@adamcstephens can you test:

And see if that works for you?

If it does, then I have a basic distrobuilder source and YAML that will pull your image and generate an equivalent metadata to yours.

@stgraber
Member

distrobuilder:

diff --git a/shared/definition.go b/shared/definition.go
index 84eb3dc..c8294f5 100644
--- a/shared/definition.go
+++ b/shared/definition.go
@@ -382,6 +382,7 @@ func (d *Definition) Validate() error {
                "rockylinux-http",
                "vyos-http",
                "slackware-http",
+               "nixos-http",
        }
 
        if !shared.StringInSlice(strings.TrimSpace(d.Source.Downloader), validDownloaders) {
diff --git a/sources/nixos-http.go b/sources/nixos-http.go
new file mode 100644
index 0000000..0ad3fbb
--- /dev/null
+++ b/sources/nixos-http.go
@@ -0,0 +1,28 @@
+package sources
+
+import (
+       "fmt"
+       "path/filepath"
+
+       "github.com/lxc/distrobuilder/shared"
+)
+
+type nixos struct {
+       common
+}
+
+func (s *nixos) Run() error {
+       tarballURL := fmt.Sprintf("https://hydra.nixos.org/job/nixos/trunk-combined/nixos.lxdContainerImage.%s-linux/latest/download-by-type/file/system-tarball", s.definition.Image.ArchitectureMapped)
+
+       fpath, err := s.DownloadHash(s.definition.Image, tarballURL, "", nil)
+       if err != nil {
+               return fmt.Errorf("Failed downloading tarball: %w", err)
+       }
+
+       err = shared.Unpack(filepath.Join(fpath, "system-tarball"), s.rootfsDir)
+       if err != nil {
+               return fmt.Errorf("Failed unpacking rootfs: %w", err)
+       }
+
+       return nil
+}
diff --git a/sources/source.go b/sources/source.go
index 9e2e3b9..906ec7a 100644
--- a/sources/source.go
+++ b/sources/source.go
@@ -36,6 +36,7 @@ var downloaders = map[string]func() downloader{
        "fedora-http":          func() downloader { return &fedora{} },
        "funtoo-http":          func() downloader { return &funtoo{} },
        "gentoo-http":          func() downloader { return &gentoo{} },
+       "nixos-http":           func() downloader { return &nixos{} },
        "openeuler-http":       func() downloader { return &openEuler{} },
        "opensuse-http":        func() downloader { return &opensuse{} },
        "openwrt-http":         func() downloader { return &openwrt{} },

YAML:

image:
  distribution: nixos
  release: Tapir

source:
  downloader: nixos-http

targets:
  lxc:
    create_message: |
      You just created an {{ image.description }} container.

    config:
    - type: all
      after: 4
      content: |-
        lxc.include = LXC_TEMPLATE_CONFIG/common.conf

    - type: user
      after: 4
      content: |-
        lxc.include = LXC_TEMPLATE_CONFIG/userns.conf

    - type: all
      content: |-
        lxc.arch = {{ image.architecture_kernel }}

packages:
  custom_manager:
    clean:
      cmd: true
    install:
      cmd: true
    remove:
      cmd: true
    refresh:
      cmd: true
    update:
      cmd: true

files:
  - name: conf-hostname
    path: /etc/nixos/lxd.nix
    generator: template
    content: |-
      { lib, config, pkgs, ... }:

      # WARNING: THIS CONFIGURATION IS AUTOGENERATED AND WILL BE OVERWRITTEN AUTOMATICALLY

      {
        networking.hostName = "{{ instance.name }}";
      }
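
As a rough illustration of how the downloader's URL is assembled, the sketch below rebuilds the Hydra "latest" URLs quoted earlier in this thread from a job name and a Nix architecture. `hydraLatestURL` is a hypothetical helper, not a distrobuilder function, and the aarch64 URLs are only assumed to follow the same `%s-linux` pattern used in the diff.

```go
package main

import "fmt"

// hydraBase is copied from the URLs quoted in this thread.
const hydraBase = "https://hydra.nixos.org/job/nixos/trunk-combined"

// hydraLatestURL is a hypothetical helper (not part of distrobuilder) that
// returns the "latest" download URL for a Hydra job and a Nix architecture
// such as "x86_64" or "aarch64".
func hydraLatestURL(job, arch string) string {
	return fmt.Sprintf("%s/nixos.%s.%s-linux/latest/download-by-type/file/system-tarball",
		hydraBase, job, arch)
}

func main() {
	for _, arch := range []string{"x86_64", "aarch64"} {
		// Root filesystem tarball and metadata tarball for each architecture.
		fmt.Println(hydraLatestURL("lxdContainerImage", arch))
		fmt.Println(hydraLatestURL("lxdContainerMeta", arch))
	}
}
```

In the diff, only the container image URL is fetched; the metadata is generated by distrobuilder from the YAML instead of being downloaded.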

@stgraber
Member

Merged support for the image above, should show up on image servers within a day or so.

@stgraber
Member

@adamcstephens so we have a problem with hydra.nixos.org: it has an IPv6 DNS record, but its web server only listens on IPv4. This prevents our builders from connecting to it.

Any chance you can get whoever runs that web server to have it listen on IPv6?
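
One way to reproduce this kind of mismatch is to resolve the host, group the addresses by family, and try a TCP connection to each one. The Go sketch below does that with the standard `net` package; the helper names, the port (443), and the 5 second timeout are assumptions chosen for illustration, not what the builders actually run.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// splitFamilies sorts textual IP addresses into IPv4 and IPv6 groups.
// Strings that do not parse as an IP address are ignored.
func splitFamilies(addrs []string) (v4, v6 []string) {
	for _, a := range addrs {
		ip := net.ParseIP(a)
		switch {
		case ip == nil:
			continue
		case ip.To4() != nil:
			v4 = append(v4, a)
		default:
			v6 = append(v6, a)
		}
	}
	return v4, v6
}

// canConnect reports whether a TCP connection to addr:port opens within timeout.
func canConnect(addr, port string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(addr, port), timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: v6check <hostname>")
		return
	}
	addrs, err := net.LookupHost(os.Args[1])
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	v4, v6 := splitFamilies(addrs)
	for _, a := range append(v4, v6...) {
		fmt.Printf("%-40s port 443 reachable: %v\n", a, canConnect(a, "443", 5*time.Second))
	}
}
```

A host that publishes an IPv6 address but only accepts connections over IPv4 would show its IPv6 entries as unreachable, provided the machine running the check has working IPv6 connectivity itself.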

@adamcstephens
Contributor Author

Yes, I’ll look into it. Thanks.

@adamcstephens
Contributor Author

This is a known issue. NixOS/infra#284

Some ideas are being floated on Matrix, and hopefully we can have a resolution soon.

@adamcstephens
Contributor Author

IPv6 to hydra.nixos.org is now working.

@stgraber
Member

@adamcstephens looks like we do have a build, however it's failing our tests and so won't publish:
https://jenkins.linuxcontainers.org/job/test-image/2179/architecture=amd64/console

@adamcstephens
Contributor Author

Is the priv test for a privileged container? I suspect this image doesn't support both unprivileged and privileged.

@stgraber
Member

Yeah, it's for security.privileged=true which typically requires some version of our systemd override to work properly.

https://github.com/lxc/distrobuilder/blob/main/distrobuilder/main.go#L609

I guess we'd need to make sure the systemd generator is also included in the image as even for unprivileged containers, the lack of those systemd overrides is going to cause problems for some services.

@adamcstephens
Contributor Author

The generator is problematic on NixOS as it makes assumptions that don't apply, so we've ported most of the functionality to NixOS modules. Unfortunately, it requires setting a NixOS configuration during image build in order to enable the equivalent systemd overrides. I could configure a separate image to build for privileged, but that seems less than ideal.

The generator on NixOS shouldn't try writing to /etc at all, so runtime config would need to be limited to making changes in /run. I'd be happy to submit some proposed changes if you're interested.

Is there a need for the generator script to be embedded in go code? Or could we move it to its own file so it can be consumed directly?

(Finally can we re-open this? We still have some more steps to go until it's complete)

@adamcstephens
Contributor Author

Here's how we're currently accomplishing the major overrides of the generator. I've read through the generator multiple times in the past, but I can't guarantee this is identical. The problem currently in CI is surely the privileged container option at the top, which is off by default.

https://github.com/NixOS/nixpkgs/blob/ea0dcd0ae14b99c5740acc7a1b874ea4446cb5be/nixos/modules/virtualisation/lxc-container.nix#L76-L99

stgraber reopened this Oct 19, 2023

@stgraber
Member

@adamcstephens where does one set cfg.privilegedContainer?

I could have our image testing script set that option prior to starting the instance which should then take care of the issue.

@adamcstephens
Contributor Author

adamcstephens commented Oct 19, 2023

The only way to change this option on the same image would be to:

  1. Boot the container
  2. Modify /etc/nixos/configuration.nix and add virtualisation.lxc.privilegedContainer = true; inside the returned attrset
  3. Run nixos-rebuild switch

The necessary configuration file isn't created until first boot, and its source is the read-only nix store which should be considered an immutable set of folders in the image. (Obviously it's an image and they could be modified, but doing so would undermine the principles of Nix/NixOS)

@stgraber
Member

Is that what we'd expect a user wanting a privileged NixOS container to do?

@adamcstephens
Contributor Author

No, I'd rather the container dynamically respond to the configuration of lxc/incus. But the current state is how it was when I started working on this. I don't use privileged containers personally so haven't been motivated to fix it.

If the generator script was in a dedicated file (or a subcommand existed to just write the file somewhere), that would allow me to patch it and add it to the images we're building. Resulting images then would be closer to those generated by distrobuilder.

Alternatively I could rewrite the script completely only for NixOS, but doing so would ensure future changes to the generator require manual updating.

I know this is probably a bit foreign if you're not familiar with NixOS. Directories that are commonly used on other distros are either missing (/lib, /usr) or should be treated as read-only (/etc). Configuration is applied to the system from a configuration file, but everything (packages, config files, etc.) lives in the read-only store. So even something like a systemd generator needs to be configured up front in the configuration that is applied (or in this case built into a tarball). I'm happy to provide more detailed information, but unless you want to learn more I'm not trying to trouble you with all the details. :) It has been a long-term project of mine to improve lxc, lxd and now incus support for NixOS. Getting these images published in the main image server would be a huge win since users would be able to launch one easily, though NixOS users can already consume and build their own.

@stgraber
Member

The initial image has now been published as images:nixos. It's tagged with a restriction to only run it in unprivileged containers, though that restriction still needs to land: lxc/incus#182

I'll have to look at the distrobuilder codebase to extract the systemd generator into a standalone file and then have Go embed its content back into the binary. We started doing that for something else in Incus, so it shouldn't be too hard to do here too.

@stgraber
Member

lxc/distrobuilder#779

@adamcstephens
Contributor Author

I'm still making progress on this, and have NixOS/nixpkgs#264929 open. It will need to wait a couple weeks due to the current breaking change freeze for the pending 23.11 NixOS release.

@adamcstephens
Contributor Author

The container images are now using the generator. I'm not seeing where you put in the tagging for requirements.privileged, so I can't tell if there's anything that needs to be done to allow the tests to run again.

fwiw, we are testing the image's generator logic now too. https://github.com/NixOS/nixpkgs/blob/5c1d2b0d06241279e6a70e276397a5e40e867499/nixos/tests/incus/container.nix#L79-L105

@stgraber
Member

The check is in test-image in this repo, I'll push a commit to remove it.

@stgraber
Member

Done!

@stgraber
Member

@adamcstephens are we good to close this issue now?

@adamcstephens
Contributor Author

I've opened the two linked PRs, lxc/distrobuilder#798 and #800. This should finalize container support. I had to add back your guard, but targeted to release 23.11.

The only remaining issue on my list is VM support. We build qcow2 images for these as well. These should have different configurations than the containers, so I'd prefer to serve these directly. This seems to conflict with the distrobuilder model which wants to use a single tarball for both containers and VMs. We can close this as unresolved unless you have some ideas on how to proceed with the VM images.

@stgraber
Member

@adamcstephens for those we should be able to cheat, basically have the Jenkins job download the matching .qcow2 directly from you and then move on with the build.

That way the rootfs and metadata will come from distrobuilder but the prebuilt VM image will come directly from you and just be picked up as if distrobuilder built it.

It's not the cleanest thing in the world, but that will still result in distrobuilder-built metadata which should give us what we want here.

@adamcstephens
Contributor Author

@stgraber can we remove the nixos/current images from the imageserver? They're not being generated anymore with my update to the names.

We should have the disk image for VMs available for easy consumption from hydra soon.

@stgraber
Member

Ah yeah, they'll disappear on their own during auto cleanup
