This is the declarative deployment of the Holochain CI infrastructure. All hosts run either Linux or macOS.
The Linux hosts are managed via NixOS and the macOS hosts are managed via nix-darwin.
For making changes to the NixOS configuration files, please refer to the NixOS manual.
For making changes to the macOS configuration files, please refer to the nix-darwin manual.
This repository uses nix flakes. To interact with it, add the experimental features flakes and nix-command to your ~/.config/nix/nix.conf:
experimental-features = flakes nix-command
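The setting can also be added from the shell. The following is a minimal sketch and not part of this repository: the function name enable_flakes is hypothetical, and it assumes the per-user config path mentioned above unless another path is passed as the first argument.

```shell
# Hypothetical helper: append the experimental-features line to a nix.conf,
# unless the file already sets experimental-features.
enable_flakes() {
  conf="${1:-$HOME/.config/nix/nix.conf}"
  mkdir -p "$(dirname "$conf")"
  touch "$conf"
  if grep -q '^experimental-features' "$conf"; then
    # Do not guess how to merge with an existing setting.
    echo "experimental-features already set in $conf; edit it by hand" >&2
  else
    echo 'experimental-features = flakes nix-command' >> "$conf"
  fi
}
```

Calling enable_flakes without arguments targets ~/.config/nix/nix.conf. If the file already has an experimental-features line, the helper leaves it untouched so that existing settings are not overwritten.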
Flakes have a standardized output schema, for which a good overview exists in the nixos wiki.
Before getting started, it is always a good idea to inspect the outputs of the current project:
nix flake show
Notice the field nixosConfigurations, which lists all hosts managed by this repo.
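To print only the host names instead of the full output tree, nix eval can list the attribute names of nixosConfigurations. This is a sketch: the wrapper name list_hosts and the NIX override variable are hypothetical and exist only to allow a dry run.

```shell
# Hypothetical helper: print the names defined under nixosConfigurations as JSON.
# Run from the repository root. Set NIX=echo to print the command instead of running it.
list_hosts() {
  "${NIX:-nix}" eval ".#nixosConfigurations" --apply builtins.attrNames --json
}
```

The flake reference is quoted so that shells which treat # specially do not need the backslash escape.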
To change the definition of some attribute seen in nix flake show, adapt the files under ./modules/flake-parts. The file and directory names seen in the flake-parts directory are similar to the flake output names.
Example: The definition for nixosConfigurations.linux-builder-01 is located at ./modules/flake-parts/nixosConfigurations.linux-builder-01.
Code shared between hosts should be factored out into one or more nixos modules located under ./modules/nixos/.
This project splits up its nix code into modules in order to improve maintainability and portability of the code.
There are two kinds of modules:
- nixos modules: located under ./modules/nixos
  - contain configuration for linux or macos hosts.
- flake-parts modules: located under ./modules/flake-parts
  - export entities like packages, apps, or machine configurations via the flake outputs, the project's public API so to speak.
  - are responsible for everything seen in the output of nix flake show.
A new flake module can be added by creating a new file under ./modules/flake-parts (or a new directory containing a default.nix file).
A template for initializing new modules is located under ./modules/flake-parts/_template.nix.
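Starting a module from the template amounts to copying it under a new name. The sketch below assumes it is run from the repository root; the function name new_flake_module and its optional second argument are hypothetical.

```shell
# Hypothetical helper: start a new flake-parts module from the template.
# $1 is the module name without ".nix"; $2 optionally overrides the modules directory.
new_flake_module() {
  name="$1"
  dir="${2:-./modules/flake-parts}"
  target="$dir/$name.nix"
  if [ -z "$name" ]; then
    echo "usage: new_flake_module <name> [modules-dir]" >&2
    return 1
  fi
  # Refuse to clobber an existing file module or directory module.
  if [ -e "$target" ] || [ -e "$dir/$name" ]; then
    echo "refusing to overwrite existing module: $name" >&2
    return 1
  fi
  cp "$dir/_template.nix" "$target"
  echo "$target"
}
```

Because flakes in a git repository only see tracked files, the new file needs a git add before it shows up in nix flake show.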
For more information on how to write flake modules, visit flake.parts.
After making changes to the configuration files of a host, a flake app must be executed in order to apply the changes to that host.
In the output of nix flake show, notice the apps prefixed with ssh-, git-push-, and deploy-.
Prerequisites:
- all relevant changes are committed to the current branch.
- git push access to holochain/holochain-infra
- authorized key for the deployUser on the remote host
The first command will push the current git HEAD to the origin git remote at a branch specific to the hostname.
The second command will cause a nixos-rebuild switch ... on the host from its branch.
nix run .\#git-push-{hostname}
nix run .#deploy-{hostname}
These scripts also accept arguments for rudimentary customization.
The following example pushes to the git remote called upstream, and then runs a build (instead of a switch) on the remote host:
nix run .\#git-push-sbd-0_main_infra_holo_host upstream
nix run .\#deploy-sbd-0_main_infra_holo_host build
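The push and deploy steps can be chained so that the deploy only runs if the push succeeds. This is a sketch rather than a script shipped with this repository: the function name deploy_host and the NIX override variable are hypothetical, and the optional arguments are passed through in the same positions as in the example above.

```shell
# Hypothetical wrapper around the git-push-{hostname} and deploy-{hostname} apps.
# Set NIX=echo to print the commands instead of running them.
deploy_host() {
  host="$1"
  remote="${2:-}"   # optional git remote, passed to git-push-{hostname}
  action="${3:-}"   # optional rebuild action, passed to deploy-{hostname}
  nix_cmd="${NIX:-nix}"
  if [ -z "$host" ]; then
    echo "usage: deploy_host <hostname> [remote] [action]" >&2
    return 1
  fi
  # ${var:+...} expands to nothing when the variable is empty, so the
  # apps fall back to their own defaults.
  "$nix_cmd" run ".#git-push-$host" ${remote:+"$remote"} &&
    "$nix_cmd" run ".#deploy-$host" ${action:+"$action"}
}
```

For example, deploy_host sbd-0_main_infra_holo_host upstream build mirrors the two commands above, while deploy_host linux-builder-01 uses the defaults of both apps.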
nix flake update
If a runner appears offline in the runners settings page, we can inspect and potentially restart its service.
In the root of this repo:
nix run .\#ssh-linux-builder-01
# on the remote host, we can inspect the status of a runner that appears problematic on GitHub, e.g.
systemctl status github-runner-multi-arch-0.service
Before debugging further, it is worth restarting one runner, or all of them as shown below:
systemctl restart github-runner-multi-arch-*.service
If a runner appears offline in the runners settings page, SSH into the host with nix run .\#ssh-linux-builder-01 and check journalctl -b0 --unit "github*" -n 200.
If the log shows that the runner can't start because the runner software is deprecated, try the following steps.
(optional) Update the flake.nix by pointing nixpkgs and home-manager.url at the latest stable release branch. These two must match since home-manager also references nixpkgs.
Update the flake.lock based on your changes:
nix flake lock --update-input nixpkgsGithubActionRunners
If that works, check that everything builds successfully on the builder
nix run .#deploy-linux-builder-01 build
If that also works then ask nixos to apply these changes without updating the default profile
nix run .#deploy-linux-builder-01 test
If the runners are back online then update the default profile
nix run .#deploy-linux-builder-01 switch
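The build and test stages above can be run in sequence, stopping at the first failure. This is a sketch: the function name staged_deploy and the NIX override variable are hypothetical. The final switch is deliberately left to the operator, since it depends on checking that the runners are back online.

```shell
# Hypothetical helper: run the build and test stages in order and stop at
# the first failure. Set NIX=echo to print the commands instead of running them.
staged_deploy() {
  host="$1"
  nix_cmd="${NIX:-nix}"
  for stage in build test; do
    echo "==> $stage on $host" >&2
    "$nix_cmd" run ".#deploy-$host" "$stage" || {
      echo "stage '$stage' failed; not continuing" >&2
      return 1
    }
  done
  # The switch is not automated: confirm the runners are online first.
  echo "If the runners are back online, run: nix run .#deploy-$host switch" >&2
}
```

For example, staged_deploy linux-builder-01 runs the build and test commands shown above and then prints the switch command as a reminder.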
- [Adding a new machine]
Double-check that all TURN/Signal servers are ready:
nix run .#turn-readiness-check
If BIND reports the error message
journal rollforward failed: journal out of sync with zone
it can be fixed by removing the journal file of the affected zone.
After connecting via SSH (nix run .\#ssh-dweb-reverse-tls-proxy):
systemctl stop bind
rm /etc/bind/zones/infra.holochain.org.zone.jnl
systemctl start bind