Magic Castle recipes for various workshops
The directory `common` contains most of the Magic Castle configuration in the form of a `common.tf` file, which defines the typical `openstack` and `dns` modules. It does so, however, in a way that is customizable through Terraform locals. The directory also contains a `config.yaml` file with useful defaults that don't need to be redefined in specific workshop directories. Finally, it contains a `sshkeys.pub` file with public SSH keys that will be injected into the resulting clusters.
The directory `example` contains an example cluster configuration in which many of the parameters are redefined.
Other directories, named after a course code, are for specific workshops. Each of these directories must contain a symbolic link to `../common/common.tf` (for example, created with `ln -s ../common/common.tf common.tf` from within the course directory), which pulls in the configuration as defined in that file. Make sure that this is a symbolic link, and not a copy, or your configuration will diverge.
The `main.tf` file that you create for the course should have the following structure:
```hcl
locals {
  name = "course_code"
  custom = {
    # parameters which you want to customize
  }
}
```
In most cases, the only Terraform locals you will want to customize are the following:
Parameter | Description | Default value |
---|---|---|
`nnode_cpu` | Number of static CPU nodes in the cluster | 2 |
`nnode_cpupool` | Maximum number of CPU nodes that can be booted dynamically by Slurm | 0 |
`nnode_gpu` | Number of static GPU nodes in the cluster | 0 |
`nnode_gpupool` | Maximum number of GPU nodes that can be booted dynamically by Slurm | 0 |
`home_size` | Size of the /home filesystem in GB | 20 |
`project_size` | Size of the /project filesystem in GB | 20 |
`scratch_size` | Size of the /scratch filesystem in GB | 20 |
A typical configuration might therefore be as simple as:
```hcl
locals {
  name = "example"
  custom = {
    nnode_cpu     = 5
    nnode_gpupool = 2
    home_size     = 50
  }
}
```
In some rare cases, you may want to customize the following local variables:
Parameter | Description | Default value |
---|---|---|
`image` | Default image to use for login and management VMs | Rocky-8 |
`image_cpu` | Default image to use for CPU node VMs | snapshot-cpunode-2024-R810.4 |
`image_gpu` | Default image to use for GPU node VMs | snapshot-gpunode-2024-R810.4 |
`nnode_compute` | Number of CPU nodes of a second type | 0 |
`config_version` | Version of puppet-magic_castle to use | see `common.tf` |
`gpu_mig_config` | MIG configuration used for static GPU nodes (for Béluga-cloud only) | `{ "3g.20gb" = 1, "2g.10gb" = 1, "1g.5gb" = 2 }` |
`gpupool_mig_config` | MIG configuration used for dynamically booted GPU nodes (for Béluga-cloud only) | `{ "1g.5gb" = 7 }` |
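As an illustration, a sketch of such a customization might look like this (the image name and MIG layout below are placeholders for this example, not values taken from `common.tf`):

```hcl
locals {
  name = "course_code"
  custom = {
    image          = "Rocky-9"        # placeholder image name, substitute one available in your cloud project
    nnode_gpu      = 1
    gpu_mig_config = { "1g.5gb" = 7 } # illustrative MIG layout (Béluga-cloud only)
  }
}
```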
The `common.tf` file also defines a map of flavours of virtual machines to be used, for Arbutus and Béluga. This map can be completely or partially redefined. The default values are:
```hcl
instances_type_map = {
  arbutus = {
    mgmt         = "p8-12gb"
    login        = "p4-6gb"
    cpu          = "c8-30gb-186-avx2"
    cpupool      = "c8-30gb-186-avx2"
    compute_node = "p8-12gb"
    gpu          = "g1-8gb-c4-22gb"
    gpupool      = "g1-8gb-c4-22gb"
  }
  beluga = {
    mgmt         = "p4-7.5gb"
    login        = "p4-7.5gb"
    cpu          = "c8-60gb"
    cpupool      = "c8-60gb"
    compute_node = "p8-15gb"
    gpu          = "gpu32-240-3450gb-a100x1"
    gpupool      = "gpu32-240-3450gb-a100x1"
  }
}
```
For example, if you simply want to change the type of CPU node for Béluga, you could define:
```hcl
locals {
  ...
  custom = {
    instances_type_map = {
      beluga = {
        cpu = "c4-30gb"
      }
    }
  }
}
```
If the above customizations are not enough, you can also completely redefine the `instances` or the `volumes` variable. See here for an example.
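For reference, the `instances` map follows the upstream Magic Castle format. The sketch below is adapted from the Magic Castle documentation; the types, tags, and counts are illustrative rather than values used in this repository:

```hcl
instances = {
  mgmt  = { type = "p4-6gb", tags = ["puppet", "mgmt", "nfs"], count = 1 }
  login = { type = "p2-3gb", tags = ["login", "public", "proxy"], count = 1 }
  node  = { type = "p2-3gb", tags = ["node"], count = 2 }
}
```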
Any parameter specified in the `config.yaml` file inside the course directory will override the default value that may be defined in the `common` directory. For a sampling of some useful Puppet parameters, see the `config.yaml` in the `example` folder. For an exhaustive list, see the Magic Castle documentation.
If you use a dynamic pool of nodes, you will want to configure a Terraform variable named `pool`, of HCL type, with the value `[]`.
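If you set the variable in code rather than through a web interface, a minimal declaration could look like the following (the description text is an assumption):

```hcl
variable "pool" {
  description = "Slurm pool of compute nodes" # assumed wording
  default     = []
}
```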
The name of the cluster will be based on the `name` variable as defined in the `locals`. If you want, you can define a Terraform variable named `suffix`, which will be appended to the `name`. This may be useful if you want to start two clusters, one on Arbutus and one on Béluga-cloud, using the same configuration.
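A minimal declaration might look like this (the description and the empty default are assumptions; `common.tf` is expected to handle the actual appending):

```hcl
variable "suffix" {
  description = "Suffix appended to the cluster name" # assumed wording
  default     = ""
}
```

You could then set it per run, for instance with `terraform apply -var suffix=-beluga`.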
One can validate and lint the YAML configuration files locally using `yamllint` with the provided configuration and hook. The hook runs before a commit operation and checks the files that are staged for commit.
- Install `yamllint` locally with your preferred method (`pip`, system package manager, ...). For installation instructions, see the yamllint quickstart.
- Copy the hook and ensure it is executable:

  ```bash
  cd $(git rev-parse --show-toplevel)
  cp -v .templates/hooks/pre-commit .git/hooks/
  ```

  or create a symbolic link:

  ```bash
  cd $(git rev-parse --show-toplevel)
  ln -rs .templates/hooks/pre-commit .git/hooks/pre-commit
  ```
A GitHub Action runs on every push or pull request that modifies a YAML file. The different runs and their details can be viewed for a particular commit or under the Actions tab.