-
-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Package container images with the self-extracting installer for The Combine #3240
Conversation
- add container images for helm charts to the install package - cert-manager - NGINX ingress controller - The Combine - add a `--net-install` to create installer without the images
# Conflicts: # deploy/requirements.txt # dev-requirements.txt # maintenance/requirements.txt
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #3240 +/- ##
==========================================
- Coverage 74.64% 74.58% -0.06%
==========================================
Files 279 279
Lines 10683 10683
Branches 1289 1289
==========================================
- Hits 7974 7968 -6
- Misses 2349 2353 +4
- Partials 360 362 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
cert-manager is not needed for NUCs or the offline development and it is installed by LTOps for the QA and Production environments. It is only used for development clusters, e.g. on Rancher Desktop or Docker Desktop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 34 files at r1, 2 of 9 files at r2, all commit messages.
Reviewable status: 10 of 35 files reviewed, 4 unresolved discussions (waiting on @jmgrady)
deploy/ansible/roles/helm_install/defaults/main.yml
line 2 at r2 (raw file):
--- helm_version: v3.15.2
Is helm_version
still necessary here, now that it's defined in deploy/ansible/vars/k3s_versions.yml
? It'd be a shame to have to keep the version number updated in both places.
installer/make-combine-installer.sh
line 39 at r2 (raw file):
if [ -z "${COMBINE_VERSION}" ] ; then echo "COMBINE_VERSION is not set." exit 1
Could use the new error
function here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 35 files reviewed, 3 unresolved discussions (waiting on @Github-advanced-security[bot], @imnasnainaec, and @jmgrady)
deploy/ansible/roles/helm_install/defaults/main.yml
line 2 at r2 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Is
helm_version
still necessary here, now that it's defined indeploy/ansible/vars/k3s_versions.yml
? It'd be a shame to have to keep the version number updated in both places.
Similar to the subcharts, the helm_version
is specified here so that the role does not depend on the playbook that calls it. This is a default value and the playbook chooses to override it. I added the k3s_versions.yml
to ensure that the same version is used in the helm_install
and container_images
roles. I can add a helm_version
to the defaults for container_images
. (Probably not the solution you wanted).
The other option would be to just decide that our roles are not reusable in other environments since we are not positioned to use them in other projects.
deploy/scripts/helm_utils.py
line 32 at r1 (raw file):
Previously, github-advanced-security[bot] wrote…
Clear-text storage of sensitive information
This expression stores sensitive data (secret) as clear text.
This expression stores sensitive data (secret) as clear text.
This expression stores sensitive data (secret) as clear text.
Done.
deploy/scripts/helm_utils.py
line 38 at r1 (raw file):
Previously, github-advanced-security[bot] wrote…
Clear-text logging of sensitive information
This expression logs sensitive data (secret) as clear text.
This expression logs sensitive data (secret) as clear text.
Done.
installer/make-combine-installer.sh
line 39 at r2 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Could use the new
error
function here.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 35 files reviewed, 6 unresolved discussions (waiting on @Github-advanced-security[bot] and @jmgrady)
deploy/scripts/setup_combine.py
line 123 at r3 (raw file):
if args.list_targets: for target in config["targets"].keys(): print(f" {target}")
Now that logging is added to this file, do you want to replace all the print()
statements?
deploy/scripts/setup_files/cluster_config.yaml
line 10 at r3 (raw file):
rancher: - rancher-ui cert-mgr:
We don't abbreviate cert-manager
as cert-mgr
anywhere else, and the three places in our code with certmgr
I'm not even sure are still relevant.
installer/make-combine-installer.sh
line 53 at r3 (raw file):
python -m piptools sync requirements.txt # Package the so that the Combine can be installed "offline"
Package the ??? so that ....
(and again in another comment below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 34 files at r1, 2 of 9 files at r2, all commit messages.
Dismissed @Github-advanced-security[bot] from 2 discussions.
Reviewable status: 17 of 35 files reviewed, 7 unresolved discussions (waiting on @jmgrady)
deploy/scripts/install-combine.sh
line 183 at r3 (raw file):
sleep 1 done while ! ip link show cni0 > /dev/null 2>&1 ; do
Are "flannel.1" and "cni0" enduring enough to hard-code into this function, or should they be passed as an argument array (or default arguments) for a more general function?
deploy/scripts/kube_env.py
line 79 at r3 (raw file):
def add_helm_opts(parser: argparse.ArgumentParser) -> None: """Add commandline arguments for Helm."""
"command line" or "command-line"
deploy/scripts/package_images.py
line 25 at r3 (raw file):
ansible_dir = scripts_dir.parent / "ansible" helm_dir = scripts_dir.parent / "helm" config_dir = scripts_dir / "setup_files" / "combine_config.yaml"
config_dir
isn't used in this script
deploy/scripts/package_images.py
line 177 at r3 (raw file):
os.environ["AWS_SECRET_ACCESS_KEY"] = "" os.environ["AWS_SECRET_ACCESS_KEY"] = "" os.environ["AWS_SECRET_ACCESS_KEY"] = ""
I assume the intent is to change the duplicate "AWS_SECRET_ACCESS_KEY"
instances here to other AWS_
variable names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 17 of 35 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jmgrady)
deploy/scripts/install-combine.sh
line 183 at r3 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Are "flannel.1" and "cni0" enduring enough to hard-code into this function, or should they be passed as an argument array (or default arguments) for a more general function?
Yes, they enduring, they are part of the k3s
installation. It's still a good suggestion.
Done.
deploy/scripts/kube_env.py
line 79 at r3 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
"command line" or "command-line"
Done.
deploy/scripts/package_images.py
line 25 at r3 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
config_dir
isn't used in this script
It also isn't a directory. :-(
Done.
deploy/scripts/package_images.py
line 177 at r3 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
I assume the intent is to change the duplicate
"AWS_SECRET_ACCESS_KEY"
instances here to otherAWS_
variable names?
Correct. Fixed.
deploy/scripts/setup_combine.py
line 123 at r3 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Now that logging is added to this file, do you want to replace all the
print()
statements?
This is the only print()
statement that I see. In this case, it is printing the output of the script when the --list-targets
option is used; it is not logging progress or any intermediate data.
deploy/scripts/setup_files/cluster_config.yaml
line 10 at r3 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
We don't abbreviate
cert-manager
ascert-mgr
anywhere else, and the three places in our code withcertmgr
I'm not even sure are still relevant.
Done.
installer/make-combine-installer.sh
line 53 at r3 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Package the ??? so that ....
(and again in another comment below)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 34 files at r1, 1 of 9 files at r2, 1 of 3 files at r3, 2 of 5 files at r4, all commit messages.
Reviewable status: 25 of 35 files reviewed, 4 unresolved discussions (waiting on @jmgrady)
deploy/ansible/playbook_desktop_setup.yaml
line 50 at r4 (raw file):
import_role: name: helm_install when: install_helm
install_helm
is still defined in 3 files, though it doesn't look like it's used anymore.
deploy/ansible/playbook_k3s_airgapped_files.yml
line 5 at r4 (raw file):
# Playbook: playbook_k3s_airgap.yml # # playbook_k3s_airgap.yml downloads and packages the
These two comment instances have _airgap.yml
though the file is _airgapped_files.yml
.
deploy/ansible/playbook_k3s_airgapped_files.yml
line 28 at r4 (raw file):
file: path: "{{ package_dir }}" state: directory
No owner:
, group:
, or mode:
needed here?
deploy/ansible/roles/container_images/tasks/main.yml
line 19 at r4 (raw file):
owner: root group: root mode: 644
Why 3-diget mode 644
here, but 4-digit for most of the file?
deploy/ansible/roles/k8s_install/tasks/k3s.yml
line 30 at r4 (raw file):
owner: root group: root mode: "0755"
Why is this mode quoted, but not elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 25 of 35 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jmgrady)
deploy/ansible/playbook_desktop_setup.yaml
line 50 at r4 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
install_helm
is still defined in 3 files, though it doesn't look like it's used anymore.
Done.
Definition of install_helm
removed.
deploy/ansible/playbook_k3s_airgapped_files.yml
line 5 at r4 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
These two comment instances have
_airgap.yml
though the file is_airgapped_files.yml
.
Done.
deploy/ansible/playbook_k3s_airgapped_files.yml
line 28 at r4 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
No
owner:
,group:
, ormode:
needed here?
Correct. It will be the permissions for the user running the installation. Since this is performed on an unknown user's laptop, we do not specify a specific user/group/mode.
deploy/ansible/roles/container_images/tasks/main.yml
line 19 at r4 (raw file):
Because I goofed. From https://docs.ansible.com/ansible/latest/collections/ansible/builtin/copy_module.html
For those used to
/usr/bin/chmod
remember that modes are actually octal numbers. You must either add a leading zero so that Ansible’s YAML parser knows it is an octal number (like0644
or01777
) or quote it (like'644'
or'1777'
) so Ansible receives a string and can do its own conversion from string into number. Giving Ansible a number without following one of these rules will end up with a decimal number which will have unexpected results.
Done.
Also made other mode attributes use the same form to make them consistent.
deploy/ansible/roles/k8s_install/tasks/k3s.yml
line 30 at r4 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Why is this mode quoted, but not elsewhere?
Either is fine. See comment above - was changed to match other mode specifications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 21 of 36 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jmgrady)
deploy/ansible/roles/helm_install/defaults/main.yml
line 2 at r2 (raw file):
Previously, jmgrady (Jim Grady) wrote…
Similar to the subcharts, the
helm_version
is specified here so that the role does not depend on the playbook that calls it. This is a default value and the playbook chooses to override it. I added thek3s_versions.yml
to ensure that the same version is used in thehelm_install
andcontainer_images
roles. I can add ahelm_version
to the defaults forcontainer_images
. (Probably not the solution you wanted).The other option would be to just decide that our roles are not reusable in other environments since we are not positioned to use them in other projects.
Also, if we decide to drop the network installation (which is the current plan), the helm_install
role will go away and the duplicate definition with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 34 files at r1, 1 of 9 files at r2, 5 of 7 files at r5, all commit messages.
Reviewable status: 30 of 36 files reviewed, 1 unresolved discussion (waiting on @jmgrady)
deploy/ansible/roles/container_images/tasks/main.yml
line 50 at r5 (raw file):
- name: Unpack helm shell: cmd: tar xzvf {{ source_image_dir }}/helm.tar.gz -C /opt/helm/{{ helm_version }}
Shouldn't {{ source_image_dir }}/helm.tar.gz
be quoted in case of a space in the path?
deploy/scripts/install-combine.sh
line 9 at r5 (raw file):
} error () { echo "ERROR: $1" >&2
There's a line-final double-space after both instances of $1" >&2
.
installer/make-combine-installer.sh
line 8 at r5 (raw file):
} error () { echo "ERROR: $1" >&2
There's a line-final double-space after both instances of $1" >&2
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 30 of 36 files reviewed, all discussions resolved (waiting on @jmgrady)
deploy/ansible/roles/container_images/tasks/main.yml
line 50 at r5 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Shouldn't
{{ source_image_dir }}/helm.tar.gz
be quoted in case of a space in the path?
Done.
Good catch.
deploy/scripts/install-combine.sh
line 9 at r5 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
There's a line-final double-space after both instances of
$1" >&2
.
Done.
installer/make-combine-installer.sh
line 8 at r5 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
There's a line-final double-space after both instances of
$1" >&2
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 5 files at r6, all commit messages.
Reviewable status: 31 of 36 files reviewed, 1 unresolved discussion (waiting on @jmgrady)
deploy/ansible/roles/k8s_install/tasks/k3s.yml
line 36 at r6 (raw file):
cmd: > curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
Is it worth adding a variable for /etc/apt/keyrings/kubernetes-apt-keyring.gpg
since it's referenced four times?
installer/README.md
line 176 at r6 (raw file):
| clean | Remove the previously saved environment (AWS Access Key, admin user info) before performing the installation. | | restart | Run the installation from the beginning; do not resume a previous installation. | | timeout TIMEOUT | Use a different timeout when installing. The default timeout is 5 minutes. With slow internet connections, it is helpful to extend the timeout. See <https://pkg.go.dev/time#ParseDuration> for timeout formats. |
What about the single-step
and start-at
options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 30 of 36 files reviewed, 2 unresolved discussions (waiting on @jmgrady)
README.md
line 531 at r6 (raw file):
```console cd installer ./make-combine-installer.sh combine-release-number
The new --net-install
argument lacks documentation.
deploy/ansible/roles/k8s_install/tasks/main.yml
line 45 at r6 (raw file):
group: root - name: Copy /etc/rancher/k3s/k3s.yaml to .kube/config
Is /etc/rancher/k3s/k3s.yaml
worth a variable, since it's referenced four times?
deploy/scripts/package_images.py
line 67 at r6 (raw file):
"playbook_k3s_airgapped_files.yml", "--extra-vars", f"package_dir={str(dest_dir)}",
I think the str()
is redundant within an f string's {}
, but it can remain if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 30 of 36 files reviewed, 1 unresolved discussion (waiting on @imnasnainaec and @jmgrady)
README.md
line 531 at r6 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
The new
--net-install
argument lacks documentation.
Done.
Note that the intent is to remove the --net-install
option in the near future.
deploy/ansible/roles/k8s_install/tasks/k3s.yml
line 36 at r6 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Is it worth adding a variable for
/etc/apt/keyrings/kubernetes-apt-keyring.gpg
since it's referenced four times?
Done (ish)
I defined a variable keyring_location
to be /etc/apt/keyrings
in ./deploy/ansible/roles/k8s_install/defaults/main.yml
. I did not create a variable for the file itself since I did not think the additional level of indirection would provide any value: 1) the location is more likely the change than the filename, and 2) all the references are in a single file.
I made a similar for the docker.gpg
key in the container_engine
role.
deploy/ansible/roles/k8s_install/tasks/main.yml
line 45 at r6 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Is
/etc/rancher/k3s/k3s.yaml
worth a variable, since it's referenced four times?
I don't think so. This location is defined by Rancher and is unlikely to change. In addition, it is only referenced in this file.
deploy/scripts/package_images.py
line 67 at r6 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
I think the
str()
is redundant within an f string's{}
, but it can remain if you prefer.
Done.
installer/README.md
line 176 at r6 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
What about the
single-step
andstart-at
options?
I added comments to the script itself. These options are for debugging and should only be used by those who download the installer when so directed by a developer supporting them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 34 files at r1, 1 of 9 files at r2, 1 of 5 files at r6, 7 of 7 files at r7, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jmgrady)
deploy/scripts/package_images.py
line 85 at r7 (raw file):
run_cmd(container_cli + ["save"] + image_list + ["-o", str(tar_file)]) # Compress the tarball run_cmd(["zstd", "--rm", "--force", "--quiet", str(tar_file)])
Any chance you would want this "--quiet" to depend on the logging level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jmgrady)
deploy/scripts/package_images.py
line 85 at r7 (raw file):
Previously, imnasnainaec (D. Ror.) wrote…
Any chance you would want this "--quiet" to depend on the logging level?
No. Removing the --quiet
adds a frequent messages stating the % complete. The default use will be in a github action and I don't want to have to disable more useful logging to prevent these types of messages adding a lot of noise to the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jmgrady)
In order to minimize issues when installing The Combine from the self-extracting installer, the container images and charts for the installation are now included in the installer package instead of downloading them as part of the installing process. This increases the size of the installer but is done to mitigate issues with timeout errors when installing The Combine on a system where the internet connection is slow.
Note that the installer does install some Linux packages, such as
containerd.io
for the Docker engine. These are not included in the installer but are installed using the system'sapt
package manager. This is done so thatapt
will manage keeping the packages up to date.The
installer/make-combine-installer.sh
script does support a--net-install
option to create an installer without the container images. This is intended as a transitional option as the full installer is validated for the users in areas with poor internet performance.This change is