Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Package container images with the self-extracting installer for The Combine #3240

Merged
merged 21 commits into from
Aug 1, 2024

Conversation

jmgrady
Copy link
Collaborator

@jmgrady jmgrady commented Jul 9, 2024

In order to minimize issues when installing The Combine from the self-extracting installer, the container images and charts for the installation are now included in the installer package instead of downloading them as part of the installing process. This increases the size of the installer but is done to mitigate issues with timeout errors when installing The Combine on a system where the internet connection is slow.

Note that the installer does install some Linux packages, such as containerd.io for the Docker engine. These are not included in the installer but are installed using the system's apt package manager. This is done so that apt will manage keeping the packages up to date.

The installer/make-combine-installer.sh script does support a --net-install option to create an installer without the container images. This is intended as a transitional option as the full installer is validated for the users in areas with poor internet performance.


This change is Reviewable

- add container images for helm charts to the install package
    - cert-manager
    - NGINX ingress controller
    - The Combine
- add a `--net-install` to create installer without the images
# Conflicts:
#	deploy/requirements.txt
#	dev-requirements.txt
#	maintenance/requirements.txt
deploy/scripts/helm_utils.py Dismissed Show dismissed Hide dismissed
deploy/scripts/helm_utils.py Dismissed Show dismissed Hide dismissed
Copy link

codecov bot commented Jul 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.58%. Comparing base (eb23ec1) to head (de215d9).
Report is 35 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3240      +/-   ##
==========================================
- Coverage   74.64%   74.58%   -0.06%     
==========================================
  Files         279      279              
  Lines       10683    10683              
  Branches     1289     1289              
==========================================
- Hits         7974     7968       -6     
- Misses       2349     2353       +4     
- Partials      360      362       +2     
Flag Coverage Δ
backend 83.87% <ø> (-0.13%) ⬇️
frontend 66.43% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jmgrady jmgrady marked this pull request as ready for review July 25, 2024 14:26
Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 34 files at r1, 2 of 9 files at r2, all commit messages.
Reviewable status: 10 of 35 files reviewed, 4 unresolved discussions (waiting on @jmgrady)


deploy/ansible/roles/helm_install/defaults/main.yml line 2 at r2 (raw file):

---
helm_version: v3.15.2

Is helm_version still necessary here, now that it's defined in deploy/ansible/vars/k3s_versions.yml? It'd be a shame to have to keep the version number updated in both places.


installer/make-combine-installer.sh line 39 at r2 (raw file):

if [ -z "${COMBINE_VERSION}" ] ; then
  echo "COMBINE_VERSION is not set."
  exit 1

Could use the new error function here.

Copy link
Collaborator Author

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 10 of 35 files reviewed, 3 unresolved discussions (waiting on @Github-advanced-security[bot], @imnasnainaec, and @jmgrady)


deploy/ansible/roles/helm_install/defaults/main.yml line 2 at r2 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Is helm_version still necessary here, now that it's defined in deploy/ansible/vars/k3s_versions.yml? It'd be a shame to have to keep the version number updated in both places.

Similar to the subcharts, the helm_version is specified here so that the role does not depend on the playbook that calls it. This is a default value and the playbook chooses to override it. I added the k3s_versions.yml to ensure that the same version is used in the helm_install and container_images roles. I can add a helm_version to the defaults for container_images. (Probably not the solution you wanted).

The other option would be to just decide that our roles are not reusable in other environments since we are not positioned to use them in other projects.


deploy/scripts/helm_utils.py line 32 at r1 (raw file):

Previously, github-advanced-security[bot] wrote…

Clear-text storage of sensitive information

This expression stores sensitive data (secret) as clear text.
This expression stores sensitive data (secret) as clear text.
This expression stores sensitive data (secret) as clear text.

Show more details

Done.


deploy/scripts/helm_utils.py line 38 at r1 (raw file):

Previously, github-advanced-security[bot] wrote…

Clear-text logging of sensitive information

This expression logs sensitive data (secret) as clear text.
This expression logs sensitive data (secret) as clear text.

Show more details

Done.


installer/make-combine-installer.sh line 39 at r2 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Could use the new error function here.

Done.

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 10 of 35 files reviewed, 6 unresolved discussions (waiting on @Github-advanced-security[bot] and @jmgrady)


deploy/scripts/setup_combine.py line 123 at r3 (raw file):

    if args.list_targets:
        for target in config["targets"].keys():
            print(f"   {target}")

Now that logging is added to this file, do you want to replace all the print() statements?


deploy/scripts/setup_files/cluster_config.yaml line 10 at r3 (raw file):

  rancher:
    - rancher-ui
  cert-mgr:

We don't abbreviate cert-manager as cert-mgr anywhere else, and the three places in our code with certmgr I'm not even sure are still relevant.


installer/make-combine-installer.sh line 53 at r3 (raw file):

  python -m piptools sync requirements.txt

  # Package the so that the Combine can be installed "offline"

Package the ??? so that ....
(and again in another comment below)

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 34 files at r1, 2 of 9 files at r2, all commit messages.
Dismissed @Github-advanced-security[bot] from 2 discussions.
Reviewable status: 17 of 35 files reviewed, 7 unresolved discussions (waiting on @jmgrady)


deploy/scripts/install-combine.sh line 183 at r3 (raw file):

    sleep 1
  done
  while ! ip link show cni0  > /dev/null 2>&1 ; do

Are "flannel.1" and "cni0" enduring enough to hard-code into this function, or should they be passed as an argument array (or default arguments) for a more general function?


deploy/scripts/kube_env.py line 79 at r3 (raw file):

def add_helm_opts(parser: argparse.ArgumentParser) -> None:
    """Add commandline arguments for Helm."""

"command line" or "command-line"


deploy/scripts/package_images.py line 25 at r3 (raw file):

ansible_dir = scripts_dir.parent / "ansible"
helm_dir = scripts_dir.parent / "helm"
config_dir = scripts_dir / "setup_files" / "combine_config.yaml"

config_dir isn't used in this script


deploy/scripts/package_images.py line 177 at r3 (raw file):

    os.environ["AWS_SECRET_ACCESS_KEY"] = ""
    os.environ["AWS_SECRET_ACCESS_KEY"] = ""
    os.environ["AWS_SECRET_ACCESS_KEY"] = ""

I assume the intent is to change the duplicate "AWS_SECRET_ACCESS_KEY" instances here to other AWS_ variable names?

Copy link
Collaborator Author

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 17 of 35 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jmgrady)


deploy/scripts/install-combine.sh line 183 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Are "flannel.1" and "cni0" enduring enough to hard-code into this function, or should they be passed as an argument array (or default arguments) for a more general function?

Yes, they enduring, they are part of the k3s installation. It's still a good suggestion.

Done.


deploy/scripts/kube_env.py line 79 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

"command line" or "command-line"

Done.


deploy/scripts/package_images.py line 25 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

config_dir isn't used in this script

It also isn't a directory. :-(

Done.


deploy/scripts/package_images.py line 177 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

I assume the intent is to change the duplicate "AWS_SECRET_ACCESS_KEY" instances here to other AWS_ variable names?

Correct. Fixed.


deploy/scripts/setup_combine.py line 123 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Now that logging is added to this file, do you want to replace all the print() statements?

This is the only print() statement that I see. In this case, it is printing the output of the script when the --list-targets option is used; it is not logging progress or any intermediate data.


deploy/scripts/setup_files/cluster_config.yaml line 10 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

We don't abbreviate cert-manager as cert-mgr anywhere else, and the three places in our code with certmgr I'm not even sure are still relevant.

Done.


installer/make-combine-installer.sh line 53 at r3 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Package the ??? so that ....
(and again in another comment below)

Done.

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 34 files at r1, 1 of 9 files at r2, 1 of 3 files at r3, 2 of 5 files at r4, all commit messages.
Reviewable status: 25 of 35 files reviewed, 4 unresolved discussions (waiting on @jmgrady)


deploy/ansible/playbook_desktop_setup.yaml line 50 at r4 (raw file):

      import_role:
        name: helm_install
      when: install_helm

install_helm is still defined in 3 files, though it doesn't look like it's used anymore.


deploy/ansible/playbook_k3s_airgapped_files.yml line 5 at r4 (raw file):

# Playbook: playbook_k3s_airgap.yml
#
# playbook_k3s_airgap.yml downloads and packages the

These two comment instances have _airgap.yml though the file is _airgapped_files.yml.


deploy/ansible/playbook_k3s_airgapped_files.yml line 28 at r4 (raw file):

      file:
        path: "{{ package_dir }}"
        state: directory

No owner:, group:, or mode: needed here?


deploy/ansible/roles/container_images/tasks/main.yml line 19 at r4 (raw file):

    owner: root
    group: root
    mode: 644

Why 3-diget mode 644 here, but 4-digit for most of the file?


deploy/ansible/roles/k8s_install/tasks/k3s.yml line 30 at r4 (raw file):

    owner: root
    group: root
    mode: "0755"

Why is this mode quoted, but not elsewhere?

Copy link
Collaborator Author

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 25 of 35 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jmgrady)


deploy/ansible/playbook_desktop_setup.yaml line 50 at r4 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

install_helm is still defined in 3 files, though it doesn't look like it's used anymore.

Done.

Definition of install_helm removed.


deploy/ansible/playbook_k3s_airgapped_files.yml line 5 at r4 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

These two comment instances have _airgap.yml though the file is _airgapped_files.yml.

Done.


deploy/ansible/playbook_k3s_airgapped_files.yml line 28 at r4 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

No owner:, group:, or mode: needed here?

Correct. It will be the permissions for the user running the installation. Since this is performed on an unknown user's laptop, we do not specify a specific user/group/mode.


deploy/ansible/roles/container_images/tasks/main.yml line 19 at r4 (raw file):
Because I goofed. From https://docs.ansible.com/ansible/latest/collections/ansible/builtin/copy_module.html

For those used to /usr/bin/chmod remember that modes are actually octal numbers. You must either add a leading zero so that Ansible’s YAML parser knows it is an octal number (like 0644 or 01777) or quote it (like '644' or '1777') so Ansible receives a string and can do its own conversion from string into number. Giving Ansible a number without following one of these rules will end up with a decimal number which will have unexpected results.

Done.

Also made other mode attributes use the same form to make them consistent.


deploy/ansible/roles/k8s_install/tasks/k3s.yml line 30 at r4 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Why is this mode quoted, but not elsewhere?

Either is fine. See comment above - was changed to match other mode specifications.

Copy link
Collaborator Author

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 21 of 36 files reviewed, 2 unresolved discussions (waiting on @imnasnainaec and @jmgrady)


deploy/ansible/roles/helm_install/defaults/main.yml line 2 at r2 (raw file):

Previously, jmgrady (Jim Grady) wrote…

Similar to the subcharts, the helm_version is specified here so that the role does not depend on the playbook that calls it. This is a default value and the playbook chooses to override it. I added the k3s_versions.yml to ensure that the same version is used in the helm_install and container_images roles. I can add a helm_version to the defaults for container_images. (Probably not the solution you wanted).

The other option would be to just decide that our roles are not reusable in other environments since we are not positioned to use them in other projects.

Also, if we decide to drop the network installation (which is the current plan), the helm_install role will go away and the duplicate definition with it.

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 34 files at r1, 1 of 9 files at r2, 5 of 7 files at r5, all commit messages.
Reviewable status: 30 of 36 files reviewed, 1 unresolved discussion (waiting on @jmgrady)


deploy/ansible/roles/container_images/tasks/main.yml line 50 at r5 (raw file):

- name: Unpack helm
  shell:
    cmd: tar xzvf {{ source_image_dir }}/helm.tar.gz -C /opt/helm/{{ helm_version }}

Shouldn't {{ source_image_dir }}/helm.tar.gz be quoted in case of a space in the path?


deploy/scripts/install-combine.sh line 9 at r5 (raw file):

}
error () {
  echo "ERROR: $1" >&2  

There's a line-final double-space after both instances of $1" >&2.


installer/make-combine-installer.sh line 8 at r5 (raw file):

}
error () {
  echo "ERROR: $1" >&2  

There's a line-final double-space after both instances of $1" >&2.

Copy link
Collaborator Author

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 30 of 36 files reviewed, all discussions resolved (waiting on @jmgrady)


deploy/ansible/roles/container_images/tasks/main.yml line 50 at r5 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Shouldn't {{ source_image_dir }}/helm.tar.gz be quoted in case of a space in the path?

Done.

Good catch.


deploy/scripts/install-combine.sh line 9 at r5 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

There's a line-final double-space after both instances of $1" >&2.

Done.


installer/make-combine-installer.sh line 8 at r5 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

There's a line-final double-space after both instances of $1" >&2.

Done.

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 5 files at r6, all commit messages.
Reviewable status: 31 of 36 files reviewed, 1 unresolved discussion (waiting on @jmgrady)


deploy/ansible/roles/k8s_install/tasks/k3s.yml line 36 at r6 (raw file):

    cmd: >
      curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key
      | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

Is it worth adding a variable for /etc/apt/keyrings/kubernetes-apt-keyring.gpg since it's referenced four times?


installer/README.md line 176 at r6 (raw file):

| clean           | Remove the previously saved environment (AWS Access Key, admin user info) before performing the installation.                                                                                                    |
| restart         | Run the installation from the beginning; do not resume a previous installation.                                                                                                                                  |
| timeout TIMEOUT | Use a different timeout when installing. The default timeout is 5 minutes. With slow internet connections, it is helpful to extend the timeout. See <https://pkg.go.dev/time#ParseDuration> for timeout formats. |

What about the single-step and start-at options?

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 30 of 36 files reviewed, 2 unresolved discussions (waiting on @jmgrady)


README.md line 531 at r6 (raw file):

```console
cd installer
./make-combine-installer.sh combine-release-number

The new --net-install argument lacks documentation.


deploy/ansible/roles/k8s_install/tasks/main.yml line 45 at r6 (raw file):

      group: root

- name: Copy /etc/rancher/k3s/k3s.yaml to .kube/config

Is /etc/rancher/k3s/k3s.yaml worth a variable, since it's referenced four times?


deploy/scripts/package_images.py line 67 at r6 (raw file):

            "playbook_k3s_airgapped_files.yml",
            "--extra-vars",
            f"package_dir={str(dest_dir)}",

I think the str() is redundant within an f string's {}, but it can remain if you prefer.

Copy link
Collaborator Author

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 30 of 36 files reviewed, 1 unresolved discussion (waiting on @imnasnainaec and @jmgrady)


README.md line 531 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

The new --net-install argument lacks documentation.

Done.

Note that the intent is to remove the --net-install option in the near future.


deploy/ansible/roles/k8s_install/tasks/k3s.yml line 36 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Is it worth adding a variable for /etc/apt/keyrings/kubernetes-apt-keyring.gpg since it's referenced four times?

Done (ish)

I defined a variable keyring_location to be /etc/apt/keyrings in ./deploy/ansible/roles/k8s_install/defaults/main.yml. I did not create a variable for the file itself since I did not think the additional level of indirection would provide any value: 1) the location is more likely the change than the filename, and 2) all the references are in a single file.

I made a similar for the docker.gpg key in the container_engine role.


deploy/ansible/roles/k8s_install/tasks/main.yml line 45 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Is /etc/rancher/k3s/k3s.yaml worth a variable, since it's referenced four times?

I don't think so. This location is defined by Rancher and is unlikely to change. In addition, it is only referenced in this file.


deploy/scripts/package_images.py line 67 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

I think the str() is redundant within an f string's {}, but it can remain if you prefer.

Done.


installer/README.md line 176 at r6 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

What about the single-step and start-at options?

I added comments to the script itself. These options are for debugging and should only be used by those who download the installer when so directed by a developer supporting them.

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 34 files at r1, 1 of 9 files at r2, 1 of 5 files at r6, 7 of 7 files at r7, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @jmgrady)


deploy/scripts/package_images.py line 85 at r7 (raw file):

    run_cmd(container_cli + ["save"] + image_list + ["-o", str(tar_file)])
    # Compress the tarball
    run_cmd(["zstd", "--rm", "--force", "--quiet", str(tar_file)])

Any chance you would want this "--quiet" to depend on the logging level?

Copy link
Collaborator Author

@jmgrady jmgrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @jmgrady)


deploy/scripts/package_images.py line 85 at r7 (raw file):

Previously, imnasnainaec (D. Ror.) wrote…

Any chance you would want this "--quiet" to depend on the logging level?

No. Removing the --quiet adds a frequent messages stating the % complete. The default use will be in a github action and I don't want to have to disable more useful logging to prevent these types of messages adding a lot of noise to the output.

Copy link
Collaborator

@imnasnainaec imnasnainaec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @jmgrady)

@jmgrady jmgrady enabled auto-merge (squash) August 1, 2024 14:05
@jmgrady jmgrady merged commit 3ccc159 into master Aug 1, 2024
19 checks passed
@jmgrady jmgrady deleted the low-bw-install branch August 1, 2024 14:11
@jmgrady jmgrady restored the low-bw-install branch August 2, 2024 14:26
@jmgrady jmgrady deleted the low-bw-install branch August 2, 2024 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants