Commit 2d4e23d: Merge branch 'main' into google-cloud-python

marcelovilla authored Nov 13, 2024 (2 parents 4e644be + 859ee45)

Showing 6 changed files with 304 additions and 4 deletions.
97 changes: 97 additions & 0 deletions docs/docs/explanations/advanced-provider-configuration.md
```yaml
amazon_web_services:
  permissions_boundary: arn:aws:iam::01234567890:policy/<permissions-boundary-policy-name>
```

### EKS KMS ARN (Optional)

You can use AWS Key Management Service (KMS) to enhance security by encrypting Kubernetes secrets in
Amazon Elastic Kubernetes Service (EKS). This approach adds an extra layer of protection for sensitive
information, like passwords, credentials, and TLS keys, by applying user-managed encryption keys to Kubernetes
secrets, supporting a [defense-in-depth strategy](https://aws.amazon.com/blogs/containers/using-eks-encryption-provider-support-for-defense-in-depth/).

Nebari supports setting an existing KMS key while deploying Nebari to implement encryption of secrets
created in Nebari's EKS cluster. The KMS key must be a **Symmetric** key set to **encrypt and decrypt** data.

:::warning
Enabling EKS cluster secrets encryption by setting `amazon_web_services.eks_kms_arn` is an
_irreversible_ action: re-deploying Nebari to remove a previously set `eks_kms_arn` will fail.
If you instead try to change the KMS key by re-deploying with a _different_ key ARN, the
re-deploy succeeds, but the cluster configuration silently keeps the original key. If a
re-deploy fails because it attempted to remove the key, you can restore the deployment by
re-deploying with `eks_kms_arn` set back to the original KMS key ARN.
:::

:::danger
If the KMS key used for envelope encryption of secrets is ever deleted, then there is no way to recover
the EKS cluster.
:::

:::note
After enabling cluster encryption on your cluster, you must encrypt all existing secrets with the
new key by running the following command:
`kubectl get secrets --all-namespaces -o json | kubectl annotate --overwrite -f - kms-encryption-timestamp="time value"`
Consult [Encrypt K8s secrets with AWS KMS on existing clusters](https://docs.aws.amazon.com/eks/latest/userguide/enable-kms.html) for more information.
:::

Here is an example of how to set the KMS key ARN in `nebari-config.yaml`:

```yaml
amazon_web_services:
  # the ARN of the AWS Key Management Service key
  eks_kms_arn: "arn:aws:kms:us-west-2:01234567890:key/<aws-kms-key-id>"
```
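Before deploying, it can help to sanity-check the value you are about to set. A minimal sketch (the ARN below is a placeholder; the commented `aws kms describe-key` call additionally reports the key spec and usage when AWS credentials are available):

```shell
# Hypothetical pre-deploy check: confirm the value has the standard KMS key ARN
# shape before writing it into nebari-config.yaml.
KMS_ARN="arn:aws:kms:us-west-2:01234567890:key/1234abcd-12ab-34cd-56ef-1234567890ab"
if printf '%s' "$KMS_ARN" | grep -Eq '^arn:aws:kms:[a-z0-9-]+:[0-9]+:key/[0-9a-f-]+$'; then
  echo "ARN format OK"
else
  echo "ARN format invalid" >&2
fi
# With credentials configured, verify the key is symmetric encrypt/decrypt:
# aws kms describe-key --key-id "$KMS_ARN" \
#   --query '[KeyMetadata.KeySpec,KeyMetadata.KeyUsage]' --output text
```

This catches copy-paste mistakes (for example, pasting a key alias or key ID instead of the full ARN) before a deploy attempt.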

### Launch Templates (Optional)

Nebari supports configuring launch templates for your node groups, enabling you to customize settings like the AMI ID and pre-bootstrap commands. This is particularly useful if you need to use a custom AMI or perform specific actions before the node joins the cluster.

:::warning
If you add a `launch_template` to an existing node group that was previously created without one, AWS will treat this as a change requiring the replacement of the entire node group. This action will trigger a reallocation of resources, effectively destroying the current node group and recreating it. This behavior is due to how AWS handles self-managed node groups versus those using launch templates with custom settings.
:::

:::tip
To avoid unexpected downtime or data loss, consider creating a new node group with the launch template settings and migrating your workloads accordingly. This approach allows you to implement the new configuration without disrupting your existing resources.
:::

#### Configuring a Launch Template

To configure a launch template for a node group in your `nebari-config.yaml`, add the `launch_template` section under the desired node group:

```yaml
amazon_web_services:
  region: us-west-2
  kubernetes_version: "1.18"
  node_groups:
    custom-node-group:
      instance: "m5.large"
      min_nodes: 1
      max_nodes: 5
      gpu: false # Set to true if using GPU instances
      launch_template:
        # Replace with your custom AMI ID
        ami_id: ami-0abcdef1234567890
        # Command to run before the node joins the cluster
        pre_bootstrap_command: |
          #!/bin/bash
          # This script is executed before the node is bootstrapped
          # You can use this script to install additional packages or configure the node
          # For example, to install the `htop` package, you can run:
          # sudo apt-get update
          # sudo apt-get install -y htop
```

**Parameters:**

- `ami_id` (Optional): The ID of the custom AMI to use for the nodes in this group; this assumes the AMI provided is an EKS-optimized AMI derivative. If specified, the `ami_type` is automatically set to `CUSTOM`.
- `pre_bootstrap_command` (Optional): A command or script to execute on the node before
  it joins the Kubernetes cluster. This can be used for custom setup or configuration
  tasks. The value must be a single string of valid shell syntax. It is injected into
  the `user_data` field of the launch template. For more information, see
  [User Data](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html).

> If you're using a `launch_template` with a custom `ami_id`, there is a known issue where updating `scaling.desired_size` via the Nebari configuration (Terraform) does not take effect. To scale up, you must either recreate the node group or adjust the scaling settings directly in the AWS Console UI (recommended). We are aware of this inconsistency and plan to address it in a future update.

:::note
If an `ami_id` is not provided, AWS will use the default Amazon Linux 2 AMI for the
specified instance type. You can find the latest optimized AMI IDs for Amazon EKS in your
cluster region by inspecting its respective SSM parameters. For more information, see
[Retrieve recommended Amazon Linux AMI IDs](https://docs.aws.amazon.com/eks/latest/userguide/retrieve-ami-id.html).
:::
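As a sketch of that lookup, the SSM parameter name is built from the Kubernetes version; the actual `aws ssm get-parameter` call is commented out because it requires AWS credentials:

```shell
# Build the public SSM parameter name for the recommended EKS-optimized
# Amazon Linux 2 AMI for a given Kubernetes version.
K8S_VERSION="1.29"
PARAM="/aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id"
echo "$PARAM"
# With credentials configured, the lookup itself would be:
# aws ssm get-parameter --name "$PARAM" --region us-west-2 \
#   --query 'Parameter.Value' --output text
```

The returned AMI ID is region-specific, so run the lookup in the same region as your cluster before setting `ami_id`.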

</TabItem>

<TabItem value="azure" label="Azure">
4 changes: 3 additions & 1 deletion docs/docs/references/RELEASE.md

---

## Release 2024.9.1 - September 27, 2024 (Broken Release)

> WARNING: This release was later found to have unresolved issues described further in [issue 2798](https://github.com/nebari-dev/nebari/issues/2798). We have marked this release as broken on conda-forge and yanked it on PyPI. One of the bugs prevents any upgrade from 2024.9.1 to 2024.11.1. Users should skip this release entirely and upgrade directly from 2024.7.1 to 2024.11.1.
> WARNING: This release changes how group directories are mounted in JupyterLab pods: only groups with specific permissions will have their directories mounted. If you rely on custom group mounts, we strongly recommend running `nebari upgrade` before updating. This will prompt you to confirm how Nebari should handle your groups: either keep them mounted or allow unmounting. **No data will be lost**, and you can reverse this anytime.
138 changes: 138 additions & 0 deletions docs/docs/references/container-sources.md
## Deploying and Running Nebari from a Private Container Repository

Nebari deploys and runs FOSS components as containers running in Kubernetes.
By default, Nebari sources each container from the container's respective public repository, typically `docker.io` or `quay.io`.
This introduces supply-chain concerns for security-focused customers.

One solution to these supply-chain concerns is to deploy Nebari from private locally-mirrored containers:

- Create a controlled private container repository (e.g. ECR)
- Mirror all containers used by Nebari into this private container repository
- Use the `pre_bootstrap_command` mechanism in `nebari-config.yaml` to specify the mirrored container repo

Deploying Nebari in this fashion eliminates significant supply-chain surface area, but requires identifying all containers used by Nebari.

The following configurations demonstrate how to specify a private repo denoted by the string `[PRIVATE_REPO]`.

**Note:** Authorization tokens are used in the examples below. It is important for administrators to understand the expiration policy of these tokens, because the Nebari k8s cluster may in some cases need to **use these tokens to pull container images at any time during run-time operation**.

### Set ECR as default container registry mirror

```yaml
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      launch_template:
        pre_bootstrap_command: |
          #!/bin/bash
          # Verify that IP forwarding is enabled for worker nodes, as is required for containerd
          if [[ $(sysctl net.ipv4.ip_forward | grep "net.ipv4.ip_forward = 1") ]]; then echo "net.ipv4.ip_forward is on"; else sysctl -w net.ipv4.ip_forward=1; fi
          # Set ECR as default container registry mirror
          mkdir -p /etc/containerd/certs.d/_default
          ECR_TOKEN="$(aws ecr get-login-password --region us-east-1)"
          BASIC_AUTH="$(echo -n "AWS:$ECR_TOKEN" | base64 -w 0)"
          cat <<-EOT > /etc/containerd/certs.d/_default/hosts.toml
          [host."https://[PRIVATE_REPO].dkr.ecr.us-east-1.amazonaws.com"]
            capabilities = ["pull", "resolve"]
          [host."https://[PRIVATE_REPO].dkr.ecr.us-east-1.amazonaws.com".header]
            authorization = "Basic $BASIC_AUTH"
          EOT
```
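The `BASIC_AUTH` step above is easy to get subtly wrong, since a stray newline breaks the header value. A local sanity check of just that step, substituting a dummy token for the real `aws ecr get-login-password` output:

```shell
# Local check (no AWS needed): build the Basic auth value with a dummy token
# and round-trip it through base64 to confirm it decodes to "AWS:<token>".
ECR_TOKEN="dummy-token"
BASIC_AUTH="$(printf '%s' "AWS:$ECR_TOKEN" | base64 | tr -d '\n')"
printf '%s' "$BASIC_AUTH" | base64 -d
# → AWS:dummy-token
```

If the decoded value contains anything other than `AWS:` followed by the token, containerd will fail to authenticate against the mirror.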

### Set GitLab CR as default container registry mirror

```yaml
# Set GitLab CR as default container registry mirror in hosts.toml;
# must have override_path set if project/group names don't match upstream container
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      launch_template:
        pre_bootstrap_command: |
          #!/bin/bash
          # Verify that IP forwarding is enabled for worker nodes, as is required for containerd
          if [[ $(sysctl net.ipv4.ip_forward | grep "net.ipv4.ip_forward = 1") ]]; then echo "net.ipv4.ip_forward is on"; else sysctl -w net.ipv4.ip_forward=1; fi
          # Set default container registry mirror in hosts.toml; must have override_path set if project/group names don't match upstream container
          CONTAINER_REGISTRY_URL="[PRIVATE_REPO]"
          CONTAINER_REGISTRY_USERNAME="[username]"
          CONTAINER_REGISTRY_TOKEN="[token]"
          CONTAINER_REGISTRY_GROUP=as-nebari
          CONTAINER_REGISTRY_PROJECT=nebari-test
          mkdir -p /etc/containerd/certs.d/_default
          cat <<-EOT > /etc/containerd/certs.d/_default/hosts.toml
          [host."https://$CONTAINER_REGISTRY_URL/v2/$CONTAINER_REGISTRY_GROUP/$CONTAINER_REGISTRY_PROJECT"]
            override_path = true
            capabilities = ["pull", "resolve"]
          EOT
          # Set containerd registry config auth in config.d .toml import dir
          mkdir -p /etc/containerd/config.d
          cat <<EOT | sudo tee /etc/containerd/config.d/config-import.toml
          version = 2
          [plugins."io.containerd.grpc.v1.cri".registry]
            config_path = "/etc/containerd/certs.d:/etc/docker/certs.d"
          [plugins."io.containerd.grpc.v1.cri".registry.auths]
          [plugins."io.containerd.grpc.v1.cri".registry.configs]
          [plugins."io.containerd.grpc.v1.cri".registry.configs."$CONTAINER_REGISTRY_URL".auth]
            username = "$CONTAINER_REGISTRY_USERNAME"
            password = "$CONTAINER_REGISTRY_TOKEN"
          EOT
```

### Set GitLab CR as default container registry mirror, with custom Client SSL/TLS Certs

```yaml
# must have override_path set if project/group names don't match upstream container
# Also add/set GitLab Client SSL/TLS Certificate for Containerd
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      launch_template:
        pre_bootstrap_command: |
          #!/bin/bash
          # Verify that IP forwarding is enabled for worker nodes, as is required for containerd
          if [[ $(sysctl net.ipv4.ip_forward | grep "net.ipv4.ip_forward = 1") ]]; then echo "net.ipv4.ip_forward is on"; else sysctl -w net.ipv4.ip_forward=1; fi
          # Set default container registry mirror in hosts.toml; must have override_path set if project/group names don't match upstream container
          CONTAINER_REGISTRY_URL="[PRIVATE_REPO]"
          CONTAINER_REGISTRY_USERNAME="[username]"
          CONTAINER_REGISTRY_TOKEN="[token]"
          CONTAINER_REGISTRY_GROUP=as-nebari
          CONTAINER_REGISTRY_PROJECT=nebari-test
          mkdir -p /etc/containerd/certs.d/_default
          cat <<-EOT > /etc/containerd/certs.d/_default/hosts.toml
          [host."https://$CONTAINER_REGISTRY_URL/v2/$CONTAINER_REGISTRY_GROUP/$CONTAINER_REGISTRY_PROJECT"]
            override_path = true
            capabilities = ["pull", "resolve"]
            client = ["/etc/containerd/certs.d/$CONTAINER_REGISTRY_URL/client.pem"]
          EOT
          # Set containerd registry config auth in config.d .toml import dir
          mkdir -p /etc/containerd/config.d
          cat <<EOT | sudo tee /etc/containerd/config.d/config-import.toml
          version = 2
          [plugins."io.containerd.grpc.v1.cri".registry]
            config_path = "/etc/containerd/certs.d:/etc/docker/certs.d"
          [plugins."io.containerd.grpc.v1.cri".registry.auths]
          [plugins."io.containerd.grpc.v1.cri".registry.configs]
          [plugins."io.containerd.grpc.v1.cri".registry.configs."$CONTAINER_REGISTRY_URL".auth]
            username = "$CONTAINER_REGISTRY_USERNAME"
            password = "$CONTAINER_REGISTRY_TOKEN"
          EOT
          # Add client key/cert to containerd
          mkdir -p /etc/containerd/certs.d/$CONTAINER_REGISTRY_URL
          cat <<-EOT >> /etc/containerd/certs.d/$CONTAINER_REGISTRY_URL/client.pem
          -----BEGIN CERTIFICATE-----
          XzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzZx
          ZxyzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzXz
          -----END CERTIFICATE-----
          -----BEGIN PRIVATE KEY-----
          XzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzZx
          ZxyzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzXz
          -----END PRIVATE KEY-----
          EOT
```
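To check the `client.pem` layout containerd expects here (certificate block followed by private key block) without touching real credentials, you can generate a throwaway self-signed pair. This sketch assumes `openssl` is available and uses placeholder names throughout:

```shell
# Create a throwaway self-signed key and certificate, concatenate them in the
# cert-then-key order used by the client.pem example above, and count the PEM blocks.
tmpdir="$(mktemp -d)"
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=registry-client" \
  -keyout "$tmpdir/key.pem" -out "$tmpdir/cert.pem" 2>/dev/null
cat "$tmpdir/cert.pem" "$tmpdir/key.pem" > "$tmpdir/client.pem"
grep -c -- "-----BEGIN" "$tmpdir/client.pem"
# → 2
```

A file with exactly two `BEGIN` blocks, certificate first, matches what the `client = [...]` entry in `hosts.toml` points at.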
62 changes: 62 additions & 0 deletions docs/docs/references/enhanced-security.md
## Nebari Security Considerations

The security of _AWS Nebari_ deployments can be enhanced through the following deployment configuration options in `nebari-config.yaml`:

- **Explicit definition of container sources**
This option allows for the use of locally mirrored, security-hardened, or otherwise customized container images in place of the containers used by default.
See: [container-sources](container-sources.md)

- **Installation of custom SSL certificate(s) into EKS hosts**
Install private certificates used by (e.g.) in-line content inspection engines which re-encrypt traffic.

```yaml
# Add client certificate to CA trust on node
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      launch_template:
        pre_bootstrap_command: |
          #!/bin/bash
          cat <<-EOT >> /etc/pki/ca-trust/source/anchors/client.pem
          -----BEGIN CERTIFICATE-----
          XzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzZx
          ZxyzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzXz
          -----END CERTIFICATE-----
          EOT
          sudo update-ca-trust extract
```

- **Private EKS endpoint configuration**
Mirrors the corresponding AWS console option, which routes all EKS traffic within the VPC.

```yaml
amazon_web_services:
  eks_endpoint_access: private # valid values: [public, private, public_and_private]
```

- **Deploy into existing subnets**
Instructs Nebari to be deployed into existing subnets, rather than creating its own new subnets.
An advantage of deploying to existing subnets is the ability to use private subnets. Note that the **ingress load-balancer-annotation** must be set appropriately based on the type (private or public) of subnet.

```yaml
existing_subnet_ids:
  - subnet-0123456789abcdef
  - subnet-abcdef0123456789
existing_security_group_id: sg-0123456789abcdef
ingress:
  terraform_overrides:
    load-balancer-annotations:
      service.beta.kubernetes.io/aws-load-balancer-internal: "true"
      # Ensure the subnet IDs are also set below
      service.beta.kubernetes.io/aws-load-balancer-subnets: "subnet-0123456789abcdef,subnet-abcdef0123456789"
```

- **Use existing SSL certificate**
Instructs Nebari to use the SSL certificate specified by `[k8s-custom-secret-name]`

```yaml
certificate:
  type: existing
  secret_name: [k8s-custom-secret-name]
```
4 changes: 3 additions & 1 deletion docs/docs/references/index.mdx
/>
</div>

Technical descriptions of how Nebari works.

- [Enhanced Security](enhanced-security.md) - Nebari security configuration guide
- [Local Container Repo](container-sources.md) - Deploying Nebari from a Local Container Repo
<DocCardList items={useCurrentSidebarCategory().items}/>
3 changes: 1 addition & 2 deletions docs/nebari-slurm/configuration.md
_Note_: All slurm related configuration needs to be passed down as a string.

### Services
Additional services can be added to the `jupyterhub_services`
variable. Currently this is only `<service-name>: <service-apikey>`. You must keep the `dask_gateway` section.

```yaml
jupyterhub_services:
```
