diff --git a/docs/eks-networking.md b/docs/eks-networking.md
new file mode 100644
index 00000000..a3a547ef
--- /dev/null
+++ b/docs/eks-networking.md
@@ -0,0 +1,81 @@
+# Understanding VPC configuration for EKS in conjunction with Fargate (WIP*)
+
+*\*All of this is subject to change while this message is here.*
+
+Through work on [a security compliance issue](https://github.com/GSA/datagov-deploy/issues/3355), we completed a thorough inspection of the networking design of this repo. By default, EKS clusters are fully publicly available. The goal was to allow tighter configuration to prevent intrusions and data leaks. A combination of public + private networking is considered the [best practice for common Kubernetes workloads on AWS](https://aws.amazon.com/blogs/containers/de-mystifying-cluster-networking-for-amazon-eks-worker-nodes/) because it provides the flexibility of public availability alongside the security of private resources.
+
+Note: This repo utilizes Terraform to configure multiple intertwining parts from the AWS world to the Kubernetes world and wraps it up nicely with a Brokerpak bow. Most of the concepts and commands are discussed in terms of Terraform, but there are AWS/K8S CLI equivalents.
+
+Here's a brief (but exhaustive) list of the modules/resources used:
+- Module [vpc](https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/3.7.0)
+- Module [eks](https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/14.0.0)
+- Module [aws_load_balancer_controller](https://github.com/GSA/terraform-kubernetes-aws-load-balancer-controller)
+- Resource [aws_vpc_endpoint](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint)
+- Resource [aws_route53_resolver_endpoint](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/route53_resolver_endpoint)
+
+## Desired Configuration/Design
+
+The user would like to provision a functional K8S cluster that can host publicly-available application deployments. The cluster will run on AWS Fargate to reduce the compliance burden of managing the security of node machines.
+
+### Deployment Stack:
+- Computing Levels of Abstraction:
+  - Fargate Nodes > EKS > Application
+- Networking Levels of Abstraction (the ordering is still being worked out):
+  - Internal CIDRs (Private + Public) > Network ACLs > Security Groups > Ingress Controller > NAT Gateway > Application Load Balancer > Elastic IP (EIP) > Domain
+  - VPC > NAT Gateway > Load Balancer > Elastic IP (EIP)
+
+When a user accesses an application through the [domain](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/ingress.tf#L248-L254), it resolves to an EIP that routes to the [application load balancer](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/ingress.tf#L19-L31). Traffic then passes through the internal [ingress controller](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/ingress.tf#L36-L95) to the cluster nodes based on the [vpc configuration](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/vpc.tf#L30-L186). That last step is intricate because the ingress controller lives within the VPC, so it only works if the VPC configuration permits it.
+
+### Networking Design
+
+- The entire EKS cluster lives within the VPC (10.20.0.0/16).
+- There is a public subnet (10.20.101.0/24).
+- There is a private subnet (10.20.1.0/24).
+- The EKS control plane has a public endpoint (x.x.x.x/x).
+- The EKS control plane has a private endpoint (10.20.x.x/x).
+- Worker nodes, by default, communicate entirely on the private subnet.
+- The ingress controllers connect external traffic to worker nodes through the public subnet.
+- Security Groups and Network ACLs are used to control traffic.
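+
+As a rough illustration, the public/private split described above maps onto the `terraform-aws-modules/vpc` module roughly as follows. This is a minimal sketch using the CIDRs listed above; the authoritative configuration (including the subnet tagging for load-balancer discovery) lives in `terraform/provision/vpc.tf`.
+
+```terraform
+# Sketch only: mirrors the CIDR layout described in this document.
+data "aws_availability_zones" "available" {}
+
+module "vpc" {
+  source  = "terraform-aws-modules/vpc/aws"
+  version = "3.6.0"
+
+  name = "eks-vpc"
+  cidr = "10.20.0.0/16"
+
+  azs             = data.aws_availability_zones.available.names
+  private_subnets = ["10.20.1.0/24", "10.20.2.0/24", "10.20.3.0/24"]
+  public_subnets  = ["10.20.101.0/24", "10.20.102.0/24", "10.20.103.0/24"]
+
+  # A single NAT gateway lets the private subnets initiate outbound
+  # connections without being directly reachable from the internet.
+  enable_nat_gateway = true
+  single_nat_gateway = true
+
+  # DNS support is required for the private EKS API endpoint to resolve.
+  enable_dns_hostnames = true
+  enable_dns_support   = true
+}
+```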
+
+### Setting up Clusters in a Private Subnet
+
+In order for EKS to isolate workers in a private subnet, the following [VPC considerations](https://docs.aws.amazon.com/eks/latest/userguide/private-clusters.html) are necessary:
+
+- The VPC needs to configure private subnets:
+  - Define the `private_subnets` CIDRs.
+  - Set `enable_nat_gateway` (`true`) to allow ingresses to connect the public and private subnets.
+  - Set `map_public_ip_on_launch` (`false`) so that instances launched in the private subnets are not assigned public IPs.
+  - Set `enable_dns_hostnames` and `enable_dns_support` to support DNS hostnames in the VPC (necessary for the [API Server](https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html)).
+
+- EKS needs to know about the VPC config:
+  - [Example](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/eks.tf#L15-L28)
+
+- VPC Endpoints are necessary for private cluster nodes to talk to other AWS services (see the endpoint sketch at the end of this section):
+  - [These are the ones](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/vpc.tf#L196-L265) I've identified as necessary for us (`<region>` is the cluster's AWS region):
+    - `com.amazonaws.<region>.ec2`
+    - `com.amazonaws.<region>.ecr.api`
+    - `com.amazonaws.<region>.ecr.dkr`
+    - `com.amazonaws.<region>.s3` _– For pulling container images_
+    - `com.amazonaws.<region>.logs` _– For CloudWatch Logs_
+    - `com.amazonaws.<region>.sts` _– If using Cluster Autoscaler or IAM roles for service accounts_
+    - `com.amazonaws.<region>.elasticloadbalancing` _– If using Application Load Balancers_
+  - These are additional ones that may be necessary in the future:
+    - `com.amazonaws.<region>.autoscaling` _– If using Cluster Autoscaler_
+    - `com.amazonaws.<region>.appmesh-envoy-management` _– If using App Mesh_
+    - `com.amazonaws.<region>.xray` _– If using AWS X-Ray_
+
+- [Security Group (SG) Considerations](https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html#cluster-sg)
+  - Control Plane
+    - Minimum inbound traffic: 443/TCP from all node SGs
+    - Minimum outbound traffic: 10250/TCP to all node SGs
+  - Nodes
+    - Minimum inbound traffic: 10250/TCP from the control plane SGs
+    - Minimum outbound traffic: 443/TCP to the control plane SGs
+
+- The [IAM Role](https://github.com/aws/amazon-vpc-cni-k8s/issues/30) needs to allow nodes to pull images.
+  - Docs: https://docs.aws.amazon.com/AmazonECR/latest/userguide/repository-policy-examples.html
+  - [Terraform implementation](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/eks.tf#L150-L166) of creating the role policy
+
+- An [aws_route53_resolver_endpoint](https://github.com/GSA/eks-brokerpak/blob/restrict-eks-traffic/terraform/provision/vpc.tf#L14-L28) needs to be made available to the private subnet.
+
+- Careful consideration needs to be given to the [user/roles](https://stackoverflow.com/questions/66996306/aws-eks-fargate-coredns-imagepullbackoff) used for Fargate cluster creation.
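+
+As referenced in the VPC Endpoints item above, a single interface endpoint can be declared roughly as follows. This is a sketch modeled on the (currently commented-out) endpoint resources in `terraform/provision/vpc.tf`; the `module.vpc` outputs and `local.region` references assume the VPC module and locals defined there.
+
+```terraform
+# Sketch: an interface endpoint so nodes in the private subnets can reach the
+# ECR API without traversing the public internet.
+resource "aws_vpc_endpoint" "ecr_api" {
+  vpc_id              = module.vpc.vpc_id
+  service_name        = format("com.amazonaws.%s.ecr.api", local.region)
+  vpc_endpoint_type   = "Interface"
+  subnet_ids          = module.vpc.private_subnets
+  security_group_ids  = [module.vpc.default_security_group_id]
+  private_dns_enabled = true
+}
+```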
diff --git a/eks-service-definition.yml b/eks-service-definition.yml
index a1797e9a..9f125936 100644
--- a/eks-service-definition.yml
+++ b/eks-service-definition.yml
@@ -30,6 +30,18 @@ provision:
     pattern: ^[a-z0-9][a-z0-9-]*[a-z0-9]$
     default: ${str.truncate(64, "${request.instance_id}")}
     details: A subdomain to use for the cluster instance. Default is the instance ID.
+  - field_name: egress_allowed
+    required: false
+    type: array
+    details: "A list of IP ranges to allow egress traffic to (ex. [\"x.x.x.x/x\", ...])"
+    overwrite: true
+    default: null
+  - field_name: ingress_allowed
+    required: false
+    type: array
+    details: "A list of IP ranges to allow ingress traffic from (ex. [\"x.x.x.x/x\", ...])"
+    overwrite: true
+    default: null
   computed_inputs:
   - name: instance_name
     required: true
@@ -51,6 +63,7 @@ provision:
     default: ${config("aws.default_region")}
   - name: write_kubeconfig
     type: boolean
+    required: false
     overwrite: true
     default: false
   outputs:
diff --git a/network_policy/2048_fixture.yml b/network_policy/2048_fixture.yml
new file mode 100644
index 00000000..ec9cb84c
--- /dev/null
+++ b/network_policy/2048_fixture.yml
@@ -0,0 +1,53 @@
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: deployment-2048
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: app-2048
+  replicas: 2
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: app-2048
+    spec:
+      containers:
+      - image: alexwhen/docker-2048
+        imagePullPolicy: Always
+        name: app-2048
+        ports:
+        - containerPort: 80
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: service-2048
+spec:
+  ports:
+    - port: 80
+      targetPort: 80
+      protocol: TCP
+  type: ClusterIP
+  selector:
+    app.kubernetes.io/name: app-2048
+---
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: ingress-2048
+  annotations:
+    kubernetes.io/ingress.class: nginx
+    nginx.ingress.kubernetes.io/rewrite-target: /
+spec:
+  rules:
+    - http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: service-2048
+                port:
+                  number: 80
\ No newline at end of file
diff --git a/network_policy/README.md b/network_policy/README.md
new file mode 100644
index 00000000..5696f586
--- /dev/null
+++ b/network_policy/README.md
@@ -0,0 +1,39 @@
+Context:
+
+This sub-directory provides a test case for creating and applying network
+policies. It:
+- Sets up a basic kind cluster (specifying the pod-network-cidr)
+- Installs an nginx ingress to make services available outside of the cluster
+- Installs Calico as the network plugin to manage network policies
+
+The test case uses the 2048 service as a baseline for observing network
+restrictions. Different network policies are then applied to allow or restrict
+network traffic.
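+
+For illustration only (not included in this directory), a policy that re-allows
+ingress to the 2048 pods from the ingress-nginx namespace, on top of a
+default-deny policy such as `test_deny.yml`, might look like the following. It
+assumes the `ingress-nginx` namespace carries the
+`app.kubernetes.io/name: ingress-nginx` label; verify with
+`kubectl get ns --show-labels` before relying on it.
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: allow-2048-from-ingress-nginx
+spec:
+  # Applies only to the 2048 pods created by 2048_fixture.yml
+  podSelector:
+    matchLabels:
+      app.kubernetes.io/name: app-2048
+  policyTypes:
+    - Ingress
+  ingress:
+    - from:
+        # Assumption: the ingress-nginx namespace is labeled this way by the
+        # kind deploy manifest used in startup.sh.
+        - namespaceSelector:
+            matchLabels:
+              app.kubernetes.io/name: ingress-nginx
+      ports:
+        - protocol: TCP
+          port: 80
+```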
+
+Instructions:
+
+To set up the environment:
+
+`./startup.sh`
+
+To tear down the environment:
+
+`./shutdown.sh`
+
+To create the 2048 game:
+
+`kubectl apply -f 2048_fixture.yml`
+
+To apply a network policy:
+
+`kubectl apply -f test_deny.yml`
+
+To test egress traffic:
+
+`kubectl exec -it pod/<2048-pod> -- sh -c "ping -c 4 8.8.8.8"`
+
+To test ingress traffic:
+
+Visit the 2048 game; the default URL is http://default-http-backend/
+Note: Make sure to add the host-to-IP mapping in /etc/hosts or similar:
+`127.0.0.1 default-http-backend`
diff --git a/network_policy/kind-config.yaml b/network_policy/kind-config.yaml
new file mode 100644
index 00000000..0add0d78
--- /dev/null
+++ b/network_policy/kind-config.yaml
@@ -0,0 +1,29 @@
+# four node (three workers) cluster config
+kind: Cluster
+apiVersion: kind.x-k8s.io/v1alpha4
+networking:
+  podSubnet: "10.0.0.0/8"
+nodes:
+- role: control-plane
+  image: kindest/node:v1.19.11@sha256:07db187ae84b4b7de440a73886f008cf903fcf5764ba8106a9fd5243d6f32729
+  # Mapping an ingress controller to host ports
+  # See docs at https://kind.sigs.k8s.io/docs/user/ingress/#create-cluster
+  kubeadmConfigPatches:
+  - |
+    kind: InitConfiguration
+    nodeRegistration:
+      kubeletExtraArgs:
+        node-labels: "ingress-ready=true"
+  extraPortMappings:
+  - containerPort: 80
+    hostPort: 80
+    protocol: TCP
+  - containerPort: 443
+    hostPort: 443
+    protocol: TCP
+- role: worker
+  image: kindest/node:v1.19.11@sha256:07db187ae84b4b7de440a73886f008cf903fcf5764ba8106a9fd5243d6f32729
+- role: worker
+  image: kindest/node:v1.19.11@sha256:07db187ae84b4b7de440a73886f008cf903fcf5764ba8106a9fd5243d6f32729
+- role: worker
+  image: kindest/node:v1.19.11@sha256:07db187ae84b4b7de440a73886f008cf903fcf5764ba8106a9fd5243d6f32729
diff --git a/network_policy/shutdown.sh b/network_policy/shutdown.sh
new file mode 100755
index 00000000..3f31392c
--- /dev/null
+++ b/network_policy/shutdown.sh
@@ -0,0 +1 @@
+kind delete cluster --name datagov-broker-test
diff --git a/network_policy/startup.sh b/network_policy/startup.sh
new file mode 100755
index 00000000..b3da5af9
--- /dev/null
+++ b/network_policy/startup.sh
@@ -0,0 +1,12 @@
+# Creating a temporary Kubernetes cluster to test against with KinD
+kind create cluster --config kind-config.yaml --name datagov-broker-test
+
+# Install a KinD-flavored ingress controller (to make the test services visible to the host).
+# See https://kind.sigs.k8s.io/docs/user/ingress/#ingress-nginx for details.
+kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.1/deploy/static/provider/kind/deploy.yaml +kubectl wait --namespace ingress-nginx \ + --for=condition=ready pod \ + --selector=app.kubernetes.io/component=controller \ + --timeout=270s + +kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml diff --git a/network_policy/test_deny.yml b/network_policy/test_deny.yml new file mode 100644 index 00000000..a6b2a6c3 --- /dev/null +++ b/network_policy/test_deny.yml @@ -0,0 +1,9 @@ +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: ingress-default-deny +spec: + podSelector: {} + policyTypes: + - Ingress + - Egress diff --git a/terraform/provision/Dockerfile b/terraform/provision/Dockerfile index 35fb0f2e..f165bc5e 100644 --- a/terraform/provision/Dockerfile +++ b/terraform/provision/Dockerfile @@ -5,6 +5,6 @@ FROM alpine/k8s:1.20.7 COPY --from=terraform /bin/terraform /bin/terraform RUN apk update -RUN apk add --update git +RUN apk add --update git bind-tools ENTRYPOINT ["/bin/sh"] diff --git a/terraform/provision/crds.tf b/terraform/provision/crds.tf index a1e34c13..c94fc410 100644 --- a/terraform/provision/crds.tf +++ b/terraform/provision/crds.tf @@ -7,11 +7,11 @@ # solr-operator do it so that it will register and unregister its CRDs as part # of the helm install process. resource "helm_release" "zookeeper-operator" { - name = "zookeeper" - chart = "zookeeper-operator" - repository = "https://charts.pravega.io/" - version = "0.2.12" - namespace = "kube-system" + name = "zookeeper" + chart = "zookeeper-operator" + repository = "https://charts.pravega.io/" + version = "0.2.12" + namespace = "kube-system" set { # See https://github.com/pravega/zookeeper-operator/issues/324#issuecomment-829267141 name = "hooks.delete" diff --git a/terraform/provision/eks.tf b/terraform/provision/eks.tf index 6e576eb2..3c1bc8b0 100644 --- a/terraform/provision/eks.tf +++ b/terraform/provision/eks.tf @@ -9,17 +9,29 @@ module "eks" { # module versions above 14.0.0 do not work with Terraform 0.12, so we're stuck # on that version until the cloud-service-broker can use newer versions of # Terraform. 
- version = "~>14.0" - cluster_name = local.cluster_name - cluster_version = local.cluster_version - vpc_id = module.vpc.aws_vpc_id - subnets = module.vpc.aws_subnet_private_prod_ids - cluster_enabled_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"] - cluster_log_retention_in_days = 180 - manage_aws_auth = false - write_kubeconfig = var.write_kubeconfig - tags = merge(var.labels, { "domain" = local.domain }) - iam_path = "/${replace(local.cluster_name, "-", "")}/" + version = "~>14.0" + cluster_name = local.cluster_name + cluster_version = local.cluster_version + vpc_id = module.vpc.vpc_id + subnets = module.vpc.private_subnets + + # PRIVATE: Have EKS manage SG rules to allow private subnets access to endpoints + cluster_create_endpoint_private_access_sg_rule = true + # PRIVATE: Have EKS manage SG rules to allow worker nodes access to the control plane + worker_create_cluster_primary_security_group_rules = true + # PRIVATE: Enable the API Endpoint for private subnets + cluster_endpoint_private_access = true + # PRIVATE: Specify private subnets to allow access to API Endpoint + cluster_endpoint_private_access_cidrs = module.vpc.private_subnets_cidr_blocks + + cluster_enabled_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"] + cluster_log_retention_in_days = 180 + manage_aws_auth = false + write_kubeconfig = var.write_kubeconfig + tags = merge(var.labels, { "domain" = local.domain }) + + # Setting this prevents managed nodes from joining the cluster + # iam_path = "/${replace(local.cluster_name, "-", "")}/" create_fargate_pod_execution_role = false # fargate_pod_execution_role_name = aws_iam_role.iam_role_fargate.name # fargate_profiles = { @@ -32,6 +44,18 @@ module "eks" { # namespace = "kube-system" # } # } + + node_groups = { + system_node_group = { + name = "test8" + + min_capacity = 1 + + instance_types = ["m5.large"] + capacity_type = "ON_DEMAND" + } + } + } resource "aws_iam_role" "iam_role_fargate" { @@ -42,13 +66,34 @@ resource "aws_iam_role" "iam_role_fargate" { Action = "sts:AssumeRole" Effect = "Allow" Principal = { - Service = "eks-fargate-pods.amazonaws.com" + Service = [ + "ec2.amazonaws.com", + "eks-fargate-pods.amazonaws.com" + ] } }] Version = "2012-10-17" }) } +# Policy to enable Managed Node Management +resource "aws_iam_role_policy_attachment" "AmazonEKSWorkerNodePolicy" { + policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy" + role = aws_iam_role.iam_role_fargate.name +} + +# Policy to enable CNI EKS ADDON +resource "aws_iam_role_policy_attachment" "AmazonEKS_CNI_Policy" { + policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy" + role = aws_iam_role.iam_role_fargate.name +} + +# Policy to allow containers to be deployed to Managed Nodes +resource "aws_iam_role_policy_attachment" "AmazonEC2ContainerRegistryReadOnly" { + policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly" + role = aws_iam_role.iam_role_fargate.name +} + resource "aws_iam_role_policy_attachment" "AmazonEKSFargatePodExecutionRolePolicy" { policy_arn = "arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy" role = aws_iam_role.iam_role_fargate.name @@ -63,7 +108,7 @@ resource "aws_eks_fargate_profile" "default_namespaces" { cluster_name = local.cluster_name fargate_profile_name = "default-namespaces-${local.cluster_name}" pod_execution_role_arn = aws_iam_role.iam_role_fargate.arn - subnet_ids = module.vpc.aws_subnet_private_prod_ids + subnet_ids = module.vpc.private_subnets tags = var.labels 
timeouts { # For reasons unknown, Fargate profiles can take upward of 20 minutes to @@ -97,6 +142,11 @@ resource "null_resource" "cluster-functional" { # functional until coredns is operating (for example, helm deployments may # timeout). When another resource depends_on this one, it won't apply until # the cluster is fully functional. + # + # Temporary workaround, use public coredns image + # kubectl --kubeconfig <(echo $KUBECONFIG | base64 -d) \ + # set image --namespace kube-system deployment.apps/coredns \ + # coredns=coredns/coredns:1.8.0 command = <<-EOF kubectl --kubeconfig <(echo $KUBECONFIG | base64 -d) \ patch deployment coredns \ @@ -126,3 +176,22 @@ data "aws_eks_cluster_auth" "main" { name = module.eks.cluster_id } +# Allow Nodes to pull Images with Fargate Role +# https://docs.aws.amazon.com/AmazonECR/latest/userguide/repository-policy-examples.html +resource "aws_iam_role_policy" "cluster-images" { + name = "allow-image-pull" + role = aws_iam_role.iam_role_fargate.name + policy = jsonencode({ + Version = "2012-10-17" + Statement = [ + { + Action = [ + "ecr:BatchGetImage", + "ecr:GetDownloadUrlForLayer", + ] + Effect = "Allow" + Resource = "*" + } + ] + }) +} diff --git a/terraform/provision/ingress.tf b/terraform/provision/ingress.tf index 5ac0ebf9..ad9ffcbb 100644 --- a/terraform/provision/ingress.tf +++ b/terraform/provision/ingress.tf @@ -42,7 +42,7 @@ resource "helm_release" "ingress_nginx" { namespace = "kube-system" cleanup_on_fail = "true" atomic = "true" - timeout = 600 + timeout = 1200 dynamic "set" { for_each = { @@ -62,7 +62,7 @@ resource "helm_release" "ingress_nginx" { "rbac.create" = true, "clusterName" = module.eks.cluster_id, "region" = local.region, - "vpcId" = module.vpc.aws_vpc_id, + "vpcId" = module.vpc.vpc_id, "aws_iam_role_arn" = module.aws_load_balancer_controller.aws_iam_role_arn } content { diff --git a/terraform/provision/providers.tf b/terraform/provision/providers.tf index c0278cef..b082e891 100644 --- a/terraform/provision/providers.tf +++ b/terraform/provision/providers.tf @@ -5,6 +5,10 @@ provider "aws" { region = local.region } +provider "dns" { + version = "3.2.1" +} + # A separate provider for creating KMS keys in the us-east-1 region, which is required for DNSSEC. 
# See https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-configuring-dnssec-cmk-requirements.html provider "aws" { @@ -21,8 +25,8 @@ provider "kubernetes" { exec { api_version = "client.authentication.k8s.io/v1alpha1" - args = ["token", "--cluster-id", data.aws_eks_cluster.main.id] - command = "aws-iam-authenticator" + args = ["eks", "get-token", "--cluster-name", local.cluster_name] + command = "aws" } version = "~>2.5" } @@ -34,8 +38,8 @@ provider "helm" { token = data.aws_eks_cluster_auth.main.token exec { api_version = "client.authentication.k8s.io/v1alpha1" - args = ["token", "--cluster-id", data.aws_eks_cluster.main.id] - command = "aws-iam-authenticator" + args = ["eks", "get-token", "--cluster-name", local.cluster_name] + command = "aws" } } diff --git a/terraform/provision/terraform.tfvars-template b/terraform/provision/terraform.tfvars-template index 3b23ecfa..9ea7b9ea 100644 --- a/terraform/provision/terraform.tfvars-template +++ b/terraform/provision/terraform.tfvars-template @@ -3,3 +3,26 @@ zone="parent domain" # pre-existing zone, created outside Terraform (eg s instance_name="my-instance" # a unique name to avoid collisions in AWS subdomain="my-subdomain" # a unique subdomain name to avoid collisions in AWS write_kubeconfig = true # generate a kubeconfig (only here for dev/test iteration) + +# egress_allowed = ["0.0.0.0/0"] # Open Egress Traffic to everywhere +# ingress_allowed = ["0.0.0.0/0"] # Open Ingress Traffic from everywhere + +# The following configuration is an example of restricting traffic to only that +# related to the EKS cluster. Most of the IP ranges need to be dynamically found +# and added after the EKS cluster is operational. + +# Currently, the traffic is controlled by the VPC which is stateless and won't +# allow 'return traffic' from an IP that was allowed in Egress but not in Ingress +# (or vice versa). +egress_allowed = [ + "10.20.0.0/16", + "3.5.76.0/22", "3.5.80.0/21", "52.218.128.0/17", "52.92.128.0/17", // com.amazonaws.us-west-2.s3 + "52.26.201.61/32", "44.241.219.74/32", // sub-nickumia5.ssb-dev.data.gov + "52.12.105.32/32", "44.241.94.60/32", // CEC18BF55F08D9811964E7908A55D84D.gr7.us-west-2.eks.amazonaws.com +] +ingress_allowed = [ + "10.20.0.0/16", + "3.5.76.0/22", "3.5.80.0/21", "52.218.128.0/17", "52.92.128.0/17", // com.amazonaws.us-west-2.s3 + "52.26.201.61/32", "44.241.219.74/32", //sub-nickumia5.ssb-dev.data.gov + "52.12.105.32/32", "44.241.94.60/32", // CEC18BF55F08D9811964E7908A55D84D.gr7.us-west-2.eks.amazonaws.com +] diff --git a/terraform/provision/variables.tf b/terraform/provision/variables.tf index 4590400f..9c63d1a8 100644 --- a/terraform/provision/variables.tf +++ b/terraform/provision/variables.tf @@ -22,7 +22,19 @@ variable "region" { type = string } +variable "ingress_allowed" { + type = list + description = "A list of IP Range [\"x.x.x.x/x\", ...] to allow ingress traffic" + default = null +} + +variable "egress_allowed" { + type = list + description = "A list of IP Range [\"x.x.x.x/x\", ...] 
to allow egress traffic" + default = null +} + variable "write_kubeconfig" { type = bool default = false -} \ No newline at end of file +} diff --git a/terraform/provision/vpc.tf b/terraform/provision/vpc.tf index 844c32dd..ad2a2af9 100644 --- a/terraform/provision/vpc.tf +++ b/terraform/provision/vpc.tf @@ -1,29 +1,155 @@ locals { - region = var.region + region = var.region + domain_ip_file = "${path.module}/domain_ip" +} + +data "aws_availability_zones" "available" { } module "vpc" { - source = "github.com/FairwindsOps/terraform-vpc.git?ref=v5.0.1" + # Version 3.8+ require Terraform 0.13+ + # Version 3.7+ require AWS 3.38+ + source = "terraform-aws-modules/vpc/aws" + version = "3.6.0" - aws_region = local.region - az_count = 2 - aws_azs = "${local.region}b, ${local.region}c" - single_nat_gateway = 1 - multi_az_nat_gateway = 0 + name = "eks-vpc" + # This cidr range was used by the old VPC module, it was kept for consistency + # The accompanying subnets were choosen using tutorial as example + # https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/3.7.0#usage + cidr = "10.20.0.0/16" + private_subnets = ["10.20.1.0/24", "10.20.2.0/24", "10.20.3.0/24"] + public_subnets = ["10.20.101.0/24", "10.20.102.0/24", "10.20.103.0/24"] - enable_s3_vpc_endpoint = "true" + azs = data.aws_availability_zones.available.names + enable_nat_gateway = true + single_nat_gateway = true + enable_vpn_gateway = true + + enable_dns_hostnames = true + enable_dns_support = true # Tag subnets for use by AWS' load-balancers and the ALB ingress controllers # See https://aws.amazon.com/premiumsupport/knowledge-center/eks-vpc-subnet-discovery/ - global_tags = merge(var.labels, { + tags = merge(var.labels, { "kubernetes.io/cluster/${local.cluster_name}" = "shared", "domain" = local.domain }) public_subnet_tags = { - "kubernetes.io/role/elb" = 1 + "kubernetes.io/cluster/${local.cluster_name}" = "shared" + "kubernetes.io/role/elb" = 1 + } + private_subnet_tags = { + "kubernetes.io/cluster/${local.cluster_name}" = "shared" + "kubernetes.io/role/internal-elb" = 1 } - private_prod_subnet_tags = { - "kubernetes.io/role/internal-elb" = 1 + +} + +# CNI addon for VPC +resource "aws_eks_addon" "cni" { + cluster_name = module.eks.cluster_id + addon_name = "vpc-cni" +} + +data "tls_certificate" "eks-cni" { + url = data.aws_eks_cluster.main.identity[0].oidc[0].issuer +} + +data "aws_iam_policy_document" "cni_assume_role_policy" { + statement { + actions = ["sts:AssumeRoleWithWebIdentity"] + effect = "Allow" + + condition { + test = "StringEquals" + variable = "${replace(aws_iam_openid_connect_provider.cluster.url, "https://", "")}:sub" + values = ["system:serviceaccount:kube-system:aws-node"] + } + + principals { + identifiers = [aws_iam_openid_connect_provider.cluster.arn] + type = "Federated" + } } } +# THIS CAN PROBABLY BE DELETED SOON +# Create VPC Endpoints for private subnets to connect to the following services, +# - S3 (for pulling images) +# - EC2 +# - ECR API +# - ECR DKR +# - LOGS (for CloudWatch Logs) +# - STS (for IAM role access) +# - ELASTICLOADBALANCING (for Application Load Balancers) +# resource "aws_vpc_endpoint" "s3" { +# vpc_id = module.vpc.vpc_id +# service_name = format("com.amazonaws.%s.s3", local.region) +# } +# +# resource "aws_vpc_endpoint" "ec2" { +# vpc_id = module.vpc.vpc_id +# service_name = format("com.amazonaws.%s.ec2", local.region) +# vpc_endpoint_type = "Interface" +# security_group_ids = [ +# module.vpc.default_security_group_id, +# ] +# # subnet_ids = 
flatten([module.vpc.private_subnets, module.vpc.public_subnets]) +# subnet_ids = module.vpc.private_subnets +# private_dns_enabled = true +# } +# +# resource "aws_vpc_endpoint" "api" { +# vpc_id = module.vpc.vpc_id +# service_name = format("com.amazonaws.%s.ecr.api", local.region) +# vpc_endpoint_type = "Interface" +# security_group_ids = [ +# module.vpc.default_security_group_id, +# ] +# subnet_ids = module.vpc.private_subnets +# private_dns_enabled = true +# } +# +# resource "aws_vpc_endpoint" "dkr" { +# vpc_id = module.vpc.vpc_id +# service_name = format("com.amazonaws.%s.ecr.dkr", local.region) +# vpc_endpoint_type = "Interface" +# security_group_ids = [ +# module.vpc.default_security_group_id, +# ] +# subnet_ids = module.vpc.private_subnets +# private_dns_enabled = true +# } +# +# resource "aws_vpc_endpoint" "logs" { +# vpc_id = module.vpc.vpc_id +# service_name = format("com.amazonaws.%s.logs", local.region) +# vpc_endpoint_type = "Interface" +# security_group_ids = [ +# module.vpc.default_security_group_id, +# ] +# subnet_ids = module.vpc.private_subnets +# private_dns_enabled = true +# } +# +# resource "aws_vpc_endpoint" "elb" { +# vpc_id = module.vpc.vpc_id +# service_name = format("com.amazonaws.%s.elasticloadbalancing", local.region) +# vpc_endpoint_type = "Interface" +# security_group_ids = [ +# module.vpc.default_security_group_id, +# ] +# subnet_ids = module.vpc.private_subnets +# private_dns_enabled = true +# } +# +# resource "aws_vpc_endpoint" "iam" { +# vpc_id = module.vpc.vpc_id +# service_name = format("com.amazonaws.%s.sts", local.region) +# vpc_endpoint_type = "Interface" +# security_group_ids = [ +# module.vpc.default_security_group_id, +# ] +# subnet_ids = module.vpc.private_subnets +# private_dns_enabled = true +# }