Opinionated Terraform module for creating a Highly Available Kubernetes cluster running on CoreOS (any channel) in an AWS Virtual Private Cloud (VPC). With prerequisites installed, make all will simply spin up a default cluster; and, since it is based on Terraform, customization is much easier than CloudFormation.
The default configuration includes Kubernetes add-ons: DNS, Dashboard and UI.
# prereqs
$ brew update && brew install awscli cfssl jq kubernetes-cli terraform
# build artifacts and deploy cluster
$ make all
# nodes
$ kubectl get nodes
# addons
$ kubectl get pods --namespace=kube-system
# verify dns - run after addons have fully loaded
$ kubectl exec busybox -- nslookup kubernetes
# open dashboard
$ make dashboard
# obliterate the cluster and all artifacts
$ make clean
- TLS certificate generation
- EC2 Key Pair creation
- AWS VPC Public and Private subnets
- IAM protected S3 bucket for asset (TLS and manifests) distribution
- Bastion Host
- Multi-AZ Auto-Scaling Worker Nodes
- NAT Gateway
- etcd DNS Discovery Bootstrap
- kubelet runs under rkt (using CoreOS recommended Kubelet Wrapper Script)
- Highly Available ApiServer Configuration
- Service accounts enabled
- SkyDNS utilizing cluster's etcd
- CoreOS AMI sourcing
- Terraform Pattern Modules
Quick install prerequisites on Mac OS X with Homebrew:
$ brew update && brew install awscli cfssl jq kubernetes-cli terraform
Tested with prerequisite versions:
$ aws --version
aws-cli/1.11.15 Python/2.7.10 Darwin/16.1.0 botocore/1.4.72
$ cfssl version
Version: 1.2.0
Revision: dev
Runtime: go1.7.1
$ jq --version
jq-1.5
$ kubectl version --client
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.5+5a0a696", GitCommit:"5a0a696437ad35c133c0c8493f7e9d22b0f9b81b", GitTreeState:"not a git tree", BuildDate:"2016-10-29T08:29:44Z", GoVersion:"go1.7.3", Compiler:"gc", Platform:"darwin/amd64"}
$ terraform --version
Terraform v0.8.2
make all will create:
- AWS Key Pair (PEM file)
- client and server TLS assets
- s3 bucket for TLS assets (secured by IAM roles for master and worker nodes)
- AWS VPC with private and public subnets
- Route 53 internal zone for VPC
- Etcd cluster bootstrapped from Route 53
- High Availability Kubernetes configuration (masters running on etcd nodes)
- Autoscaling worker node group across subnets in selected region
- kube-system namespace and addons: DNS, UI, Dashboard
$ make all
To open dashboard:
$ make dashboard
To display instance information:
$ make instances
To display status:
$ make status
To destroy, remove and generally undo everything:
$ make clean
make all and make clean should be idempotent - should an error occur, simply try running the command again and things should recover from that point.
Tack works in three phases:
- Pre-Terraform
- Terraform
- Post-Terraform
The purpose of this phase is to prep the environment for Terraform execution. Some tasks are hard or messy to do in Terraform - a little prep work can go a long way here. Determining the CoreOS AMI for a given region, channel and VM type, for instance, is easy enough to do with a simple shell script.
Terraform does the heavy lifting of resource creation and sequencing. Tack uses local modules to partition the work in a logical way. Although it is of course possible to do all of the Terraform work in a single .tf file or collection of .tf files, it becomes unwieldy quickly and impossible to debug. Breaking the work into local modules makes the flow much easier to follow and provides the basis for composing variable solutions down the track - for example converting the worker Auto Scaling Group to use spot instances.
Once the infrastructure has been configured and instantiated it will take some time for it to settle. Waiting for the 'master' ELB to become healthy is an example of this.
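A polling loop with a timeout is one way to handle this kind of settling. The sketch below is illustrative only and is not taken from tack's own scripts: the function name wait_until is hypothetical, and using kubectl get nodes as the health probe assumes that a successful API call implies the master ELB is serving traffic.

```shell
# wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL seconds until it succeeds, or give up
# once TIMEOUT seconds of waiting have accumulated.
wait_until() {
  timeout=$1 interval=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Example (assumed probe): wait up to 10 minutes, checking every 10 seconds.
#   wait_until 600 10 kubectl get nodes
```

The elapsed counter only tracks time spent sleeping, so a slow probe command makes the real wait longer than TIMEOUT.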
Like many great tools, tack has started out as a collection of scripts, makefiles and other tools. As tack matures and patterns crystallize it will evolve to a Terraform plugin and perhaps a Go-based cli tool for 'init-ing' new cluster configurations. The tooling will compose Terraform modules into a solution based on user preferences - think npm init or better yet yeoman.
$ curl --cacert /etc/kubernetes/ssl/ca.pem https://etcd1.k8s:2379/version
$ openssl x509 -text -noout -in /etc/kubernetes/ssl/ca.pem
$ openssl x509 -text -noout -in /etc/kubernetes/ssl/k8s-etcd.pem
To access Elasticsearch and Kibana, first start kubectl proxy.
$ kubectl proxy
Starting to serve on localhost:8001
- http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging
- http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/kibana-logging
If you have an existing VPC you'd like to deploy a cluster into, there is an option for this with tack.
- You will need to allocate 3 static IPs for the etcd servers - Choose 3 unused IPs that fall within the IP range of the first subnet specified in subnet-ids-private under vpc-existing.tfvars
- Your VPC has to have private and public subnets (for now)
- You will need to know the following information:
- VPC CIDR Range (e.g. 192.168.0.0/16)
- VPC Id (e.g. vpc-abc123)
- VPC Internet Gateway Id (e.g. igw-123bbd)
- VPC Public Subnet Ids (e.g. subnet-xyz123,subnet-zyx123)
- VPC Private Subnet Ids (e.g. subnet-lmn123,subnet-opq123)
- Edit vpc-existing.tfvars
- Uncomment the blocks with variables and fill in the missing information
- Edit modules_override.tf - This uses the overrides feature from Terraform
- Uncomment the vpc module, this will override the reference to the regular VPC module and instead use the stub vpc-existing module which just pulls in the variables from vpc-existing.tfvars
- Edit the Makefile as necessary for CIDR_PODS, CIDR_SERVICE_CLUSTER, etc to match what you need (e.g. avoid collisions with existing IP ranges in your VPC or extended infrastructure)
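The first requirement above (etcd IPs that fall within the range of the first private subnet) can be sanity-checked before editing vpc-existing.tfvars. This is a minimal sketch in plain shell arithmetic; the function names ip_to_int and ip_in_cidr and the subnet 192.168.1.0/24 are illustrative assumptions, not part of tack.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) when IP $1 lies inside CIDR block $2.
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")     # network part, before the slash
  bits=${2#*/}                   # prefix length, after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Example with a hypothetical private subnet and candidate etcd address.
ip_in_cidr 192.168.1.10 192.168.1.0/24 && echo "192.168.1.10 is inside 192.168.1.0/24"
```

This only checks range membership. It does not tell you whether an address is already in use, and AWS reserves the first four addresses and the last address of every subnet, so those should be avoided as well.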
In order to test existing VPC support, we need to generate a VPC and then try the overrides with it. After that we can clean it all up. These instructions are meant for someone wanting to ensure that the tack existing VPC code works properly.
- Run make all to generate a VPC with Terraform
- Edit terraform.tfstate
- Search for the VPC block and cut it out and save it somewhere. Look for "path": ["root","vpc"]
- Run make clean to remove everything but the VPC and associated networking (we preserved it in the previous step)
- Edit as per instructions above
- Run make all to test out using an existing VPC
- Cleaning up:
- Re-insert the VPC block into terraform.tfstate
- Run make clean to clean up everything
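The cut, save and re-insert steps on terraform.tfstate can also be scripted with jq (already a prerequisite) instead of edited by hand. The sketch below is an assumption-laden illustration: it presumes the state file keeps its modules in a top-level "modules" array whose entries carry the "path" key mentioned above, and the function and file names are hypothetical. Back up terraform.tfstate before trying anything like this.

```shell
# Print the VPC module block from state file $1 so it can be saved.
extract_vpc_block() {
  jq '.modules[] | select(.path == ["root","vpc"])' "$1"
}

# Print state file $1 with the VPC module block removed.
remove_vpc_block() {
  jq '.modules |= map(select(.path != ["root","vpc"]))' "$1"
}

# Print state file $1 with the saved VPC block from file $2 appended
# back onto the "modules" array.
restore_vpc_block() {
  jq --slurpfile vpc "$2" '.modules += $vpc' "$1"
}

# Example flow (file names are placeholders):
#   extract_vpc_block terraform.tfstate > vpc-block.json
#   remove_vpc_block  terraform.tfstate > terraform.tfstate.novpc
#   restore_vpc_block terraform.tfstate vpc-block.json > terraform.tfstate.restored
```

Each function writes to standard output rather than editing in place, so the original state file stays untouched until you decide to replace it.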
- You probably want to tag your subnets for internal/external load balancers
- Code examples to create CoreOS cluster on AWS with Terraform by xuwang
- kaws: tool for deploying multiple Kubernetes clusters
- Kubernetes on CoreOS
- Terraform Infrastructure Design Patterns by Bart Spaans
- The infrastructure that runs Brandform
- bakins/kubernetes-coreos-terraform
- bobtfish/terraform-aws-coreos-kubernates-cluster
- chiefy/tf-aws-kubernetes
- cihangir/terraform-aws-kubernetes
- ericandrewlewis/kubernetes-via-terraform
- funkymonkeymonk/terraform-demo
- kelseyhightower/kubestack
- samsung-cnct/kraken
- wearemakery/kubestack-aws
- xuwang/aws-terraform
- CFSSL: CloudFlare's PKI and TLS toolkit
- CoreOS - Mounting Storage
- Deploying CoreOS cluster with etcd secured by TLS/SSL
- etcd dns discovery bootstrap
- Generate EC2 Key Pair
- Generate self-signed certificates
- Makefile help target
- Peeking under the hood of Kubernetes on AWS
- Self documenting Makefile
- Setting up etcd to run in production
- ssl artifact generation