Project 2

bobbypr edited this page May 3, 2020 · 1 revision

Welcome to the ScrumLords wiki!

Firing up the weather service.

First, make sure you have cloned the project, are in the ScrumLords directory, and are inside an activated conda environment with the dependencies installed. Now, we shall begin.

1. Create Kubernetes Cluster

  • Create a Google Cloud Platform account if you don't already have one and sign up for the trial that gives you $300 to start with. Deploying and testing this project shouldn't consume more than $100 worth of credits. Also, create a new project once you're in the console and note down its project-id. You'll need it throughout the project.

  • If you haven't already installed the Google Cloud SDK (gcloud), you can find the instructions here. Once installed, run gcloud init, which'll direct you to your browser, where you'll need to log in to your Google account. Once authenticated, open your previous terminal screen back up and select the project-id that you're going to work with. The subsequent steps are pretty straightforward.

  • Open a new terminal tab in the ScrumLords directory and run the following command:

gcloud beta container --project "<YOUR_PROJECT_ID>" clusters create "weather-forecast-cluster" --zone "us-central1-c" --no-enable-basic-auth --cluster-version "1.14.10-gke.27" --machine-type "n1-standard-2" --image-type "COS" --disk-type "pd-standard" --disk-size "100" --metadata disable-legacy-endpoints=true --scopes "https://www.googleapis.com/auth/cloud-platform" --num-nodes "3" --enable-stackdriver-kubernetes --enable-ip-alias --network "projects/<YOUR_PROJECT_ID>/global/networks/default" --subnetwork "projects/<YOUR_PROJECT_ID>/regions/us-central1/subnetworks/default" --default-max-pods-per-node "110" --enable-autoscaling --min-nodes "3" --max-nodes "5" --addons HorizontalPodAutoscaling,HttpLoadBalancing --enable-autoupgrade --enable-autorepair

  • This should take a while. When done, run this command to let your kubectl know which cluster to point to:

gcloud container clusters get-credentials weather-forecast-cluster --zone us-central1-c --project <YOUR_PROJECT_ID>

2. Setup Helm (v3 recommended)

Run brew install helm on macOS. For other operating systems, look here.

helm repo add stable https://kubernetes-charts.storage.googleapis.com/

3. Setup Redis

Here, we'll set up a Redis cluster (1 Master + 2 Slaves) to complement our PubSub message broker service.

  • Add bitnami repo to your helm:

helm repo add bitnami https://charts.bitnami.com/bitnami

  • Install the redis chart:

helm install scrumlords-weather bitnami/redis

  • Get Redis Password:

export REDIS_PASSWORD=$(kubectl get secret --namespace default scrumlords-weather-redis -o jsonpath="{.data.redis-password}" | base64 --decode)
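The pipe through base64 --decode is needed because Kubernetes stores Secret values base64-encoded. A quick local illustration of the decode step (the password here is made up):

```shell
# Illustration only: Kubernetes Secrets hold values base64-encoded,
# so retrieving one requires decoding. "s3cr3t-pass" is a made-up value.
ENCODED=$(printf 's3cr3t-pass' | base64)
echo "encoded: $ENCODED"
REDIS_PASSWORD=$(printf '%s' "$ENCODED" | base64 --decode)
echo "decoded: $REDIS_PASSWORD"
```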

  • In each service's deployment folder, there's a production_jinja.json file. Every service that defines a REDIS_HOST environment variable needs that value updated with the password obtained in the previous step. The host name is structured like this:

redis://:<REDIS_PASSWORD>@scrumlords-weather-redis-master.default.svc.cluster.local:6379

  • Replace <REDIS_PASSWORD> with your password.
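With REDIS_PASSWORD exported, the full connection string can also be composed in the shell rather than pasted by hand. A minimal sketch (the password value below is a placeholder, not the real secret):

```shell
# Sketch: build the Redis connection URL from the exported password.
# "changeme" is a placeholder; use the value exported in the previous step.
REDIS_PASSWORD="changeme"
REDIS_HOST="redis://:${REDIS_PASSWORD}@scrumlords-weather-redis-master.default.svc.cluster.local:6379"
echo "$REDIS_HOST"
```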

4. Setup NGINX configuration

Since Google's Load Balancer doesn't natively support HTTPS load balancing the way we need, we'll use NGINX. To set up the NGINX configuration, run:

kubectl apply -f services/manager/deployment/nginx-setup.yaml

5. Deploy NGINX Load Balancer

kubectl apply -f services/manager/deployment/nginx-ingress-load-balancer-service.yaml

6. Deploy all services

./deploy_init.sh

7. Setup Cert-Manager

We use cert-manager to automatically provision and manage TLS certificates in our Kubernetes cluster. You can read more about it here.

kubectl create namespace cert-manager

kubectl apply --validate=false -f services/manager/deployment/cert-manager.yaml

8. Get Load Balancer Ingress external IP

We need this external IP to create 'A' records in the next step.

kubectl get svc --namespace=ingress-nginx

9. Create 'A' records on your Domain Management Service.

Since I own the domain bobbyrathore.com, I can create 'A' records like manager.bobbyrathore.com and session.bobbyrathore.com, each pointing to the external IP of the load balancer. My fanout ingress resource will take care of the routes later on: manager.bobbyrathore.com goes to my manager service, session.bobbyrathore.com to my session_manager service, and so on. If you do not own a domain name, you can buy one for around $10 from Google Domains, or you can send me your desired subdomain and the external IP and I'll add them to my DNS service, e.g. managertemp.bobbyrathore.com for your manager service.

10. Create Let's Encrypt Staging and Production Issuers

cert-manager uses Let's Encrypt to generate free TLS certificates. Read more about it here.

kubectl create -f services/manager/deployment/staging-issuer.yaml

kubectl create -f services/manager/deployment/production-issuer.yaml

11. Create Ingress Resource

kubectl apply -f services/manager/deployment/nginx-ingress-resource.yaml

12. Check for certificate completion

Wait for cert-manager to issue valid TLS certificates from the Let's Encrypt ACME server.

kubectl describe certificate

If the message field shows "Certificate is up to date and has not expired", all your services are fully deployed and exposed on the endpoints that we defined.

13. Log on to ScrumLordsWeather!


CI/CD

We used Google Cloud Build for CI/CD. By creating build triggers for a push to each microservice branch, we ensured that only that service gets deployed with new changes. Cloud Build looks for a cloudbuild.yaml (or JSON) file that gives it a sequence of steps to follow. The steps we defined are as follows:

  1. Run test cases
  2. Pull base image from GCR
  3. Build new image cached from base
  4. Tag image with commit
  5. Tag image with branch-name
  6. Push commit tagged image to GCR
  7. Apply new deployment
  8. Push branch-name tagged image to GCR
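The steps above map onto a per-service cloudbuild.yaml roughly like the following sketch. This is not the project's actual file: the image name (manager), the test command, and the deployment update step are assumptions for illustration; $PROJECT_ID, $COMMIT_SHA, and $BRANCH_NAME are standard Cloud Build substitutions.

```yaml
# Hypothetical sketch of a per-service cloudbuild.yaml following the steps above.
steps:
  # 1. Run test cases (test command is an assumption)
  - name: "python:3.8"
    entrypoint: "bash"
    args: ["-c", "pip install -r requirements.txt && pytest"]
  # 2. Pull base image from GCR (tolerate a missing image on the first build)
  - name: "gcr.io/cloud-builders/docker"
    entrypoint: "bash"
    args: ["-c", "docker pull gcr.io/$PROJECT_ID/manager:latest || exit 0"]
  # 3-5. Build new image cached from base, tagged with commit and branch name
  - name: "gcr.io/cloud-builders/docker"
    args:
      - "build"
      - "--cache-from"
      - "gcr.io/$PROJECT_ID/manager:latest"
      - "-t"
      - "gcr.io/$PROJECT_ID/manager:$COMMIT_SHA"
      - "-t"
      - "gcr.io/$PROJECT_ID/manager:$BRANCH_NAME"
      - "."
  # 6. Push commit-tagged image to GCR
  - name: "gcr.io/cloud-builders/docker"
    args: ["push", "gcr.io/$PROJECT_ID/manager:$COMMIT_SHA"]
  # 7. Apply new deployment (updating the image tag is one common pattern;
  #    the project's templated manifests may do this differently)
  - name: "gcr.io/cloud-builders/kubectl"
    args:
      - "set"
      - "image"
      - "deployment/manager"
      - "manager=gcr.io/$PROJECT_ID/manager:$COMMIT_SHA"
    env:
      - "CLOUDSDK_COMPUTE_ZONE=us-central1-c"
      - "CLOUDSDK_CONTAINER_CLUSTER=weather-forecast-cluster"
# 8. Images listed here are pushed to GCR after a successful build
images:
  - "gcr.io/$PROJECT_ID/manager:$BRANCH_NAME"
```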

Sample build run


Load Testing Results

1 Replica (3000 users | 200 sec ramp-up time)

Graph

Summary report

3 Replicas (3000 users | 100 sec ramp-up time)

Graph

Summary report

Aggregate report

5 Replicas (3000 users | 100 sec ramp-up time)

Graph

Summary report

Aggregate Report

5 Replicas - Breaking Point (3000 users | 5 sec ramp-up time)

Graph

Summary report

Aggregate Report

System Status Report


Google Cloud Console Snapshots

Kubernetes Cluster

Services & Ingresses

Deployments

Firewall Rules

Persistent Volume Claims

Google Container Registry

PubSub Topics

PubSub Subscriptions

Google Cloud Build