Data-Platform based on ClickHouse demo

This project demonstrates how to build a data platform that is scalable by design. When the volume of data increases, the number of nodes and partitions/shards can easily be increased.

Design

Prerequisites:

- Access to a Kubernetes cluster, with the Nginx ingress controller deployed
- Docker
- kubectl
- Helm
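
A quick way to confirm the command-line tools are available before starting is sketched below. The helper name `check_tools` is hypothetical, and `kind` is included only because the installation steps use it to create a local cluster; it is not needed if you bring your own Kubernetes cluster.

```shell
# Report every required CLI that is not on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_tools docker kubectl helm kind ||
  echo "Install the missing tools before continuing." >&2
```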

Installation

Create Kind (K8s In Docker) cluster

# Create cluster with 4 worker nodes
kind create cluster --name kind-dataplatform --config=kind.yaml

# Install nginx ingress controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml
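
The contents of the repository's kind.yaml are not reproduced here. As a rough sketch, a kind config with one control-plane node and 4 workers could look like the file written below. The file name `kind-sketch.yaml` is an assumption chosen so the real kind.yaml is not overwritten, and the `ingress-ready` label and port mappings follow kind's documented ingress setup rather than anything confirmed by this project.

```shell
# Write an illustrative kind config next to the real kind.yaml.
cat > kind-sketch.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  # The control-plane node is labeled so the ingress controller can be scheduled on it,
  # and ports 80/443 are mapped to the host.
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "ingress-ready=true"
    extraPortMappings:
      - containerPort: 80
        hostPort: 80
        protocol: TCP
      - containerPort: 443
        hostPort: 443
        protocol: TCP
  # Four worker nodes, matching the comment in the create command.
  - role: worker
  - role: worker
  - role: worker
  - role: worker
EOF
```

Adding or removing `- role: worker` entries changes how many nodes the local cluster has.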

Use cluster:

kubectl config use-context kind-kind-dataplatform

Install operators

A Kubernetes Operator deploys and manages workloads based on the Custom Resource Definitions (CRDs) that describe them. Updates to those resources are also handled by the operator.

Operators are always installed cluster-wide.
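
To make the custom resource idea concrete, here is a minimal sketch of a resource that the Altinity ClickHouse operator understands. It is not taken from this project's chart: the file name `chi-sketch.yaml`, the resource name `chi-sketch`, and the cluster name `demo` are all hypothetical. The block only writes the file; it does not apply it.

```shell
# Write an illustrative ClickHouseInstallation manifest (not applied to the cluster).
cat > chi-sketch.yaml <<'EOF'
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: chi-sketch        # hypothetical name
spec:
  configuration:
    clusters:
      - name: demo        # hypothetical cluster name
        layout:
          shardsCount: 2  # raise this to spread data over more shards
          replicasCount: 1
EOF
```

With a resource like this, scaling out amounts to raising `shardsCount` and re-applying the manifest; the operator then reconciles the running pods to match.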

# Install Altinity ClickHouse Operator
kubectl apply -f https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator/clickhouse-operator-install-bundle.yaml

# Install Strimzi Kafka operator
helm repo add strimzi https://strimzi.io/charts/
helm install strimzi-kafka-operator strimzi/strimzi-kafka-operator

# Install CloudNativePG PostgreSQL Operator
kubectl apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.20/releases/cnpg-1.20.0.yaml

Custom images

The JupyterLab image is based on the datascience-notebook image. It ships with default notebooks and installs the required dependencies.

docker build ./jupyterlab -t dataplatform-jupyterlab:latest
kind load docker-image dataplatform-jupyterlab:latest --name kind-dataplatform

docker build ./setup-data -t setup-data:latest
kind load docker-image setup-data:latest --name kind-dataplatform

docker build ./data-generator -t data-generator:latest
kind load docker-image data-generator:latest --name kind-dataplatform
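
The three build-and-load pairs above follow the same pattern, so they can be wrapped in a small script. This is only a convenience sketch: the function names and the `DRY_RUN` switch are hypothetical, while the directories, image names, and cluster name are the ones used above.

```shell
# Build each image and load it into the kind cluster.
# Set DRY_RUN=1 to print the commands instead of running them.
CLUSTER_NAME="kind-dataplatform"

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_and_load() {
  # $1 = build context directory, $2 = image name (tagged :latest)
  run docker build "./$1" -t "$2:latest" &&
    run kind load docker-image "$2:latest" --name "$CLUSTER_NAME"
}

build_all_images() {
  build_and_load jupyterlab dataplatform-jupyterlab &&
    build_and_load setup-data setup-data &&
    build_and_load data-generator data-generator
}
```

Run `DRY_RUN=1 build_all_images` to preview the six commands, then `build_all_images` to execute them.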

Install or upgrade Data-Platform

This demo is contained in a Helm chart.

helm dependency build ./dataplatform-chart
helm upgrade --install dataplatform ./dataplatform-chart --set jupyter.image=dataplatform-jupyterlab:latest

timselier/dataplatform-demo
