Merge pull request #53 from Fei-Guo/master
Move virtualcluster directory from WG repo to CAPN
k8s-ci-robot authored May 7, 2021
2 parents e9cd910 + dbcdf7c commit 6f1e5b0
Showing 369 changed files with 61,755 additions and 0 deletions.
25 changes: 25 additions & 0 deletions virtualcluster/.gitignore
@@ -0,0 +1,25 @@

# Binaries for programs and plugins
*.exe
*.exe~
*.dll
*.so
*.dylib
_output
coverage

# Test binary, built with `go test -c`
*.test

# Output of the go coverage tool, specifically when used with LiteIDE
*.out

# Kubernetes Generated files - skip generated files, except for vendored files

vendor/

# editor and IDE paraphernalia
.idea
*.swp
*.swo
*~
23 changes: 23 additions & 0 deletions virtualcluster/Dockerfile
@@ -0,0 +1,23 @@
# Build the manager binary
FROM golang:1.12 as builder

ENV GO111MODULE=on

WORKDIR /go/virtualcluster

COPY go.mod .
COPY go.sum .

RUN go mod download

COPY pkg/ pkg/
COPY cmd/ cmd/

# Build
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o manager sigs.k8s.io/cluster-api-provider-nested/virtualcluster/cmd/manager

# Copy the manager binary into a thin image
FROM ubuntu:latest
WORKDIR /
COPY --from=builder /go/virtualcluster/manager .
ENTRYPOINT ["/manager"]
174 changes: 174 additions & 0 deletions virtualcluster/Makefile
@@ -0,0 +1,174 @@
# Explicitly opt into go modules, even though we're inside a GOPATH directory
export GO111MODULE=on

# Image URLs to use for all image building/pushing targets
DOCKER_REG ?= ${or ${VC_DOCKER_REGISTRY},"virtualcluster"}
IMG ?= ${DOCKER_REG}/manager-amd64 ${DOCKER_REG}/vn-agent-amd64 ${DOCKER_REG}/syncer-amd64

# TEST_FLAGS are the flags passed to go test.
TEST_FLAGS ?= -v --race

# COVERAGE_PACKAGES lists the packages whose coverage we care about.
COVERAGE_PACKAGES=$(shell go list ./... | \
grep -v sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/client | \
grep -v sigs.k8s.io/cluster-api-provider-nested/virtualcluster/pkg/apis | \
grep -v sigs.k8s.io/cluster-api-provider-nested/virtualcluster/cmd | \
grep -v sigs.k8s.io/cluster-api-provider-nested/virtualcluster/test/e2e)

# CRD_OPTIONS ?= "crd:trivialVersions=true"
CRD_OPTIONS ?= "crd:trivialVersions=true,maxDescLen=0"

# Build code.
#
# Args:
# WHAT: Directory names to build. If any of these directories has a 'main'
# package, the build will produce executable files under $(OUT_DIR).
# If not specified, "everything" will be built.
# GOFLAGS: Extra flags to pass to 'go' when building.
# GOLDFLAGS: Extra linking flags passed to 'go' when building.
# GOGCFLAGS: Additional go compile flags passed to 'go' when building.
#
# Example:
# make
# make all
# make all WHAT=cmd/kubelet GOFLAGS=-v
# make all GOLDFLAGS=""
# Note: Specify GOLDFLAGS as an empty string for building unstripped binaries, which allows
# you to use code debugging tools like delve. When GOLDFLAGS is unspecified, it defaults
# to "-s -w" which strips debug information. Other flags that can be used for GOLDFLAGS
# are documented at https://golang.org/cmd/link/
.PHONY: all
all: test build

build:
hack/make-rules/build.sh $(WHAT)

# Run tests
.PHONY: test
PWD = $(CURDIR)
test: generate fmt vet manifests
@mkdir -p coverage
@( for pkg in ${COVERAGE_PACKAGES}; do \
go test ${TEST_FLAGS} \
-coverprofile=coverage/unit-test-`echo $$pkg | tr "/" "_"`.out \
$$pkg || exit 1 ;\
done )
@( cd ./pkg/vn-agent/server/test; \
go test ${TEST_FLAGS} \
-coverprofile=${PWD}/coverage/unit-test-pkg_vn-agent_server_test.out )
@cd ${PWD}

.PHONY: coverage
coverage: ## combine coverage after test
@mkdir -p coverage
@gocovmerge coverage/* > coverage/coverage.txt
@go tool cover -html=coverage/coverage.txt -o coverage/coverage.html

.PHONY: clean
clean: ## clean to remove bin/* and files created by module
@go mod tidy
@rm -rf _output/*
@rm -rf coverage/*

# Run against the configured Kubernetes cluster in ~/.kube/config
run: generate fmt vet
go run ./cmd/manager/main.go

# Install CRDs into a cluster
install: manifests
kubectl apply -f config/crds

# Deploy controller in the configured Kubernetes cluster in ~/.kube/config
deploy: manifests
kubectl apply -f config/crds
kustomize build config/default | kubectl apply -f -

# Generate manifests e.g. CRD, RBAC etc.
manifests: controller-gen
$(CONTROLLER_GEN) $(CRD_OPTIONS) rbac:roleName=manager-role paths="./..." output:crd:artifacts:config=config/crds
hack/make-rules/replace-null.sh
# To work around a known controller-gen issue
# https://github.com/kubernetes-sigs/kubebuilder/issues/1544
ifeq (, $(shell which yq))
@echo "Please install yq for yaml patching. Get it from here: https://github.com/mikefarah/yq"
@exit
else
@{ \
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.apiServer.properties.statefulset.properties.spec.properties.template.properties.spec.properties.containers.items.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.controllerManager.properties.statefulset.properties.spec.properties.template.properties.spec.properties.containers.items.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.etcd.properties.statefulset.properties.spec.properties.template.properties.spec.properties.containers.items.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.apiServer.properties.statefulset.properties.spec.properties.template.properties.spec.properties.initContainers.items.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.controllerManager.properties.statefulset.properties.spec.properties.template.properties.spec.properties.initContainers.items.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.etcd.properties.statefulset.properties.spec.properties.template.properties.spec.properties.initContainers.items.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.apiServer.properties.service.properties.spec.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.controllerManager.properties.service.properties.spec.properties.ports.items.required[1]" protocol;\
yq w -i config/crds/tenancy.x-k8s.io_clusterversions.yaml "spec.validation.openAPIV3Schema.properties.spec.properties.etcd.properties.service.properties.spec.properties.ports.items.required[1]" protocol;\
}
endif

# Run go fmt against code
fmt:
go fmt ./pkg/... ./cmd/...

# Run go vet against code
vet:
go vet ./pkg/... ./cmd/...

# Generate code
generate: controller-gen
ifndef GOPATH
$(error GOPATH not defined, please define GOPATH. Run "go help gopath" to learn more about GOPATH)
endif
$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths="./..."

# Build release image.
#
# 1. run tests
# 2. build docker image
.PHONY: release-images
release-images: test build-images

# Build docker image.
#
# 1. build all binaries.
# 2. copy binaries to the corresponding docker image.
build-images:
hack/make-rules/release-images.sh $(WHAT)

# Push the docker image
docker-push:
$(foreach i,$(IMG),docker push $i;)

# Find controller-gen, or download it if necessary
controller-gen:
ifeq (, $(shell which controller-gen))
@{ \
set -e ;\
CONTROLLER_GEN_TMP_DIR=$$(mktemp -d) ;\
cd $$CONTROLLER_GEN_TMP_DIR ;\
go mod init tmp ;\
go get sigs.k8s.io/controller-tools/cmd/[email protected] ;\
rm -rf $$CONTROLLER_GEN_TMP_DIR ;\
}
CONTROLLER_GEN=$(GOPATH)/bin/controller-gen
else
CONTROLLER_GEN=$(shell which controller-gen)
endif

# Build and run kubernetes e2e tests.
#
# Args:
#  KUBECONFIG: kubeconfig for the virtual cluster. If empty, a virtual cluster is created.
# Defaults to "".
# FOCUS: Regexp that matches the tests to be run. Defaults to "\[Conformance\]".
#  SKIP: Regexp that matches the tests that need to be skipped.
# Defaults to "\[Flaky\]|\[Slow\]|\[Serial\]"
# BUILD_DEPENDENCIES: if true, build dependencies related to e2e test.
# Defaults to true.
#
# Example:
# make test-e2e-k8s KUBECONFIG=/path/to/vc-kubeconfig
.PHONY: test-e2e-k8s
test-e2e-k8s:
hack/make-rules/test-e2e-k8s.sh
15 changes: 15 additions & 0 deletions virtualcluster/OWNERS
@@ -0,0 +1,15 @@
# See the OWNERS docs: https://git.k8s.io/community/contributors/guide/owners.md

approvers:
- adohe
- Fei-Guo
- resouer
- tashimi
- zhuangqh

reviewers:
- adohe
- christopherhein
- Fei-Guo
- resouer
- zhuangqh
4 changes: 4 additions & 0 deletions virtualcluster/PROJECT
@@ -0,0 +1,4 @@
version: "1"
domain: x-k8s.io
projectName: virtualcluster
repo: sigs.k8s.io/cluster-api-provider-nested/virtualcluster
124 changes: 124 additions & 0 deletions virtualcluster/README.md
@@ -0,0 +1,124 @@
# VirtualCluster - Enabling Kubernetes Hard Multi-tenancy

VirtualCluster represents a new architecture that addresses various Kubernetes control plane isolation challenges.
It extends the existing namespace-based Kubernetes multi-tenancy model by providing each tenant a cluster view.
VirtualCluster fully leverages Kubernetes extensibility and preserves full API compatibility;
in other words, the core Kubernetes components are not modified in a virtual cluster.

With VirtualCluster, each tenant is assigned a dedicated tenant control plane, which is an upstream Kubernetes distribution.
Tenants can create cluster-scoped resources such as namespaces and CRDs in the tenant control plane without affecting others.
As a result, most of the isolation problems due to sharing one apiserver disappear.
The Kubernetes cluster that manages the actual physical nodes is called a super cluster, which now
becomes a Pod resource provider. VirtualCluster is composed of the following components:

- **vc-manager**: A new CRD [VirtualCluster](pkg/apis/tenancy/v1alpha1/virtualcluster_types.go) is introduced
to model the tenant control plane (a trimmed sketch of its shape follows this list). `vc-manager` manages the
lifecycle of each `VirtualCluster` custom resource. Based on the specification, it either creates CAPN control
plane Pods in the local K8s cluster, or imports an existing cluster if a valid `kubeconfig` is provided.

- **syncer**: A centralized controller that populates the API objects needed for Pod provisioning from every tenant control plane
to the super cluster, and bidirectionally syncs the object statuses. It also periodically scans the synced objects to ensure
that the tenant control plane and the super cluster remain consistent.

- **vn-agent**: A node daemon that proxies all tenant kubelet API requests to the kubelet process running
on the node. It ensures that each tenant can only access its own Pods on the node.
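
For orientation, here is a heavily trimmed Go sketch of what the `VirtualCluster` CRD shape looks like. It is illustrative only: the spec field and phase names are assumptions, and the authoritative schema is [virtualcluster_types.go](pkg/apis/tenancy/v1alpha1/virtualcluster_types.go).

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// VirtualCluster models one tenant control plane. Field names below are
// illustrative; see virtualcluster_types.go for the real definition.
type VirtualCluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   VirtualClusterSpec   `json:"spec,omitempty"`
	Status VirtualClusterStatus `json:"status,omitempty"`
}

// VirtualClusterSpec selects the ClusterVersion template that vc-manager
// uses to create the tenant control plane Pods (assumed field name).
type VirtualClusterSpec struct {
	ClusterVersionName string `json:"clusterVersionName"`
}

// VirtualClusterStatus reports the observed state of the tenant control
// plane (assumed phase values, e.g. "Pending", "Running").
type VirtualClusterStatus struct {
	Phase  string `json:"phase,omitempty"`
	Reason string `json:"reason,omitempty"`
}
```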

With all of the above, from the tenant’s perspective, each tenant control plane behaves like a complete Kubernetes cluster with nearly full API capabilities.
For more technical details, please check our [ICDCS 2021 paper](./doc/vc-icdcs.pdf).

## Live Demos/Presentations

KubeCon EU 2020 talk (~25 mins) | WG meeting demo (~50 mins)
--- | ---
[![](http://img.youtube.com/vi/5RgF_dYyvEY/0.jpg)](https://www.youtube.com/watch?v=5RgF_dYyvEY "vc-kubecon-eu-2020") | [![](http://img.youtube.com/vi/Kow00IEUbAA/0.jpg)](http://www.youtube.com/watch?v=Kow00IEUbAA "vc-demo-long")

## Quick Start

Please follow the [instructions](./doc/demo.md) to install VirtualCluster in your local K8s cluster.

## Abstraction

In VirtualCluster, the tenant control plane owns the source of truth for the specs of all synced objects.
The exceptions are persistent volume, storage class, and priority class resources, whose source of truth is the super cluster.
The syncer updates each synced object's status in the tenant control plane,
acting like a regular resource controller (a minimal sketch of one sync step follows the list below). This abstraction model implies the following assumptions:
- The synced object spec _SHOULD NOT_ be altered by any arbitrary controller in the super cluster.
- The tenant control plane owns the lifecycle management of the synced objects. The synced objects _SHOULD NOT_ be
managed by any controllers (e.g., StatefulSet) in the super cluster.
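
To make the ownership split concrete, the following is a minimal sketch of one downward/upward sync step for a Pod, written against the controller-runtime client. The helper name, the error handling, and the omitted tenant-to-super namespace mapping are simplifications for illustration, not the actual syncer implementation.

```go
package syncer

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// syncPod sketches one sync step: the tenant control plane owns the spec
// (downward sync) and the super cluster owns the status (upward sync).
// The tenant-to-super namespace mapping is omitted for brevity.
func syncPod(ctx context.Context, tenant, super client.Client, key types.NamespacedName) error {
	var tenantPod corev1.Pod
	if err := tenant.Get(ctx, key, &tenantPod); err != nil {
		return err
	}

	// Downward: create the Pod in the super cluster from the tenant-owned spec.
	superPod := tenantPod.DeepCopy()
	superPod.ResourceVersion = "" // cannot be set on create
	if err := super.Create(ctx, superPod); err != nil && !apierrors.IsAlreadyExists(err) {
		return err
	}

	// Upward: copy the status observed in the super cluster back to the
	// tenant control plane, like a regular resource controller would.
	if err := super.Get(ctx, key, superPod); err != nil {
		return err
	}
	tenantPod.Status = superPod.Status
	return tenant.Status().Update(ctx, &tenantPod)
}
```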

If any of the above assumptions is violated, VirtualCluster may not work as expected. Note that this
does not mean that a cluster administrator cannot install webhooks, for example a sidecar webhook,
in the super cluster. Those webhooks will still work, but their changes will be
hidden from tenants. Alternatively, such webhooks can be installed in the tenant control planes so that
tenants are aware of all changes.

## Limitations

Ideally, tenants should not be aware of the existence of the super cluster in most cases.
However, there are still some noticeable differences between a tenant control plane and a normal Kubernetes cluster.

- In the tenant control plane, node objects only show up after tenant Pods are created. The super cluster
node topology is not fully exposed in the tenant control plane. This means VirtualCluster does not support
`DaemonSet`-like workloads in the tenant control plane. Currently, the syncer controller rejects a newly
created tenant Pod if its `nodeName` has been set in the spec.

- The syncer controller manages the lifecycle of the node objects in the tenant control plane, but
it does not update the node lease objects, in order to reduce network traffic. As a result,
it is recommended to increase the tenant control plane node controller's `--node-monitor-grace-period`
parameter to a larger value (>60 seconds; this is already done in the sample clusterversion
[yaml](config/sampleswithspec/clusterversion_v1_nodeport.yaml)).

- CoreDNS is not tenant-aware. Hence, tenants should install CoreDNS in the tenant control plane if DNS is required.
The DNS service should be created in the `kube-system` namespace using the name `kube-dns`. The syncer controller can then
recognize the DNS service's cluster IP in the super cluster and inject it into any Pod's `spec.dnsConfig`.

- The cluster IP field in the tenant service spec is a bogus value. If any tenant controller requires the
actual cluster IP that takes effect on the super cluster nodes, special handling is required.
The syncer back-populates the cluster IP used in the super cluster into the
annotations of the tenant service object, using `transparency.tenancy.x-k8s.io/clusterIP` as the key
(a sketch of reading it follows this list). The workaround is then usually a simple code change in the controller.
This [document](./doc/tenant-dns.md) shows an example for CoreDNS.

- VirtualCluster does not support tenant PersistentVolumes. All PVs and StorageClasses are provided by the super cluster.
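
As an illustration of such a code change, a tenant controller could prefer the back-populated annotation over the spec field. This is a sketch under the assumption that the controller already holds the tenant `Service` object; the helper is hypothetical, and only the annotation key comes from the syncer.

```go
package main

import corev1 "k8s.io/api/core/v1"

// superClusterIPKey is the annotation key the syncer uses to
// back-populate the cluster IP that is in effect in the super cluster.
const superClusterIPKey = "transparency.tenancy.x-k8s.io/clusterIP"

// effectiveClusterIP returns the cluster IP that is actually routable on
// the super cluster nodes, falling back to the (bogus) tenant spec value.
// Hypothetical helper, not part of the VirtualCluster codebase.
func effectiveClusterIP(svc *corev1.Service) string {
	if ip, ok := svc.Annotations[superClusterIPKey]; ok && ip != "" {
		return ip
	}
	return svc.Spec.ClusterIP
}
```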

VirtualCluster passes most of the Kubernetes conformance tests. One failing test asks for supporting
`subdomain`, which cannot easily be done in VirtualCluster.

## FAQ

### Q: What is the difference between VirtualCluster and a multi-cluster solution?

One of the primary design goals of VirtualCluster is to improve the overall resource utilization
of a super cluster by allowing multiple tenants to share its node resources in a control plane-isolated manner.
A multi-cluster solution can achieve the same isolation goal, but its resources are not shared,
leaving nodes underutilized.

### Q: Can the tenant control plane run its own scheduler?

VirtualCluster was primarily designed for serverless use cases, where users normally do not have
scheduling preferences. Using the super cluster scheduler achieves good overall resource
utilization much more easily. For these reasons, VirtualCluster does not support a tenant scheduler.
It is technically possible to support a tenant scheduler by exposing some of the super cluster nodes directly in the
tenant control plane. Those nodes would have to be dedicated to the tenant to avoid any scheduling
conflicts. This type of tenant should be the exception.

### Q: What is the difference between Syncer and Virtual Kubelet?

They have similarities. In some sense, the syncer controller can be viewed as a replacement for a virtual
kubelet in cases where the resource provider of the virtual kubelet is a Kubernetes cluster. The syncer
maintains a one-to-one mapping between a virtual node in the tenant control plane and a real node
in the super cluster. It preserves Kubernetes API compatibility as closely as possible. Additionally,
it provides fair queuing to mitigate tenant contention.

## Release

The first release is coming soon.

## Community
VirtualCluster is a supporting project of SIG cluster-api-provider-nested (CAPN).
If you have any questions or want to contribute, you are welcome to file issues or pull requests.

You can also directly contact VirtualCluster maintainers via the WG [slack channel](https://kubernetes.slack.com/messages/wg-multitenancy).

Lead developer: @Fei-Guo ([email protected])