Allow namespace scoped operator #719

Merged 10 commits on Feb 29, 2024
2 changes: 2 additions & 0 deletions .github/workflows/e2e-leg-2.yml
@@ -78,6 +78,8 @@ jobs:
export OPERATOR_IMG=${{ inputs.operator-image }}
export VLOGGER_IMG=${{ inputs.vlogger-image }}
export E2E_TEST_DIRS=tests/e2e-leg-2
# All of the tests in this leg will use a namespace scoped operator
export CONTROLLERS_SCOPE=namespace
export VERTICA_DEPLOYMENT_METHOD=${{ inputs.vertica-deployment-method }}
if [ "${VERTICA_DEPLOYMENT_METHOD}" != "vclusterops" ]; then E2E_TEST_DIRS+=" tests/e2e-leg-2-at-only"; fi
export E2E_TEST_DIRS=tests/e2e-leg-2
41 changes: 6 additions & 35 deletions DEVELOPER.md
@@ -236,44 +236,15 @@ make generate manifests

## Run the VerticaDB operator

You have two options to run the VerticaDB operator:

- [Local](#local-operator): run the operator synchronously in a shell environment.
- [Deployment object](#deployment-object): Package the operator in a container and deploy in Kubernetes as a deployment object.

The operator is cluster-scoped for both options, so it monitors CRs in all namespaces.

### Local operator

> **NOTE**
> When you run the operator locally, you cannot run [e2e tests](#e2e-tests). You can run only [ad-hoc tests](#unit-tests).

The local deployment option is the fastest method to get the operator up and running, but it has limitations:

- A local operator does not accurately simulate how the operator runs in a Kubernetes environment.
- The webhook is disabled. The webhook requires TLS certs that are available only when the operator is packaged in a container.

#### Install

To run the operator in the shell, enter the following command:

```shell
make install run
```

#### Stop the operator

To stop the operator, press **Ctrl + C**.

### Deployment object

When you run the operator as a deployment object, it runs in a container in a real Kubernetes environment.
In order to run the operator, you must run it inside Kubernetes by packaging it in a container. It cannot run standalone outside of Kubernetes.

Vertica on Kubernetes supports two deployment models: Helm chart and [Operator Lifecycle Manager (OLM)](https://olm.operatorframework.io/). You specify the deployment model with the `DEPLOY_WITH` environment variable in the `make` command. By default, the operator is deployed in the `verticadb-operator` namespace. If that namespace does not exist, it is created.

By default, the operator is cluster-scoped, meaning it monitors CRs in all namespaces. When deployed with Helm, it can also run namespace-scoped by setting the `controllers.scope` parameter to `namespace`.

The operator pod contains a webhook, which requires TLS certificates. The TLS setup for each deployment model is different.

#### Helm deployment
### Helm deployment

Deploy the operator with Helm and all its prerequisites:

@@ -283,7 +254,7 @@ DEPLOY_WITH=helm make config-transformer deploy

The operator generates a self-signed TLS certificate at runtime. You can also provide a custom TLS certificate. For details, see `webhook.certSource` in [Helm chart parameters](https://docs.vertica.com/latest/en/containerized/db-operator/helm-chart-parameters/).

#### OLM deployment
### OLM deployment

You must configure OLM deployments when you run an operator with a webhook. For details, see the [OLM documentation](https://olm.operatorframework.io/docs/advanced-tasks/adding-admission-and-conversion-webhooks/).

@@ -293,7 +264,7 @@ Deploy OLM and all its prerequisites:
DEPLOY_WITH=olm make setup-olm deploy
```

#### Remove the operator
### Remove the operator

The `undeploy` make target removes the operator from the environment. The following command removes both Helm and OLM deployments:

71 changes: 63 additions & 8 deletions Makefile
@@ -143,9 +143,6 @@ HELM_RELEASE_NAME?=vdb-op
# For example to specify a custom webhook tls cert when deploying use this command:
# HELM_OVERRIDES="--set webhook.tlsSecret=custom-cert" make deploy-operator
HELM_OVERRIDES?=
# Enables development mode by default. Is used only when the operator is deployed
# through the Makefile
DEV_MODE?=true
# Maximum number of tests to run at once. (default 2)
# Set it to any value not greater than 8 to override the default one
E2E_PARALLELISM?=2
@@ -157,11 +154,65 @@ E2E_TEST_DIRS?=tests/e2e-leg-1
# Additional arguments to pass to 'kubectl kuttl'
E2E_ADDITIONAL_ARGS?=

#
# Deployment Variables
# ====================
#
# The following set of variables get passed down to the operator through a
# configMap. Each variable that we export here is included in the config/
# kustomize bundle (see config/manager/operator-env).
#
# Specify how to deploy the operator. Allowable values are 'helm' or 'olm'.
# When deploying with olm, it is expected that `make setup-olm` has been run
# already.
DEPLOY_WITH?=helm
export DEPLOY_WITH
#
# Set this to allow us to enable/disable the controllers in the operator.
# Disabling the controller will force the operator just to serve webhook
# requests.
CONTROLLERS_ENABLED?=true
export CONTROLLERS_ENABLED
#
# Set this to control if the webhook is enabled or disabled in the operator.
WEBHOOKS_ENABLED?=true
export WEBHOOKS_ENABLED
#
# Use this to control what scope the controller is deployed at. It supports two
# values:
# - cluster - controllers are cluster scoped and will watch for objects in any
# namespace
# - namespace - controllers are scoped to a single namespace and will watch for
# objects in the namespace where the manager is deployed.
CONTROLLERS_SCOPE?=cluster
export CONTROLLERS_SCOPE
#
# The address the operator's Prometheus metrics endpoint binds to. Setting this
# to 0 will disable metric serving.
METRICS_ADDR?=127.0.0.1:8080
export METRICS_ADDR
#
# Set this to enable the memory profiler. Enables runtime profiling collection.
# The profiling data can be inspected by connecting to port 6060
# with the path /debug/pprof. See https://golang.org/pkg/net/http/pprof/ for more info.
PROFILER_ENABLED?=false
export PROFILER_ENABLED
#
# The minimum logging level. Valid values are: debug, info, warn, and error.
LOG_LEVEL?=info
export LOG_LEVEL
#
# The operator's concurrency with each CR. If the number is > 1, this means the
# operator can reconcile multiple CRs at the same time. Note, the operator never
# parallelizes reconcile iterations for the same CR. Only distinct CRs can be
# reconciled in parallel.
CONCURRENCY_VERTICADB?=5
CONCURRENCY_VERTICAAUTOSCALER?=1
CONCURRENCY_EVENTTRIGGER?=1
export CONCURRENCY_VERTICADB \
CONCURRENCY_VERTICAAUTOSCALER \
CONCURRENCY_EVENTTRIGGER

# Clear this variable if you don't want to wait for the helm deployment to
# finish before returning control. This exists to allow tests to attempt deploy
# when it should fail.
@@ -320,10 +371,6 @@ setup-olm: operator-sdk bundle docker-build-bundle docker-push-bundle docker-bui
build: manifests generate fmt vet ## Build manager binary.
go build -o bin/manager cmd/operator/main.go

.PHONY: run
run: manifests generate fmt vet ## Run a controller from your host.
scripts/run-operator.sh

.PHONY: docker-build-operator
docker-build-operator: manifests generate fmt vet ## Build operator docker image with the manager.
docker pull golang:${GO_VERSION} # Ensure we have the latest Go lang version
@@ -512,7 +559,7 @@ uninstall: manifests kustomize ## Uninstall CRDs from the K8s cluster specified
# If this secret does not exist then it is simply ignored.
deploy-operator: manifests kustomize ## Using helm or olm, deploy the operator in the K8s cluster
ifeq ($(DEPLOY_WITH), helm)
helm install $(DEPLOY_WAIT) -n $(NAMESPACE) --create-namespace $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set logging.dev=${DEV_MODE} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred $(HELM_OVERRIDES)
helm install $(DEPLOY_WAIT) -n $(NAMESPACE) --create-namespace $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred --set controllers.scope=$(CONTROLLERS_SCOPE) $(HELM_OVERRIDES)
scripts/wait-for-webhook.sh -n $(NAMESPACE) -t 60
else ifeq ($(DEPLOY_WITH), olm)
scripts/deploy-olm.sh -n $(NAMESPACE) $(OLM_TEST_CATALOG_SOURCE)
@@ -521,6 +568,14 @@ else
$(error Unknown deployment method: $(DEPLOY_WITH))
endif

deploy-webhook: manifests kustomize ## Using helm, deploy just the webhook in the k8s cluster
ifeq ($(DEPLOY_WITH), helm)
helm install $(DEPLOY_WAIT) -n $(NAMESPACE) --create-namespace $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred $(HELM_OVERRIDES) --set webhook.enable=true,controllers.enable=false
scripts/wait-for-webhook.sh -n $(NAMESPACE) -t 60
else
$(error Unsupported deployment method for webhook only: $(DEPLOY_WITH))
endif

.PHONY: undeploy-operator
undeploy-operator: ## Undeploy operator that was previously deployed
scripts/undeploy.sh $(if $(filter false,$(ignore-not-found)),,-i)
5 changes: 5 additions & 0 deletions changes/unreleased/Added-20240227-150500.yaml
@@ -0,0 +1,5 @@
kind: Added
body: Allow namespace scoped operator deployment
time: 2024-02-27T15:05:00.67819016-04:00
custom:
Issue: "719"
5 changes: 5 additions & 0 deletions changes/unreleased/Removed-20240227-150540.yaml
@@ -0,0 +1,5 @@
kind: Removed
body: Removed most of the logging helm chart parameters. Only logging.level still exists.
time: 2024-02-27T15:05:40.510018124-04:00
custom:
Issue: "719"
103 changes: 38 additions & 65 deletions cmd/operator/main.go
@@ -17,11 +17,8 @@ package main

import (
"context"
"flag"
"fmt"
"log"
"os"
"strconv"
"time"

// Allows us to pull in things generated from `go generate`
@@ -80,33 +77,20 @@ func init() {
//+kubebuilder:scaffold:scheme
}

// getIsWebhookEnabled will return true if the webhook is enabled
func getIsWebhookEnabled() bool {
const DefaultEnabled = true
const EnableWebhookEnv = "ENABLE_WEBHOOKS"
enableWebhook, found := os.LookupEnv(EnableWebhookEnv)
if !found {
return DefaultEnabled
}
enabled, err := strconv.ParseBool(enableWebhook)
setupLog.Info(fmt.Sprintf("Parsed %s env var", enableWebhook),
"value", enableWebhook, "enabled", enabled, "err", err)
if err != nil {
return DefaultEnabled
}
return enabled
}

// addReconcilersToManager will add a controller for each CR that this operator
// handles. If any failure occurs, it will exit the program.
func addReconcilersToManager(mgr manager.Manager, restCfg *rest.Config, oc *opcfg.OperatorConfig) {
func addReconcilersToManager(mgr manager.Manager, restCfg *rest.Config) {
if !opcfg.GetIsControllersEnabled() {
setupLog.Info("Controllers are disabled")
return
}

if err := (&vdb.VerticaDBReconciler{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName("VerticaDB"),
Scheme: mgr.GetScheme(),
Cfg: restCfg,
EVRec: mgr.GetEventRecorderFor(vmeta.OperatorName),
OpCfg: *oc,
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "VerticaDB")
os.Exit(1)
@@ -158,30 +142,27 @@ func addWebhooksToManager(mgr manager.Manager) {
}

// setupWebhook will setup the webhook in the manager if enabled
func setupWebhook(ctx context.Context, mgr manager.Manager, restCfg *rest.Config, oc *opcfg.OperatorConfig) error {
if getIsWebhookEnabled() {
ns, err := getOperatorNamespace()
if err != nil {
return fmt.Errorf("failed to setup the webhook: %w", err)
}
if oc.WebhookCertSecret == "" {
func setupWebhook(ctx context.Context, mgr manager.Manager, restCfg *rest.Config) error {
if opcfg.GetIsWebhookEnabled() {
ns := opcfg.GetOperatorNamespace()
if opcfg.GetWebhookCertSecret() == "" {
setupLog.Info("generating webhook cert")
if err := security.GenerateWebhookCert(ctx, &setupLog, restCfg, CertDir, oc.PrefixName, ns); err != nil {
if err := security.GenerateWebhookCert(ctx, &setupLog, restCfg, CertDir, opcfg.GetPrefixName(), ns); err != nil {
return err
}
} else if val, ok := os.LookupEnv(vmeta.OperatorDeploymentMethodEnvVar); ok && val == vmeta.OLMDeploymentType {
} else if opcfg.GetIsOLMDeployment() {
// OLM generates the cert itself and has its own mechanism to update
// the webhook configs and the conversion webhook in the CRD.
setupLog.Info("OLM deployment detected. Skipping webhook cert update")
} else if !oc.UseCertManager {
setupLog.Info("using provided webhook cert", "secret", oc.WebhookCertSecret)
if err := security.PatchWebhookCABundleFromSecret(ctx, &setupLog, restCfg, oc.WebhookCertSecret,
oc.PrefixName, ns); err != nil {
} else if !opcfg.GetUseCertManager() {
setupLog.Info("using provided webhook cert", "secret", opcfg.GetWebhookCertSecret())
if err := security.PatchWebhookCABundleFromSecret(ctx, &setupLog, restCfg, opcfg.GetWebhookCertSecret(),
opcfg.GetPrefixName(), ns); err != nil {
return err
}
} else {
setupLog.Info("using cert-manager for webhook cert")
if err := security.AddCertManagerAnnotation(ctx, &setupLog, restCfg, oc.PrefixName, ns); err != nil {
if err := security.AddCertManagerAnnotation(ctx, &setupLog, restCfg, opcfg.GetPrefixName(), ns); err != nil {
return err
}
}
@@ -192,41 +173,33 @@ func setupWebhook(ctx context.Context, mgr manager.Manager, restCfg *rest.Config
return nil
}
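The branch order in `setupWebhook` matters: an empty cert secret wins over everything else, OLM is checked next, and cert-manager is only considered when a secret name is set and the deployment is not OLM. The sketch below reduces that decision to a pure function so the precedence is easy to test. `chooseCertAction` and the `certAction` values are illustrative names, not part of the operator.

```go
package main

import "fmt"

// certAction names what the webhook setup would do. Hypothetical type.
type certAction string

const (
	generateSelfSigned     certAction = "generate self-signed cert"
	leaveToOLM             certAction = "skip, OLM manages the cert"
	patchFromSecret        certAction = "patch CA bundle from provided secret"
	annotateForCertManager certAction = "add cert-manager annotation"
)

// chooseCertAction mirrors the if/else-if chain in setupWebhook:
// the first matching case wins.
func chooseCertAction(secretName string, isOLM, useCertManager bool) certAction {
	switch {
	case secretName == "":
		return generateSelfSigned
	case isOLM:
		return leaveToOLM
	case !useCertManager:
		return patchFromSecret
	default:
		return annotateForCertManager
	}
}

func main() {
	fmt.Println(chooseCertAction("", true, true))             // empty secret wins even under OLM
	fmt.Println(chooseCertAction("custom-cert", true, false)) // OLM manages the cert
	fmt.Println(chooseCertAction("custom-cert", false, false))
	fmt.Println(chooseCertAction("custom-cert", false, true))
}
```

Separating the decision from the side effects (cert generation, patching) is a design choice for this sketch only; the real function performs the work inline.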

// getOperatorNamespace retrieves the namespace that the operator is running in
func getOperatorNamespace() (string, error) {
const namespaceEnvVar = "OPERATOR_NAMESPACE"
ns, found := os.LookupEnv(namespaceEnvVar)
if !found {
return "", fmt.Errorf("the environment variable %s must be set", namespaceEnvVar)
}
return ns, nil
}

// getReadinessProbeCallback returns the check to use for the readiness probe
func getReadinessProbeCallback(mgr ctrl.Manager) healthz.Checker {
// If the webhook is enabled, we use a checker that tests if the webhook is
// able to accept requests.
if getIsWebhookEnabled() {
if opcfg.GetIsWebhookEnabled() {
return mgr.GetWebhookServer().StartedChecker()
}
return healthz.Ping
}

func main() {
oc := &opcfg.OperatorConfig{}
oc.SetFlagArgs()
flag.Parse()

logger := oc.GetLogger()
if oc.FilePath != "" {
log.Printf("Now logging in file %s", oc.FilePath)
logger := opcfg.GetLogger()
if opcfg.GetLoggingFilePath() != "" {
log.Printf("Now logging in file %s", opcfg.GetLoggingFilePath())
}

ctrl.SetLogger(logger)
setupLog.Info("Build info", "gitCommit", GitCommit,
"buildDate", BuildDate, "vclusterVersion", VClusterVersion)
setupLog.Info("Operator Config",
"controllersScope", opcfg.GetControllersScope(),
"version", opcfg.GetVersion(),
"watchNamespace", opcfg.GetWatchNamespace(),
"webhooksEnabled", opcfg.GetIsWebhookEnabled(),
"controllersEnabled", opcfg.GetIsControllersEnabled())

if oc.EnableProfiler {
if opcfg.GetIsProfilerEnabled() {
go func() {
server := &http.Server{
Addr: "localhost:6060",
@@ -243,18 +216,18 @@ func main() {

mgr, err := ctrl.NewManager(restCfg, ctrl.Options{
Scheme: scheme,
MetricsBindAddress: oc.MetricsAddr,
MetricsBindAddress: opcfg.GetMetricsAddr(),
Port: 9443,
HealthProbeBindAddress: oc.ProbeAddr,
LeaderElection: oc.EnableLeaderElection,
LeaderElectionID: "5c1e6227.vertica.com",
Namespace: "", // Empty namespace means watch all namespaces
HealthProbeBindAddress: ":8081",
LeaderElection: true,
LeaderElectionID: opcfg.GetLeaderElectionID(),
Namespace: opcfg.GetWatchNamespace(),
CertDir: CertDir,
Controller: v1alpha1.ControllerConfigurationSpec{
GroupKindConcurrency: map[string]int{
vapiB1.GkVDB.String(): oc.VerticaDBConcurrency,
vapiB1.GkVAS.String(): oc.VerticaAutoscalerConcurrency,
vapiB1.GkET.String(): oc.EventTriggerConcurrency,
vapiB1.GkVDB.String(): opcfg.GetVerticaDBConcurrency(),
vapiB1.GkVAS.String(): opcfg.GetVerticaAutoscalerConcurrency(),
vapiB1.GkET.String(): opcfg.GetEventTriggerConcurrency(),
},
},
})
@@ -263,9 +236,9 @@ func main() {
os.Exit(1)
}

addReconcilersToManager(mgr, restCfg, oc)
addReconcilersToManager(mgr, restCfg)
ctx := ctrl.SetupSignalHandler()
if err := setupWebhook(ctx, mgr, restCfg, oc); err != nil {
if err := setupWebhook(ctx, mgr, restCfg); err != nil {
setupLog.Error(err, "unable to setup webhook")
os.Exit(1)
}
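In this diff the manager's `Namespace` option changes from a hard-coded `""` (watch all namespaces) to `opcfg.GetWatchNamespace()`. That getter is not shown here, so the following is only a guess at the mapping, based on the `CONTROLLERS_SCOPE` description in the Makefile: `cluster` yields an empty watch namespace, and `namespace` yields the namespace the operator is deployed in. `watchNamespaceFor` is a hypothetical name.

```go
package main

import "fmt"

const (
	scopeCluster   = "cluster"
	scopeNamespace = "namespace"
)

// watchNamespaceFor is an illustrative stand-in for the scope-to-namespace
// mapping. An empty result means "watch every namespace".
func watchNamespaceFor(scope, operatorNamespace string) (string, error) {
	switch scope {
	case scopeCluster:
		return "", nil
	case scopeNamespace:
		if operatorNamespace == "" {
			return "", fmt.Errorf("namespace scope requires the operator namespace to be known")
		}
		return operatorNamespace, nil
	default:
		return "", fmt.Errorf("unknown controllers scope %q", scope)
	}
}

func main() {
	for _, scope := range []string{"cluster", "namespace", "bogus"} {
		ns, err := watchNamespaceFor(scope, "verticadb-operator")
		fmt.Printf("scope=%s watch=%q err=%v\n", scope, ns, err)
	}
}
```

Rejecting unknown scope values up front avoids silently falling back to cluster-wide watching, which would be broader than the user asked for.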