
CRD changes for 2.1.0 release of the operator #728

Merged
spilchen merged 42 commits into main from vnext on Mar 6, 2024

Conversation

@spilchen (Collaborator) commented Mar 5, 2024

This merges the vnext branch into the main branch. We used this branch to keep CRD changes separate until they were ready for release, so it contains numerous modifications. The following is a summary of the changes:

  • To enable database-level restoration from a restore point, new parameters were added to the VerticaDB CRD so you can specify a restore point archive for revive. For instance:
apiVersion: vertica.com/v1
kind: VerticaDB
metadata:
  name: verticadb-sample
spec:
  image: "opentext/vertica-k8s:24.2.0-0-minimal"
  communal:
    path: "s3://mybucket/vertica"
    endpoint: https://s3.amazonaws.com
  subclusters:
    - name: main
      size: 3
  initPolicy: Revive
  restorePoint:
    archive: backup
    index: 1
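A restore point can also be selected by ID rather than by index (one of the two must be specified, per the webhook rules in the commits below). A minimal sketch of that variant, assuming the field is serialized as id:
  restorePoint:
    archive: backup
    id: <restore-point-id>   # assumed field name; mutually exclusive with index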
  • VerticaRestorePointsQuery is a new CRD that queries communal storage to list existing restore points. Filtering by archive and by timestamp is supported. It is compatible with Vertica server versions 24.2.0 and later. For instance, you can create the following CR to see all the restore points for the backup archive in the database of the verticadb-sample VerticaDB:
apiVersion: vertica.com/v1beta1
kind: VerticaRestorePointsQuery
metadata:
  name: verticarestorepointsquery-sample
spec:
  verticaDBName: verticadb-sample
  filterOptions:
    archiveName: backup
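Once the query completes, the operator copies the results into the CR's status and sets status conditions (see the commit notes below). A rough sketch of what that status might look like; the restorePoints field name and the values are assumptions for illustration:
  status:
    conditions:
    - lastTransitionTime: "2024-03-05T12:00:00Z"   # illustrative timestamp
      status: "True"
      type: QueryComplete
    restorePoints:                                 # assumed field name; query results are copied here
    - archive: backup                              # illustrative values only
      index: 1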
  • The VerticaScrutinize CRD lets you run the scrutinize command against a live database and collect its output, eliminating the need to run scrutinize inside the Vertica server pods. The scrutinize tarball is stored in a volume, and a pod remains active for a period of time so that you can inspect and extract the tarball's contents. Note that this feature is only available for Vertica servers running version 24.2.0 or later. For instance, create the following CR to run scrutinize against the database specified in the verticadb-sample VerticaDB CR:
apiVersion: vertica.com/v1beta1
kind: VerticaScrutinize
metadata:
  name: verticascrutinize-sample
spec:
  verticaDBName: verticadb-sample
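The operator also reports progress through the VerticaScrutinize CR's status: a state value surfaced by kubectl get vscr and, after a successful run, the cached tarball name (both described in the commits below). A rough sketch with assumed field names (state, podName) and illustrative values:
  status:
    state: ScrutinizeSucceeded            # assumed field name, from the STATE print column
    podName: verticascrutinize-sample     # assumed field name, from the POD print column
    tarballName: <generated-tarball-name> # cached after a successful run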

jizhuoyu and others added 30 commits December 20, 2023 10:38
- add `RestorePointPolicy` type in `Spec`
  - `Archive` must be specified if a restore is intended
- one of `ID` and `Index` must be specified (`Index` is assumed to be
1-based)
- webhook rules to validate `RestorePointPolicy`
- revivedb_reconciler checks whether restore is supported given the
server version and deployment method
  - must use vclusterops and have a server version greater than v24.2.0

---------

Co-authored-by: Roy Paulin <[email protected]>
This PR builds out the stub for an empty controller to handle the
VerticaRestorePointsQuery API. It does not include a webhook, as there
are no defined rules for transitioning the Custom Resource (CR), given
that the spec portion contains only two fields. The operator can observe
the new API, initiate a reconciliation iteration, and take no action, as
we have set nil for the actors during this implementation phase.
---------

Co-authored-by: Matt Spilchen <[email protected]>
- Fix webhook bug so that negative index values are denied
- Fix revivedb reconciler so that only the first letter of the first
word is capitalized

---------

Co-authored-by: Roy Paulin <[email protected]>
- pass restorePoint configuration from VerticaDB CR to vclusterops while
we revive
- new e2e test for reviving from a restore point
…ry cr (#658)

In the VerticaRestorePointsQuery CRD we have status conditions to know
when the query has started and when it has finished.

  conditions:
  - lastTransitionTime: <ts>
    status: "True"
    type: Querying
  - lastTransitionTime: <ts>
    status: "True"
    type: QueryComplete

This PR adds the status condition logic as follows:
- skip reconcile if QueryComplete is true
- set the Querying status condition prior to calling the vclusterops API
- clear the Querying status condition after calling the API, including an
error message in the status condition if the API fails
- set QueryComplete if the vclusterops API succeeded
This PR fetches the VerticaDB referenced by the new
VerticaRestorePointsQuery CR and extracts the config information to pass
down to the vclusterops API.

---------

Co-authored-by: Matt Spilchen <[email protected]>
- operator now handles ReviveDBRestorePointNotFound error returned from
vclusterops
- e2e test for invalid id and index
- e2e test for cluster scaling before restoring

---------

Co-authored-by: Matt Spilchen <[email protected]>
This PR adds custom print columns for the new VerticaRestorePointsQuery
CR, so that `kubectl get vrpq` tells you whether the query has
completed.
This PR builds out the stub for an empty controller to handle the
VerticaScrutinize API. For now, the operator can observe the new API,
initiate a reconciliation iteration, and take no action, as we have set
nil for the actors during this implementation phase.
As we discussed, this task is divided into two parts. This PR completes
the first part, which involves calling the vclusterops API to perform
the restore points query. A follow-up task will copy the result of the
query into the CR's status fields.
This PR represents the second piece of the controller logic for
VerticaRestorePointsQuery: it copies the result of the query into the
status fields of the VerticaRestorePointsQuery CR.
This includes a few updates to the vcluster API

- Update to function signature for VReviveDB
- Update Restore Point struct to make status fields not required
This adds the logic to generate a dummy pod when a VSCR CR is created.
It does the following:
  - checks if the VerticaDB specified in Vscr exists
  - checks the server version and aborts reconciliation if the version is too old
  - creates a pod

For now the pod does not run scrutinize; the logic for that will be
added in a follow-up PR. The pod is composed of 2 containers: an init
container (that will later handle the scrutinize collection) and a main
container. Both run a hard-coded image (busybox) and sleep for 5s and
for infinity, respectively.
We also set some fields (initContainers, Volume, ...) with values from
Vscr. A rough pod sketch follows this commit message.
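For illustration, the generated pod at this stage might look roughly like the sketch below; the names and exact commands are assumptions beyond what the commit message states:
```
apiVersion: v1
kind: Pod
metadata:
  name: verticascrutinize-sample-pod   # illustrative name
spec:
  initContainers:
  - name: scrutinize                   # will handle the scrutinize collection in a later PR
    image: busybox
    command: ["sleep", "5"]
  containers:
  - name: main                         # kept alive so the tarball can be inspected later
    image: busybox
    command: ["sleep", "infinity"]
```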
The control-plane: controller-manager label changed in the last merge.
Updating test cases that exist only on the vnext branch.
- update vrpq CRD to support filtering options for archive name, and
start and end timestamps (a combined example is sketched after this
commit message)
- update e2e test to cover the archive name filter and the timestamp
filters

---------

Co-authored-by: spilchen <[email protected]>
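As an illustration of combining these filter options, a spec might look like the sketch below; `archiveName` matches the example in the PR description, while the timestamp field names here are assumptions:
```
spec:
  verticaDBName: verticadb-sample
  filterOptions:
    archiveName: backup
    startTimestamp: "2024-02-01"   # assumed field name; date-only form per the e2e tests
    endTimestamp: "2024-02-29"     # assumed field name
```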
…#700)

This PR adds some checks for the new VerticaRestorePointsQuery CR. The
VerticaDB should be running with a minimum version of 24.2.0 and should
be deployed with vclusterOps. We don't support admintools deployments.
- add e2e test for date-only timestamps

---------

Co-authored-by: spilchen <[email protected]>
chinhtranvan and others added 10 commits February 17, 2024 12:46
This PR creates a webhook for the new VerticaRestorePointsQuery CR to
run these checks early. Part of the webhook was already added in #643.
When a VerticaScrutinize resource is created, the operator:
- collects some info from the specified VerticaDB in order to construct
the vcluster scrutinize command arguments
- builds the scrutinize podspec. That pod shares the same
serviceaccount, podsecuritycontext, and image as the Vertica pods.
- adds some env vars needed by the NMA to the podspec, plus 2 new env
vars (`PASSWORD_NAMESPACE`, `PASSWORD_NAME`) that allow vcluster
scrutinize to read the db password from a secret (sketched below).

Once the pod is created, a new reconciler (`PodPollingReconciler`) waits
for scrutinize to finish and updates the status condition.
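A rough sketch of the password-related env vars on the scrutinize podspec; the values are illustrative, and their interpretation (namespace and name of the password secret) is an assumption based on the variable names:
```
env:
- name: PASSWORD_NAMESPACE   # assumed: namespace of the secret holding the db password
  value: default
- name: PASSWORD_NAME        # assumed: name of that secret
  value: su-passwd
```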
This pulls in the latest changes from main into vnext. There are a lot
of changes here because of the copyright year change.

---------

Signed-off-by: GitHub <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: spilchen <[email protected]>
This gives users a way to set the main container resources through
annotations. The main container resources remain unset unless resources
are also set for the init container.

The annotations are as follows (an example appears after the list):
```
vertica.com/scrutinize-main-container-resources-limits-cpu
vertica.com/scrutinize-main-container-resources-limits-memory
vertica.com/scrutinize-main-container-resources-requests-cpu
vertica.com/scrutinize-main-container-resources-requests-memory
```
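For example, a sketch assuming these annotations are set on the VerticaScrutinize CR's metadata; the values are illustrative:
```
apiVersion: vertica.com/v1beta1
kind: VerticaScrutinize
metadata:
  name: verticascrutinize-sample
  annotations:
    vertica.com/scrutinize-main-container-resources-limits-cpu: "1"
    vertica.com/scrutinize-main-container-resources-limits-memory: 500Mi
    vertica.com/scrutinize-main-container-resources-requests-cpu: 500m
    vertica.com/scrutinize-main-container-resources-requests-memory: 250Mi
spec:
  verticaDBName: verticadb-sample
```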
This caches the tarballName in our VSCR status after a successful
scrutinize run. The generated tarball name follows a specific format
needed by `grasp`. To access the tarball name across reconcilers, we
cache it in the scrutinize pod annotations.
@spilchen self-assigned this Mar 5, 2024
Matt Spilchen and others added 2 commits March 5, 2024 10:52
Now when a VerticaScrutinize CR is created, `kubectl get` shows the
scrutinize pod name and the state of the scrutinize run. The state is
`Ready`, `PodCreated`, `ScrutinizeInProgress`, or `ScrutinizeSucceeded`
for a successful run, and `NotReady:<reason>`, `PodCreationFailed`, or
`ScrutinizeFailed` for a failed run.

kubectl get vscr will look like this:
```
NAME                       STATE                 POD                        AGE
verticascrutinize-sample   ScrutinizeSucceeded   verticascrutinize-sample   84m
```
@spilchen merged commit 2414ace into main on Mar 6, 2024
60 checks passed
@spilchen deleted the vnext branch on March 6, 2024 12:30