This is a Kubernetes operator which tries to help managing checkpoints.
Kubernetes 1.25 introduced the possibility to create stateful checkpoints of container.
One of the early questions asked was how to control how many checkpoint archives will be on the local disk. Having too many checkpoint archives might not be useful, depending on the use case, but it might lead to a situation where no more disk space is left due to a high number of checkpoint archives or if the checkpoint archives are really large.
With the help of this operator it is possible to limit the number of checkpoint
archives per namespace/pod/container. The maximum number of checkpoints per
namespace/pod/container combination is 10
but it can be changed.
The operator has the following parameters which can be changed:
maxCheckpointsPerContainer
: how many checkpoint archives should be kept on disk for a certain namespace/pod/container combination. Defaults to10
checkpointDirectory
: the directory where checkpoint archives are created and which should be watched for the correct number of checkpoint archives. Defaults to kubelet's default checkpoint location/var/lib/kubelet/checkpoints
.
A sample file can be found at config/samples/_v1_checkpointrestoreoperator.yaml
for setting these parameters.
You’ll need a Kubernetes cluster to run against. You can use
KIND to get a local cluster for testing, or run
against a remote cluster.
Note: Your controller will automatically use the current context in your
*kubeconfig file (i.e. whatever cluster kubectl cluster-info
shows).
-
Install Instances of Custom Resources:
make install
-
Build and push your image to the location specified by
IMG
:make docker-build docker-push IMG=<some-registry>/checkpoint-restore-operator:tag
-
Deploy the controller to the cluster with the image specified by
IMG
:make deploy IMG=<some-registry>/checkpoint-restore-operator:tag
To delete the CRDs from the cluster:
make uninstall
UnDeploy the controller from the cluster:
make undeploy
While bug fixes can first be identified via an "issue", that is not required. It's ok to just open up a PR with the fix, but make sure you include the same information you would have included in an issue - like how to reproduce it.
PRs for new features should include some background on what use cases the new code is trying to address. When possible and when it makes sense, try to break-up larger PRs into smaller ones - it's easier to review smaller code changes. But only if those smaller ones make sense as stand-alone PRs.
Regardless of the type of PR, all PRs should include:
- well documented code changes;
- additional testcases: ideally, they should fail w/o your code change applied;
- documentation changes.
Squash your commits into logical pieces of work that might want to be reviewed separate from the rest of the PRs. Ideally, each commit should implement a single idea, and the PR branch should pass the tests at every commit. GitHub makes it easy to review the cumulative effect of many commits; so, when in doubt, use smaller commits.
PRs that fix issues should include a reference like Closes #XXXX
in the
commit message so that github will automatically close the referenced issue
when the PR is merged.
Contributors must assert that they are in compliance with the Developer Certificate of Origin 1.1. This is achieved by adding a "Signed-off-by" line containing the contributor's name and e-mail to every commit message. Your signature certifies that you wrote the patch or otherwise have the right to pass it on as an open-source patch.
-
Install the CRDs into the cluster:
make install
-
Run your controller (this will run in the foreground, so switch to a new terminal if you want to leave it running):
make run
NOTE: You can also run this in one step by running: make install run
If you are editing the API definitions, generate the manifests such as CRs or CRDs using:
make manifests
NOTE: Run make --help
for more information on all potential make
targets
More information can be found via the Kubebuilder Documentation
Unless mentioned otherwise in a specific file's header, all code in this project is released under the Apache 2.0 license.
The author of a change remains the copyright holder of their code (no copyright assignment). The list of authors and contributors can be retrieved from the git commit history and in some cases, the file headers.