on.event retries #805
-
I've got perhaps an odd use-case for which I'd love a bit of input. I've gone a bit off the beaten path and am using kopf as a DaemonSet: instead of 1-3 operator copies per cluster, I've got 1 operator per node. My custom resources are created/updated/deleted on the cluster, and rather than having an operator which owns them, all of the DaemonSet operators simply react to the events of those resources and perform an action on their given node. Obviously, I'm giving up some of the niceness of resource ownership here, and I'm totally fine with that. I've been running this project in production for quite some time now, and the only real issue I've had is what happens when my handlers fail. Per the docs:
I can understand that, without persistence on the object itself, this may not be something kopf can handle, but I'd like to see what options I may be missing here. From my standpoint, I've got:
Am I missing anything? Any input is much appreciated! Also -- thanks for kopf, it's been a fantastic tool for my team!
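P.S. For concreteness, here is a stripped-down version of what each per-node operator does (the group/version/plural and the handler body are illustrative, not the real code):

```python
import kopf


@kopf.on.event('example.com', 'v1', 'nodeconfigs')  # illustrative resource names
def apply_on_this_node(body, logger, **_):
    # ... the actual node-local work for this resource goes here ...
    logger.info(f"Reconfiguring this node for {body['metadata']['name']}")
    # If the work above raises, the error is only logged: on.event handlers
    # are never retried, which is exactly the failure mode I'm asking about.
```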
-
Hello. This is quite an interesting approach! Thanks for sharing it. It is even worth mentioning in one of the sample use-cases (when that section is added someday).

Now, to the question. Yes, the event handlers are not retried, as they have no state persistence. On top of that, it is expected that all handlers except daemons, i.e. event-/state-handlers & timers, finish in a relatively short time. Otherwise, the object's state can change during the long handling (by a 3rd party, e.g. users) and you will have inconsistencies between what is being handled and what is stored in the cluster. This rules out the 1st way (own retrying) as a good approach.

The 3rd way (daemons) might be an option. However, they also have their own persistence, so multiple operators/daemons from multiple nodes will collide with each other. Not good. On top of that, daemons put finalizers on the resource (to ensure that they can exit properly before actually releasing the resource), and they will also fight over the finalizers. Not good at all. I cannot even imagine all the complications this way leads to.

The 2nd way seems the most promising here: the persistence settings. But you need a hack so that multiple operators do not overlap with each other's states:

```python
import os

import kopf


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    node = os.environ.get('NODE')
    settings.persistence.progress_storage = kopf.AnnotationsProgressStorage(prefix=f'my-op-{node}.example.com')
    settings.persistence.diffbase_storage = kopf.AnnotationsDiffBaseStorage(prefix=f'my-op-{node}.example.com')
    settings.persistence.finalizer = f'my-op-{node}.example.com/kopf-finalizer'
```

(UPD: Or the same with the …)

Then start a pod with that env var picked from the pod's field:
```yaml
apiVersion: apps/v1
kind: DaemonSet
spec:
  template:
    spec:
      containers:
        - name: main
          env:
            - name: NODE
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName  # NB: relative to the pod's root!
```

So, every instance of your operator will believe it is a separate operator; they will not collide with each other, will not play ping-pong on each other's changes, and all should work. Mind that it should be an id/IP address/name of the node, not of the pod: the pod can be restarted several times on a single node, and you do not want the new operator pod to reprocess all of its resources.

Also, make sure that there are no finalizers (e.g. there are no on-deletion handlers, or they are marked as optional). Otherwise, the operators will put their finalizers on the resource but will never remove them: once the node is gone, there will be no operator believing the finalizer is its own to care about.

But there is a downside to this. Depending on the size of the cluster, if you have e.g. 1000 nodes, you will have 2000+ annotations on every object, sometimes 3000-4000 for a moment (the temporary progress annotations are later removed). I am not sure if there are system limits on the number and total size of annotations in Kubernetes, but the resource's YAML and the operators' logs will be flooded with all this information.

Instead, you can implement your own storage (diffbase & progress) which, in addition to Kopf's key, uses the node's name in the primary key. Your own storage can be Redis, Postgres, or whatever you prefer the most.

If the daemonsets can have access to a host path that is shared between restarted pods on the same node, you can use the filesystem as the storage. Generally, this is not a good practice, but for daemonsets it is just fine: the storage will exist as long as the operator's pods on that node exist; once the node is gone, the storage is gone, but so are the operator's pods, i.e. whole-lifetime persistence is guaranteed; it is just the "lifetime" that is redefined here. For example:
```yaml
apiVersion: apps/v1
kind: DaemonSet
spec:
  template:
    spec:
      containers:
        - name: main
          volumeMounts:
            - mountPath: /op-data
              name: op-data
      volumes:
        - name: op-data
          hostPath:
            path: /op-data
```

For that, inherit from `kopf.DiffBaseStorage` and `kopf.ProgressStorage`:
```python
import json
import os
from typing import Optional

import kopf


class FSDiffBaseStorage(kopf.DiffBaseStorage):
    """Stores each object's last-handled essence as a JSON file named by its UID."""

    def fetch(self, *, body: kopf.Body) -> Optional[kopf.BodyEssence]:
        uid = body.metadata.uid
        path = os.path.join('/op-data', f'{uid}.json')
        if not os.path.exists(path):
            return None
        with open(path, 'rt', encoding='utf-8') as f:
            return json.loads(f.read())

    def store(self, *, body: kopf.Body, patch: kopf.Patch, essence: kopf.BodyEssence) -> None:
        uid = body.metadata.uid
        path = os.path.join('/op-data', f'{uid}.json')
        with open(path, 'wt', encoding='utf-8') as f:
            f.write(json.dumps(essence))


class FSProgressStorage(kopf.ProgressStorage):
    ...  # fetch, store, purge, touch


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    settings.persistence.progress_storage = FSProgressStorage()
    settings.persistence.diffbase_storage = FSDiffBaseStorage()
```
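For reference, a minimal sketch of what that `FSProgressStorage` stub could look like, mirroring the per-UID JSON files of the diffbase storage above. The `/op-data` path and file naming are assumptions, and the method names & keyword arguments follow kopf's `ProgressStorage` interface as documented, so double-check them against your kopf version:

```python
import json
import os
from typing import Any, Dict, Optional

import kopf


class FSProgressStorage(kopf.ProgressStorage):
    """Keeps one JSON file per object UID with all of that object's handler progress records."""

    def __init__(self, path: str = '/op-data') -> None:
        super().__init__()
        self.path = path

    def _file(self, body: kopf.Body) -> str:
        return os.path.join(self.path, f'{body.metadata.uid}.progress.json')

    def _load(self, body: kopf.Body) -> Dict[str, Any]:
        try:
            with open(self._file(body), 'rt', encoding='utf-8') as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def _save(self, body: kopf.Body, data: Dict[str, Any]) -> None:
        with open(self._file(body), 'wt', encoding='utf-8') as f:
            json.dump(data, f)

    def fetch(self, *, key, body: kopf.Body):
        # `key` is the handler id (a str subtype); returns the stored progress record or None.
        return self._load(body).get(key)

    def store(self, *, key, record, body: kopf.Body, patch: kopf.Patch) -> None:
        data = self._load(body)
        data[key] = dict(record)  # progress records are JSON-serialisable dicts
        self._save(body, data)

    def purge(self, *, key, body: kopf.Body, patch: kopf.Patch) -> None:
        data = self._load(body)
        data.pop(key, None)
        self._save(body, data)

    def touch(self, *, body: kopf.Body, patch: kopf.Patch, value: Optional[str]) -> None:
        pass  # nothing to "touch" for file-based storage; only annotation-based storages need dummies.
```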
-
For anyone stumbling on this a few years later: the prefixed-annotation progress_storage worked perfectly for our use case. I ended up using the node name passed in via the Downward API, and truncating it slightly, since it made our annotation keys too long in certain environments.
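Roughly what that looks like; the env var name, the domain in the prefix, and the cut-off/hash lengths here are illustrative, not anything kopf mandates:

```python
import hashlib
import os

import kopf


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    node = os.environ.get('NODE', 'unknown')  # injected via the Downward API (spec.nodeName)
    # Node names can be long FQDNs; keep the prefix short but still unique per node
    # by combining a truncated name with a short hash of the full name.
    short = node.split('.')[0][:12].strip('-')
    digest = hashlib.sha256(node.encode('utf-8')).hexdigest()[:6]
    prefix = f'my-op-{short}-{digest}.example.com'
    settings.persistence.progress_storage = kopf.AnnotationsProgressStorage(prefix=prefix)
    settings.persistence.diffbase_storage = kopf.AnnotationsDiffBaseStorage(prefix=prefix)
    settings.persistence.finalizer = f'{prefix}/kopf-finalizer'
```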