on.event retries #805
-
I've got perhaps an odd use-case for which I'd love a bit of input. I've gone a bit off the beaten path and am using kopf as a DaemonSet: instead of 1-3 operator copies per cluster, I've got 1 operator per node. My custom resources are created/updated/deleted on the cluster, and rather than having an operator which owns them, all of the DaemonSet operators simply react to the events of those resources and perform an action on their given node. Obviously, I'm giving up some of the niceness of resource ownership here, and I'm totally fine with that. I've been running this project in production for quite some time now, and the only real issue I've had is what happens when my handlers fail. Per the docs:
I can understand that, without persistence on the object itself, this may not be something kopf can handle, but I'd like to see what options I may be missing here. From my standpoint, I've got:
Am I missing anything? Any input is much appreciated! Also -- thanks for kopf, it's been a fantastic tool for my team!
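P.S. For concreteness, here is a stripped-down version of what each per-node operator does (the group/version/plural and the handler body are illustrative, not the real code):

```python
import kopf


@kopf.on.event('example.com', 'v1', 'nodeconfigs')  # illustrative resource names
def apply_on_this_node(body, logger, **_):
    # ... the actual node-local work for this resource goes here ...
    logger.info(f"Reconfiguring this node for {body['metadata']['name']}")
    # If the work above raises, the error is only logged: on.event handlers
    # are never retried, which is exactly the failure mode I'm asking about.
```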
-
Hello. This is quite an interesting approach! Thanks for sharing it. It is even worth mentioning in one of the sample use-cases (when that section is added someday).

Now, to the question. Yes, the event handlers are not retried, as they have no state persistence. On top of that, it is expected that all handlers except daemons, i.e. event-/state-handlers & timers, finish in a relatively short time. Otherwise, the object's state can change during the long handling (by a 3rd party, e.g. users) and you will have inconsistencies between what is being handled and what is stored in the cluster. This rules out the 1st way (own retrying) as a good approach.

The 3rd way (daemons) might be an option. However, they also have their own persistence, so multiple operators/daemons from multiple nodes will collide with each other. Not good. On top of that, daemons put finalizers on the resource (to ensure that they can exit properly before actually releasing the resource), and they will also fight over the finalizers. Not good at all. I cannot even imagine all the complications this way leads to.

The 2nd way seems the most promising here: the persistence settings. But you need a hack so that multiple operators do not overlap with each other's states:

```python
import os

import kopf


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    node = os.environ.get('NODE')
    settings.persistence.progress_storage = kopf.AnnotationsProgressStorage(prefix=f'my-op-{node}.example.com')
    settings.persistence.diffbase_storage = kopf.AnnotationsDiffBaseStorage(prefix=f'my-op-{node}.example.com')
    settings.persistence.finalizer = f'my-op-{node}.example.com/kopf-finalizer'
```

(UPD: Or the same with the …)

Then start a pod with that env var picked from the pod's field:
```yaml
apiVersion: apps/v1
kind: DaemonSet
spec:
  template:
    spec:
      containers:
        - name: main
          env:
            - name: NODE
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName  # NB: relative to the pod's root!
```

So, every instance of your operator will believe it is a separate operator; they will not collide with each other, will not play ping-pong on each other's changes, and all should work. Mind that it should be an id/IP address/name of the node, not of the pod: the pod can be restarted several times on a single node, and you do not want the new operator pod to reprocess all of its resources.

Also, make sure that there are no finalizers (e.g. there are no on-deletion handlers, or they are marked as optional). Otherwise, the operators will put their finalizers on the resource but will never remove them: once the node is gone, there will be no operator believing the finalizer is its own to care about.

But there is a downside to this. Depending on the size of the cluster, if you have e.g. 1000 nodes, you will have 2000+ annotations on every object, sometimes 3000-4000 for a moment (the temporary progress annotations are later removed). I am not sure if there are system limits on the number and total size of annotations in Kubernetes, but the resource's YAML and the operators' logs will be flooded with all this information.

Instead, you can implement your own storage (diffbase & progress) which, in addition to Kopf's key, uses the node's name in the primary key. Your own storage can be Redis, Postgres, or whatever you prefer the most.

If the daemonsets can have access to a host path that is shared between restarted pods on the same node, you can use the filesystem as the storage. Generally, this is not a good practice, but for daemonsets it is just fine: the storage will exist as long as the operator's pods on that node exist; once the node is gone, the storage is gone, but so are the operator's pods, i.e. whole-lifetime persistence is guaranteed; it is just the "lifetime" that is redefined here. For example:
```yaml
apiVersion: apps/v1
kind: DaemonSet
spec:
  template:
    spec:
      containers:
        - name: main
          volumeMounts:
            - mountPath: /op-data
              name: op-data
      volumes:
        - name: op-data
          hostPath:
            path: /op-data
```

For that, inherit from `kopf.DiffBaseStorage` and `kopf.ProgressStorage`:
```python
import json
import os
from typing import Optional

import kopf


class FSDiffBaseStorage(kopf.DiffBaseStorage):
    """Stores each object's last-handled essence as a JSON file named by its UID."""

    def fetch(self, *, body: kopf.Body) -> Optional[kopf.BodyEssence]:
        uid = body.metadata.uid
        path = os.path.join('/op-data', f'{uid}.json')
        if not os.path.exists(path):
            return None
        with open(path, 'rt', encoding='utf-8') as f:
            return json.loads(f.read())

    def store(self, *, body: kopf.Body, patch: kopf.Patch, essence: kopf.BodyEssence) -> None:
        uid = body.metadata.uid
        path = os.path.join('/op-data', f'{uid}.json')
        with open(path, 'wt', encoding='utf-8') as f:
            f.write(json.dumps(essence))


class FSProgressStorage(kopf.ProgressStorage):
    ...  # fetch, store, purge, touch


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    settings.persistence.progress_storage = FSProgressStorage()
    settings.persistence.diffbase_storage = FSDiffBaseStorage()
```
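For reference, a minimal sketch of what that `FSProgressStorage` stub could look like, mirroring the per-UID JSON files of the diffbase storage above. The `/op-data` path and file naming are assumptions, and the method names & keyword arguments follow kopf's `ProgressStorage` interface as documented, so double-check them against your kopf version:

```python
import json
import os
from typing import Any, Dict, Optional

import kopf


class FSProgressStorage(kopf.ProgressStorage):
    """Keeps one JSON file per object UID with all of that object's handler progress records."""

    def __init__(self, path: str = '/op-data') -> None:
        super().__init__()
        self.path = path

    def _file(self, body: kopf.Body) -> str:
        return os.path.join(self.path, f'{body.metadata.uid}.progress.json')

    def _load(self, body: kopf.Body) -> Dict[str, Any]:
        try:
            with open(self._file(body), 'rt', encoding='utf-8') as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def _save(self, body: kopf.Body, data: Dict[str, Any]) -> None:
        with open(self._file(body), 'wt', encoding='utf-8') as f:
            json.dump(data, f)

    def fetch(self, *, key, body: kopf.Body):
        # `key` is the handler id (a str subtype); returns the stored progress record or None.
        return self._load(body).get(key)

    def store(self, *, key, record, body: kopf.Body, patch: kopf.Patch) -> None:
        data = self._load(body)
        data[key] = dict(record)  # progress records are JSON-serialisable dicts
        self._save(body, data)

    def purge(self, *, key, body: kopf.Body, patch: kopf.Patch) -> None:
        data = self._load(body)
        data.pop(key, None)
        self._save(body, data)

    def touch(self, *, body: kopf.Body, patch: kopf.Patch, value: Optional[str]) -> None:
        pass  # nothing to "touch" for file-based storage; only annotation-based storages need dummies.
```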
-
For anyone stumbling on this a few years later: the prefixed-annotation progress_storage worked perfectly for our use case. I ended up using the node name passed in via the Downward API, and truncating it slightly, since it made our annotation keys too long in certain environments.
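Roughly what that looks like; the env var name, the domain in the prefix, and the cut-off/hash lengths here are illustrative, not anything kopf mandates:

```python
import hashlib
import os

import kopf


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    node = os.environ.get('NODE', 'unknown')  # injected via the Downward API (spec.nodeName)
    # Node names can be long FQDNs; keep the prefix short but still unique per node
    # by combining a truncated name with a short hash of the full name.
    short = node.split('.')[0][:12].strip('-')
    digest = hashlib.sha256(node.encode('utf-8')).hexdigest()[:6]
    prefix = f'my-op-{short}-{digest}.example.com'
    settings.persistence.progress_storage = kopf.AnnotationsProgressStorage(prefix=prefix)
    settings.persistence.diffbase_storage = kopf.AnnotationsDiffBaseStorage(prefix=prefix)
    settings.persistence.finalizer = f'{prefix}/kopf-finalizer'
```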