WIP: remove peers that have disappeared from kubernetes #3022
prog/kube-peers/main.go
Outdated
// This should be sufficiently rare that we don't care.

// Question: Should we check against Weave Net IPAM?
// i.e. If peer X owns any address space and is marked unreachable, we want to rmpeer X
prog/kube-peers/main.go
Outdated
func reclaimRemovedPeers(apl *peerList, nodes []nodeInfo) error {
	// TODO
	// Outline of function:
	// 1. Compare peers stored in the peerList against all peers reported by k8s now.
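Step 1 of the outline could be sketched roughly as below. This is only an illustration: the `peerInfo` and `nodeInfo` fields and the `missingPeers` helper are hypothetical stand-ins, not the types or functions in prog/kube-peers/main.go, and the sketch assumes a peer is matched to a node by node name.

```go
package main

import (
	"fmt"
	"sort"
)

// peerInfo and nodeInfo are hypothetical stand-ins for the real types;
// the actual definitions in kube-peers may differ.
type peerInfo struct {
	PeerName string
	NodeName string
}

type nodeInfo struct {
	name string
	addr string
}

// missingPeers returns the stored peers whose node is no longer
// reported by Kubernetes, sorted by peer name for stable output.
func missingPeers(stored []peerInfo, nodes []nodeInfo) []peerInfo {
	current := make(map[string]struct{}, len(nodes))
	for _, n := range nodes {
		current[n.name] = struct{}{}
	}
	var gone []peerInfo
	for _, p := range stored {
		if _, ok := current[p.NodeName]; !ok {
			gone = append(gone, p)
		}
	}
	sort.Slice(gone, func(i, j int) bool { return gone[i].PeerName < gone[j].PeerName })
	return gone
}

func main() {
	stored := []peerInfo{
		{PeerName: "aa:aa", NodeName: "node-1"},
		{PeerName: "bb:bb", NodeName: "node-2"},
	}
	nodes := []nodeInfo{{name: "node-1", addr: "10.0.0.1"}}
	for _, p := range missingPeers(stored, nodes) {
		fmt.Printf("candidate for rmpeer: %s (was on %s)\n", p.PeerName, p.NodeName)
	}
}
```

Peers returned here are only candidates: the later steps of the outline still decide whether this peer is the one that should run rmpeer.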
- apiVersion: rbac.authorization.k8s.io/v1beta1
  kind: Role
  metadata:
    name: weave-net2
prog/kube-peers/main.go
Outdated
// 8. If step 5 failed due to optimistic lock conflict, stop: someone else is handling X
// 9. Else there is an existing annotation at step 3; call its owner peer Y.
// 10. If Y is not known to k8s and marked unreachable, recurse to remove Y at 3
// 11. If we succeed to claim the annotation for Y, remove its annotation for X too
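The optimistic-lock behaviour these steps rely on can be illustrated with a small in-memory model. Everything here is an assumption made for illustration: `annotationStore`, `claim` and the key names are hypothetical, the integer version stands in for a Kubernetes resource version, and the earlier steps of the outline are not shown in this excerpt, so the sketch only assumes they read the annotation and then try to write it.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict models an optimistic-lock failure: the object changed
// between our read and our write.
var errConflict = errors.New("conflict: resource version changed")

// annotationStore is a hypothetical in-memory stand-in for an object's
// annotations guarded by a single resource version.
type annotationStore struct {
	version     int
	annotations map[string]string
}

func newStore() *annotationStore {
	return &annotationStore{annotations: map[string]string{}}
}

// get returns the current value for key and the version it was read at.
func (s *annotationStore) get(key string) (string, int) {
	return s.annotations[key], s.version
}

// update writes key=value only if nothing changed since seenVersion.
func (s *annotationStore) update(key, value string, seenVersion int) error {
	if seenVersion != s.version {
		return errConflict
	}
	s.annotations[key] = value
	s.version++
	return nil
}

// claim tries to record owner as the peer handling key. It reports who
// holds the annotation afterwards and whether the caller now holds it.
func claim(s *annotationStore, key, owner string) (holder string, ok bool, err error) {
	existing, seen := s.get(key)
	if existing != "" && existing != owner {
		return existing, false, nil // existing annotation: its owner is "Y"
	}
	if err := s.update(key, owner, seen); err != nil {
		return "", false, err // conflict: someone else is handling it
	}
	return owner, true, nil
}

func main() {
	s := newStore()
	const key = "rmpeer-X"

	// Two peers read the same version, then both try to write.
	_, seenByA := s.get(key)
	_, seenByB := s.get(key)
	fmt.Println("A:", s.update(key, "peer-A", seenByA))
	fmt.Println("B:", s.update(key, "peer-B", seenByB))

	// A later caller finds the existing annotation and learns its owner.
	holder, ok, err := claim(s, key, "peer-C")
	fmt.Println(holder, ok, err)
}
```

In this model the second writer sees the conflict and stops, which corresponds to step 8; a caller that instead finds an existing annotation gets its owner back, which is the starting point for steps 9 to 11.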
It was a hack for a badly-configured Kubernetes install: it shouldn't be necessary, it will only work on an unsecured install, and it generates confusing error messages when something is really wrong.
9e83787 to 040d7b7
I have squashed some of the commits and written an implementation of the pseudo-code previously discussed. Next steps: try it out, figure out how to test it properly.
Also still to do, as noted earlier: We need to distinguish between "stopped but coming back" and "stopped and never coming back". Possibly we could extend the use of the peerList annotation to say "If I'm not currently listed as a peer then delete any old IPAM persistence data"?
I'm not very familiar with weave, but why? If a node disappears and we release its IPs etc., and it comes back later, can't we just "re-setup" it (with new IPs and all)?
Weave Net has no central point of control: its native data structure is a CRDT. The "release its IPs" step breaks the rules of the CRDT, so we need some extra mechanism to avoid going mad.
hmm, interesting. Thanks for explaining @bboreham :)
Replaced by #3149
Fixes #2797
Using Kubernetes annotations as a way to ensure only one peer at a time runs rmpeer.