This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Not removing unreachable peers due to lock from nonexistent peer #3386

Closed
bboreham opened this issue Aug 29, 2018 · 0 comments · Fixed by #3416

Opening a new issue to make the conversation clear; this follows from #3310 (comment) but I want to address the part relating to unreachable peers and leave #3310 focused on "attempting to claim same IP range".

The log makes it clear why the peers were not removed:

DEBU: 2018/08/28 20:26:35.539292 [kube-peers] Nodes that have disappeared: map[ip-10-80-78-181.ec2.internal:{da:3e:d4:a5:e9:42 ip-10-80-78-181.ec2.internal} ip-10-80-91-173.ec2.internal:{96:be:47:91:4d:4b ip-10-80-91-173.ec2.internal}]
DEBU: 2018/08/28 20:26:35.539319 [kube-peers] Preparing to remove disappeared peer {da:3e:d4:a5:e9:42 ip-10-80-78-181.ec2.internal}
DEBU: 2018/08/28 20:26:35.539334 [kube-peers] Existing annotation 36:50:e4:04:ea:fa
DEBU: 2018/08/28 20:26:35.539352 [kube-peers] Preparing to remove disappeared peer {96:be:47:91:4d:4b ip-10-80-91-173.ec2.internal}
DEBU: 2018/08/28 20:26:35.539360 [kube-peers] Noting I plan to remove  96:be:47:91:4d:4b
DEBU: 2018/08/28 20:26:35.547349 weave DELETE to http://127.0.0.1:6784/peer/96:be:47:91:4d:4b with map[]
INFO: 2018/08/28 20:26:35.553231 [kube-peers] rmpeer of 96:be:47:91:4d:4b: 135880 IPs taken over from 96:be:47:91:4d:4b

DEBU: 2018/08/28 20:26:35.574340 [kube-peers] Nodes that have disappeared: map[ip-10-80-78-181.ec2.internal:{da:3e:d4:a5:e9:42 ip-10-80-78-181.ec2.internal}]
DEBU: 2018/08/28 20:26:35.574392 [kube-peers] Preparing to remove disappeared peer {da:3e:d4:a5:e9:42 ip-10-80-78-181.ec2.internal}
DEBU: 2018/08/28 20:26:35.574404 [kube-peers] Existing annotation 36:50:e4:04:ea:fa
[...]
DEBU: 2018/08/28 20:26:35.705345 [kube-peers] Nodes that have disappeared: map[ip-10-80-78-181.ec2.internal:{da:3e:d4:a5:e9:42 ip-10-80-78-181.ec2.internal}]
DEBU: 2018/08/28 20:26:35.705377 [kube-peers] Preparing to remove disappeared peer {da:3e:d4:a5:e9:42 ip-10-80-78-181.ec2.internal}
DEBU: 2018/08/28 20:26:35.705390 [kube-peers] Existing annotation 36:50:e4:04:ea:fa
[...]

The "Existing annotation" line says that another peer, ID 36:50:e4:04:ea:fa, has "locked" that record in order to clean it up. Only one peer can be allowed to clean up at a time, as described in #2797.

Now, peer 36:50:e4:04:ea:fa is not in evidence anywhere in the log, except in the "Existing annotation" messages. So the lock persists forever. There are 3,709 of those lines, all citing 36:50:e4:04:ea:fa as the owner, which is puzzling.
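
To make the failure mode concrete, here is a minimal sketch in Go of the locking behaviour described above. It is not the kube-peers code: the `annotations` map, the `known` peer set, the peer ID `aa:aa:aa:aa:aa:aa` and both function names are hypothetical, and the real lock needs an atomic update of a shared record, which this sketch ignores. The second function shows one possible remedy, letting a peer take over a lock whose holder is not a known peer.

```go
package main

import "fmt"

// annotations is a hypothetical stand-in for the shared record in which a
// peer notes that it plans to remove a disappeared peer (target -> holder).
type annotations map[string]string

// tryLockStrict mirrors the behaviour seen in the log: if any other peer
// has already annotated the target, give up, even if that peer is gone.
func tryLockStrict(a annotations, target, owner string) bool {
	if holder, locked := a[target]; locked && holder != owner {
		return false // logged as "Existing annotation <holder>"
	}
	a[target] = owner
	return true
}

// tryLockWithTakeover is an assumed remedy: honour the lock only if its
// holder is still a known peer; otherwise treat it as stale and take over.
func tryLockWithTakeover(a annotations, known map[string]bool, target, owner string) bool {
	holder, locked := a[target]
	if locked && holder != owner && known[holder] {
		return false // a live peer is already cleaning up this target
	}
	a[target] = owner
	return true
}

func main() {
	// The disappeared peer is locked by a holder that no longer exists.
	a := annotations{"da:3e:d4:a5:e9:42": "36:50:e4:04:ea:fa"}
	known := map[string]bool{"aa:aa:aa:aa:aa:aa": true}

	strict := tryLockStrict(a, "da:3e:d4:a5:e9:42", "aa:aa:aa:aa:aa:aa")
	takeover := tryLockWithTakeover(a, known, "da:3e:d4:a5:e9:42", "aa:aa:aa:aa:aa:aa")
	fmt.Println(strict, takeover) // prints "false true"
}
```

With the strict check, every pass over the disappeared nodes hits the same stale annotation and stops, which would match the repeated "Existing annotation" lines in the log.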

@bboreham bboreham added this to the 2.4.1 milestone Aug 30, 2018
@bboreham bboreham modified the milestones: 2.4.1, 2.4.2 Sep 13, 2018
murali-reddy added a commit that referenced this issue Sep 27, 2018
…but no longer

exists hence lock persists forever, fix makes a peer own the reclaim when nonexistent node found

Fixes #3386
murali-reddy added a commit that referenced this issue Sep 27, 2018
…) but no longer

exists hence lock persists forever, fix makes a peer own the reclaim when nonexistent node found

Fixes #3386
@bboreham bboreham modified the milestones: 2.4.2, 2.5 Oct 9, 2018