
WIP: 2797 should recover ips on peer loss #3171

Closed
wants to merge 2 commits

Conversation

bricef
Contributor

@bricef bricef commented Nov 14, 2017

Currently, weave will recover missing peers on relaunch, as per #3149. However, some issues remain with updates (see #3170).

Furthermore, weave does not currently deal properly with peers going down while it is running: the IP space is only recovered after a weave agent is relaunched.

In an ideal world, the IP space would be dynamically recovered and re-distributed. This PR includes a failing test to that end.

@bboreham
Contributor

"IP space dynamically recovered" is a niche benefit - we only really need to recover at the time we run out.

We could have some modest background task where any peer can say at any time "I perceive that I have 1% of the address space and someone else has 90%; I will ask for some more" - that would help with "re-distribute" and also lessen the impact of a delay in reclaiming.

We also want to avoid gratuitously fragmenting the overall space, although some fragmentation may be acceptable as a consequence of existing heuristics.
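For illustration, a minimal sketch of such a background check, assuming invented types and thresholds rather than weave's actual IPAM code:

```go
// Sketch of the "modest background task" idea. The ringView type, the
// thresholds, and the request callback are all hypothetical stand-ins;
// this is not weave's real IPAM code.
package main

import "fmt"

// ringView is a stand-in for one peer's view of address-space ownership.
type ringView struct {
	ownFraction  float64            // share of the space this peer owns
	peerFraction map[string]float64 // shares owned by other peers
}

// rebalance implements the heuristic: if I hold almost nothing and some
// other peer holds most of the space, ask that peer for more.
func rebalance(v ringView, request func(peer string)) {
	if v.ownFraction >= 0.01 {
		return // we have enough; do nothing this round
	}
	for peer, frac := range v.peerFraction {
		if frac >= 0.9 {
			request(peer)
			return
		}
	}
}

func main() {
	// In a real agent this would run periodically, not once.
	view := ringView{
		ownFraction:  0.005,
		peerFraction: map[string]float64{"peer-a": 0.9, "peer-b": 0.095},
	}
	rebalance(view, func(peer string) {
		fmt.Printf("asking %s for more address space\n", peer)
	})
}
```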

@bboreham bboreham changed the title from "2797 should recover ips on peer loss" to "WIP: 2797 should recover ips on peer loss" Nov 14, 2017
@bricef
Contributor Author

bricef commented Nov 15, 2017

I think I get your point. Unreachable or badly distributed addresses aren't a problem unless they affect function. In most autoscaling scenarios, the launch of a new instance would recover unreachable slices anyway.

I wonder if this cleanup and management should be triggered when weave is asked to provide a new address to a user service. We'd be doing work at that point anyway, it would be triggered by user action, and it would avoid the need for a background process. That way, weave can say (see the sketch after this list):

  1. I need a new address
  2. I don't have any available
  3. Are there unreachable hosts I can recover?
  4. If not, are there hosts with a slice I could steal?
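A rough sketch of that allocation path, with hypothetical names (the allocator type and its methods are stand-ins, not weave's real allocator API):

```go
// Sketch of allocation-time reclaim following the four steps above.
// Everything here is a hypothetical stand-in, not weave's real IPAM.
package main

import (
	"errors"
	"fmt"
)

var errNoSpace = errors.New("no free addresses anywhere")

type allocator struct {
	free int // count of free addresses in ranges this peer owns
}

// allocate: step 1 is the call itself; steps 2-4 follow in order.
func (a *allocator) allocate() (string, error) {
	if a.free > 0 { // step 2: an address is available locally
		a.free--
		return "10.32.0.1", nil // address selection elided
	}
	if a.reclaimUnreachable() { // step 3: recover space from dead peers
		return a.allocate()
	}
	if a.requestSliceFromPeer() { // step 4: ask a live peer for a slice
		return a.allocate()
	}
	return "", errNoSpace
}

// reclaimUnreachable would take over ranges owned by peers known to be
// unreachable; stubbed to "found nothing" here.
func (a *allocator) reclaimUnreachable() bool { return false }

// requestSliceFromPeer would ask a live peer to donate part of its
// range; stubbed to "received 8 addresses" here.
func (a *allocator) requestSliceFromPeer() bool { a.free = 8; return true }

func main() {
	a := &allocator{}
	addr, err := a.allocate()
	fmt.Println(addr, err)
}
```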

Maybe this would have too much of an effect on latency?

@bboreham
Contributor

Since the reclaim process is (currently) highly Kubernetes-specific, coupling step 3 to step 1 is problematic.

Steps 1, 2 and 4 are what IPAM does already, although it calls it "request" rather than "steal".

@brb
Contributor

brb commented Jan 6, 2018

Is it still WIP?

@bboreham
Contributor

I just realised I am pointing other issues at this one, but this is a PR, not an issue.

@bboreham
Contributor

bboreham commented Nov 1, 2018

Replaced by #3399

@bboreham bboreham closed this Nov 1, 2018
@bboreham bboreham added this to the n/a milestone May 16, 2019