Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IP ranges cleanup #118

Merged
merged 16 commits into from
Jul 22, 2021
Merged

Conversation

maiqueb
Copy link
Collaborator

@maiqueb maiqueb commented Jul 5, 2021

This PR adds a cronjob that runs every 5 minutes, and checks the consistency of the IP addresses currently assigned by the pools.

Depends-on: #113

TODO:

  • retrieve a list of Pods once instead of checking N times if pods exist
  • ensure the ip-reconciler runs at most a little bit less that the provided cron expression
  • missing unit tests (I'd have to mock the datastore, and I didn't want to establish that precedent...) but it is tested e2e
    • I'll rework the public API I'm proposing so I can unit test this without requiring mocking. I've got a fairly good idea on how to do it
    • the "find orphaned" part misses unit tests, since I'd need to mock the datastore. Refactoring the code to enable that will require plenty of changes, which'll make this PR harder to review; imho, it can happen in a follow up.

@coveralls
Copy link

coveralls commented Jul 5, 2021

Pull Request Test Coverage Report for Build 1056425921

  • 66 of 207 (31.88%) changed or added relevant lines in 6 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+11.4%) to 40.866%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/storage/etcd.go 0 6 0.0%
pkg/reconciler/wrappedPod.go 32 40 80.0%
pkg/allocate/allocate.go 0 10 0.0%
cmd/reconciler/ip.go 0 28 0.0%
pkg/reconciler/iploop.go 32 121 26.45%
Files with Coverage Reduction New Missed Lines %
pkg/allocate/allocate.go 1 66.5%
Totals Coverage Status
Change from base Build 1052931641: 11.4%
Covered Lines: 434
Relevant Lines: 1062

💛 - Coveralls

@crandles crandles closed this Jul 5, 2021
@crandles crandles reopened this Jul 5, 2021
@crandles
Copy link
Collaborator

crandles commented Jul 5, 2021

Sorry! reviewing this while mobile

Comment on lines +73 to +62
BeforeEach(func() {
Expect(k8sClientSet.CoreV1().Pods(namespace).Delete(pod.Name, &metav1.DeleteOptions{})).NotTo(HaveOccurred())
})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about it, I probably shouldn't even start up the pod; just the pool pointing at it.

@maiqueb
Copy link
Collaborator Author

maiqueb commented Jul 6, 2021

Hey; any clue on what went wrong generating the openshift image, @dougbtv ? https://github.com/k8snetworkplumbingwg/whereabouts/pull/118/checks?check_run_id=2992364690

@lgtm-com
Copy link

lgtm-com bot commented Jul 12, 2021

This pull request introduces 1 alert when merging 731e1ff into f1047ee - view on LGTM.com

new alerts:

  • 1 for Useless assignment to local variable

Copy link
Member

@dougbtv dougbtv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Miguel, my hat is off to you. This is incredible. I'm ready to move forward with it. I have a nit for a doc update. Let's see if we can get you another review too. But, I have to say I'm particularly happy with this PR, major thanks.

RunSpecs(t, "Reconcile IP address allocation in the system")
}

// mock the pool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like the mock pool technique, great one.

doc/ip-reconciler-job.yaml Show resolved Hide resolved

func (rl ReconcileLooper) isPodAlive(podRef string) bool {
for _, livePodRef := range rl.livePodRefs {
if podRef == livePodRef {
Copy link
Contributor

@TothFerenc TothFerenc Jul 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to check whether the running Pod really owns the reserved IP.

StatefulSet Pods keep their names whenever they are recreated, so it might happen that a StatefulSet Pod got new IP in the meantime, while previous IP is still reserved in IPPool pointing to the same running Pod, so the previous IP is unused, but won't be released by this logic.

Copy link
Collaborator Author

@maiqueb maiqueb Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StatefulSet Pods keep their names whenever they are recreated, so it might happen that a StatefulSet Pod got new IP in the meantime, while previous IP is still reserved in IPPool pointing to the same running Pod, so the previous IP is unused, but won't be released by this logic.

Really good point, I hadn't considered that scenario - at all. Thanks for raising it.

I've added a couple of commits to handle this particular scenario - the implementation I'm proposing only tackles secondary interfaces (plugged by multus).

Hope this limitation is not a deal breaker for you.

I'll account for @dougbtv 's comments next, and also document the reconciler's limitations described above.

EDIT: nudging for attention @TothFerenc

To preserve the existing code base, we embed the new type within
the KubernetesIPAM (old type).

In a follow-up commit, this new struct will be used on a separate
golang binary (which doesn't require the KubernetesIPAM
configuration), to provide a simple connection to the Kubernetes
backend (does not support etcd).

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Fow now, we only read the kubeconfig file, which is specified via
the cmd line - as its only argument.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Introduce a dedicated constructor in the storage pkg to create the
newly created `KubernetesClient` struct. The existing
`KubernetesIPAM` struct will re-use this new constructor.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Follow-up commits will use this new method for debugging, which
makes it simpler to track the IP to pod assignments.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Currently, matching the owner of the IP when deallocating the
address happens only based on the containerID.

In follow-up commits we will require this behavior to happen based
on the pod reference attribute.

This commit prepares the code base for that, by enabling the caller
to inject the criteria for identifying the owner of an IP into the
IP address de-allocation function.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
@maiqueb maiqueb force-pushed the ip-ranges-cleanup branch 4 times, most recently from 385e34e to fa3455d Compare July 22, 2021 13:56
This behavior will be encapsulated in a new struct named
`ReconcileLooper`, which features methods for:
  - CancelContext; cancels the current execution context
  - FindOrphanedIPsPerPool; returns a list of the orphaned
    IP addresses, indexed by pool
  - ReconcileIPPool; receives a list of orphaned IP
    addresses, and cleans them up from the respective pools
    where they are currently allocated.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
The service account for whereabouts requires the ability of
listing the pods on all namespaces, so it can check which ones
feature orphaned IPs.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Only mark a pod as "alive" when the pod's annotations feature the
IP being de-allocated.

This makes the reconciler binary *dependent* on multus, which adds
these `network-status` annotations into the pod.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
By not returning an `IPReservation` entry - we now return an IP
address instead - we simplify the code.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
Signed-off-by: Miguel Duarte Barroso <[email protected]>
This commit changes the reconciler logic to only parse the
network-status annotations of whereabouts pods - e.g. pods whose
pod referecences (<namespace>/<pod name>) feature in any of the
whereabouts IPPools.

Signed-off-by: Miguel Duarte Barroso <[email protected]>
Copy link
Member

@dougbtv dougbtv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome one. Working great in my environment. Thank you Miguel!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants