[Antrea-IPAM] Support pre-allocating continuous IPs for StatefulSet #3281
Codecov Report
@@ Coverage Diff @@
## main #3281 +/- ##
===========================================
- Coverage 65.51% 53.58% -11.93%
===========================================
Files 277 392 +115
Lines 27500 43040 +15540
===========================================
+ Hits 18016 23063 +5047
- Misses 7567 17655 +10088
- Partials 1917 2322 +405
bdf284c to bfe3f41
bfe3f41 to d62620d
d62620d to de41bf4
/test-all
LGTM
/test-integration
Would appreciate feedback - we need to distinguish between |
@annakhm Please use |
de41bf4 to 4f96f3f
/test-all
LGTM
I did not do a thorough review of this PR, but I trust the existing reviews, so you do not need my approval to merge it.
3a299cc to 5abb050
/test-all
All e2e failed.
/test-e2e
/test-flexible-ipam-e2e
LGTM
/test-windows-all
/test-flexible-ipam-e2e
/test-windows-proxyall-e2e
key := k8s.NamespacedName(ss.Namespace, ss.Name)

c.statefulSetMap.Store(key, ss)
c.statefulSetQueue.Add(addEventIndication + key)
Agree with @jianjuns. And there are some other potential issues:
- A workqueue is meaningless here if all that is needed is a FIFO; in that case a channel is more appropriate.
- Because the key in the workqueue has an "a" or "d" prefix, one object's add and delete events become two separate items in the workqueue. That makes it impossible to increase the number of workers that handle StatefulSets, which may not scale well.
- It cannot support a change in the number of replicas, as processNextStatefulSetWorkItem processes the StatefulSet using the "snapshot" of its spec.
- As there is only a single worker today, processing the StatefulSet based on a "snapshot" is more error-prone. For example, if multiple StatefulSets are recreated, the worker has to process their deletions serially and then their creations. In the meantime, Pods of the new StatefulSets may already have been created, so the worker may release their IP allocations by mistake when it handles the deletion event of a StatefulSet, leading to duplicate allocation.
- With a single worker, creation of multiple StatefulSets may be handled more slowly than kubelet creates Pods, leading to many discontinuous IP allocations in practice.
- As your comment points out, transient errors when processing a delete event will lead to an IP leak that is only fixed when the controller restarts, which is not very robust.
All controllers need to consider the recreate case, and the typical controller pattern still works for them. I think it's applicable to IPAM as well.
What the controller does is ensure that the StatefulSet's replica count == number of continuous IPs in the IPPool, and we don't really care about the UID of the StatefulSet. So even after the StatefulSet is deleted and recreated, we don't really need to delete all its IP allocations from the IPPool first and then re-allocate.
It should work if the controller is implemented in this way:
- The event handlers only need to enqueue the key (namespace + name) of the StatefulSet, regardless of whether it is an add, update, or delete event (some update events can be ignored if the fields we care about don't change).
- The worker gets the key from the queue and checks the latest spec in the lister.
  - If the StatefulSet doesn't exist, it cleans up all IP allocations of this StatefulSet.
  - If the StatefulSet exists, it allocates or releases IPs to make sure the number of IP allocations equals the number of replicas. This is quite similar to ReplicaSetController, which ensures the number of Pods equals the ReplicaSet's replica count.
In this way, the controller can run multiple workers to process StatefulSets concurrently, can handle changes in the number of replicas, and avoids introducing another store and the potential race conditions that come with it.
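The pattern described in this comment can be sketched in plain Go. This is only an illustration under stated assumptions, not the code in this PR: the `controller` type, the `lister` and `reserved` maps, and `syncStatefulSet` are hypothetical stand-ins for an informer lister and the IPPool, and a buffered channel stands in for a real workqueue (which would additionally deduplicate keys).

```go
package main

import (
	"fmt"
	"sync"
)

// statefulSetSpec is a hypothetical stand-in for the StatefulSet fields
// the controller cares about.
type statefulSetSpec struct {
	Replicas int
}

// controller is a toy model: lister plays the role of the informer cache,
// reserved plays the role of per-StatefulSet reservations in the IPPool.
type controller struct {
	mu       sync.Mutex
	lister   map[string]statefulSetSpec
	reserved map[string]int
}

// syncStatefulSet reconciles one key against the latest state in the lister.
// It does not know which event (add, update, delete) caused the enqueue.
func (c *controller) syncStatefulSet(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	ss, exists := c.lister[key]
	if !exists {
		// The StatefulSet is gone: release everything reserved for it.
		delete(c.reserved, key)
		return
	}
	// The StatefulSet exists: make the reservation count match the replicas.
	c.reserved[key] = ss.Replicas
}

// run drains the given keys with several workers. Because each worker only
// reconciles against current state, the order of processing does not matter.
func (c *controller) run(keys []string, workers int) {
	queue := make(chan string, len(keys))
	for _, k := range keys {
		queue <- k
	}
	close(queue)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for k := range queue {
				c.syncStatefulSet(k)
			}
		}()
	}
	wg.Wait()
}

func main() {
	c := &controller{
		lister:   map[string]statefulSetSpec{"default/web": {Replicas: 3}},
		reserved: map[string]int{"default/old": 2}, // stale reservation
	}
	c.run([]string{"default/web", "default/old"}, 2)
	fmt.Println(c.reserved["default/web"]) // 3
	_, stillReserved := c.reserved["default/old"]
	fmt.Println(stillReserved) // false
}
```

Since the queue carries only keys, a delete followed by a recreate of the same StatefulSet collapses into "reconcile against whatever exists now", which is what removes the need for a separate snapshot map.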
To provide a better user experience, AntreaIPAM will try to reserve a continuous IP range for a StatefulSet. If unsuccessful, IPs for the StatefulSet will be allocated on the fly, as before.
Signed-off-by: Anna Khmelnitsky <[email protected]>
5abb050 to 1f7c8a5
/test-all
/test-flexible-ipam-e2e
}

size := int(*ss.Spec.Replicas)
err = allocator.AllocateStatefulSet(ss.Namespace, ss.Name, size)
Have we checked that no IP is already allocated for this StatefulSet?
And add comments somewhere to indicate we do not support IPPool changes.
AllocateStatefulSet has this check inside: IPs will not be preallocated if this StatefulSet is already present in the pool: https://github.com/antrea-io/antrea/pull/3281/files#diff-5e4e20e1e087d03ded8c46066bd42db715fc14fa73a9e4ac6fcfeab83e8d7f4cR412
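To make the described check concrete, here is a minimal sketch of an idempotent preallocation over a toy pool. It is not the implementation linked above: the `pool` type, the `owners` slice, and the `preallocate` function are hypothetical names used only for illustration, and real IPPool bookkeeping is more involved.

```go
package main

import "fmt"

// pool is a toy IP pool: owners[i] is the StatefulSet key that holds
// offset i, or "" when the offset is free.
type pool struct {
	owners []string
}

// hasOwner reports whether key already holds any offset in the pool.
func (p *pool) hasOwner(key string) bool {
	for _, o := range p.owners {
		if o == key {
			return true
		}
	}
	return false
}

// preallocate reserves size contiguous free offsets for key and returns the
// first offset. It returns -1 and changes nothing if key is already present
// in the pool or if no contiguous free run of that size exists.
func (p *pool) preallocate(key string, size int) int {
	if size <= 0 || p.hasOwner(key) {
		return -1
	}
	run := 0
	for i, o := range p.owners {
		if o != "" {
			run = 0
			continue
		}
		run++
		if run == size {
			start := i - size + 1
			for j := start; j <= i; j++ {
				p.owners[j] = key
			}
			return start
		}
	}
	return -1
}

func main() {
	p := &pool{owners: make([]string, 8)}
	p.owners[2] = "default/other" // one address is already taken

	fmt.Println(p.preallocate("default/web", 3)) // 3
	fmt.Println(p.preallocate("default/web", 3)) // -1: already present
}
```

Because the second call is a no-op, the caller can invoke preallocation on every reconcile without first checking the pool itself.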
And could we also add a comment about this then? It is just easier to understand.
Sure, will do
We have a note in the docs explaining that the IPPool annotation cannot be changed without recreating the resource: "Note that the IP pool annotation cannot be updated or deleted without recreating the resource." We should add this to validation.
That is great. But I still feel it is better to add a comment in the code, so reviewers/readers like me can easily understand why we do not check whether the pool changes.
Added comment in code
Thank you again @tnqn and @jianjuns. Indeed, once refactored according to your suggestions, the code is more straightforward and less prone to race conditions. @tnqn, please note that since the goal is allocation of a continuous IP range, today we don't resize preallocations based on the replica number. Scaling up is unlikely to succeed (since the next IP is likely to be taken); scaling down is possible, though.
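The asymmetry between scaling up and scaling down can be illustrated with a small sketch. This is not behavior implemented in the PR (which does not resize preallocations); `canGrowInPlace` and `shrink` are hypothetical helpers over a toy slice in which each entry names the owner of one pool offset.

```go
package main

import "fmt"

// canGrowInPlace reports whether the block owned by key could be extended by
// extra offsets while staying contiguous, i.e. whether the offsets right
// after its last address are all free.
func canGrowInPlace(owners []string, key string, extra int) bool {
	last := -1
	for i, o := range owners {
		if o == key {
			last = i
		}
	}
	if last < 0 || extra < 0 {
		return false
	}
	for i := last + 1; i <= last+extra; i++ {
		if i >= len(owners) || owners[i] != "" {
			return false
		}
	}
	return true
}

// shrink releases up to n of the highest offsets held by key and returns how
// many were released. The remaining block stays contiguous.
func shrink(owners []string, key string, n int) int {
	released := 0
	for i := len(owners) - 1; i >= 0 && released < n; i-- {
		if owners[i] == key {
			owners[i] = ""
			released++
		}
	}
	return released
}

func main() {
	// "web" holds offsets 0-2 and "db" took the next address.
	owners := []string{"web", "web", "web", "db", "", ""}

	fmt.Println(canGrowInPlace(owners, "web", 1)) // false: offset 3 is taken
	fmt.Println(shrink(owners, "web", 1))         // 1: tail offset released
}
```

Scaling down only ever frees addresses at the tail of the block, so it cannot break continuity, whereas scaling up depends on what other workloads allocated in the meantime.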
e2989d0 to 996ec5e
/test-flexible-ipam-e2e
LGTM overall, some minor comments
Signed-off-by: Anna Khmelnitsky <[email protected]>
996ec5e to 00dfb7e
/test-all
LGTM
/test-flexible-ipam-e2e
To provide a better user experience, AntreaIPAM will try to
preallocate a continuous IP range for a StatefulSet. If unsuccessful,
IPs for the StatefulSet will be allocated on the fly, as before.
Signed-off-by: Anna Khmelnitsky [email protected]