readiness reflector #748
freehan
commented
May 3, 2019
- initial commit of readiness reflector
  - the readiness reflector checks whether a pod has the NEG readiness gate.
  - if so, it polls the NEG health status and patches the pod's NEG readiness condition once the endpoint becomes ready.
- Refactoring:
  - modify common NEG types to accommodate the readiness gate
  - move NegSyncerKey to pkg/neg/types
- Adapt NEG controller for readiness reflector:
  - include non-terminating pods when adding endpoints to the NEG
    - This makes sure that a pod that is not ready only because of the NEG readiness gate is still added to the NEG
  - adapt NEG controller and Syncer Manager to take the readiness gate into account.
  - adapt transaction NEG syncer to feed back into the readiness reflector
- include non-terminating pods to add in NEG
7e6290f to 48c8853
pkg/neg/readiness/poller.go
Outdated
for _, hs := range r.Healths {
	if hs.BackendService != nil {
		// This assumes the ingress backend service uses the NEG naming scheme
		if strings.Contains(hs.BackendService.BackendService, key.Name) {
Currently I only check the health status reported by the backend service managed by Ingress.
The other option is to loop through all of them and treat the endpoint as ready if any one is Healthy. Or I can require all of them to be Healthy, but that makes event reporting more complex: it has to report which backend service the NEG endpoint is not ready in.
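To make the trade-off between these options concrete, here is a minimal Go sketch. It is not the code in this PR: `healthStatus` is a hypothetical, simplified stand-in for the per-backend-service health entries, the backend service names are made up, and the `"HEALTHY"` state string is assumed to match what the health API reports.

```go
package main

import (
	"fmt"
	"strings"
)

// healthStatus is a hypothetical, simplified stand-in for one health entry
// of a NEG endpoint (one entry per backend service that uses the NEG).
type healthStatus struct {
	BackendService string // backend service name or URL
	HealthState    string // assumed values: "HEALTHY", "UNHEALTHY", ...
}

const healthy = "HEALTHY"

// healthyInNamedBackend is the approach taken here: only entries whose
// backend service name contains the NEG name are considered.
func healthyInNamedBackend(healths []healthStatus, negName string) bool {
	for _, hs := range healths {
		if strings.Contains(hs.BackendService, negName) && hs.HealthState == healthy {
			return true
		}
	}
	return false
}

// healthyInAny is the first alternative: one healthy entry is enough.
func healthyInAny(healths []healthStatus) bool {
	for _, hs := range healths {
		if hs.HealthState == healthy {
			return true
		}
	}
	return false
}

// unhealthyBackends supports the second alternative: the endpoint is ready
// only if the result is empty, and the returned names are what an event
// would have to report.
func unhealthyBackends(healths []healthStatus) []string {
	var notReady []string
	for _, hs := range healths {
		if hs.HealthState != healthy {
			notReady = append(notReady, hs.BackendService)
		}
	}
	return notReady
}

func main() {
	healths := []healthStatus{
		{BackendService: "k8s1-example-neg", HealthState: "HEALTHY"},
		{BackendService: "other-backend", HealthState: "UNHEALTHY"},
	}
	fmt.Println(healthyInNamedBackend(healths, "k8s1-example-neg"))
	fmt.Println(healthyInAny(healths))
	fmt.Println(unhealthyBackends(healths))
}
```

The "all healthy" variant is the only one that needs to carry per-backend-service detail back to the caller, which is the extra event-reporting complexity mentioned above.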
f78b057 to 2227c9a
pkg/neg/readiness/interface.go
Outdated
	negtypes "k8s.io/ingress-gce/pkg/neg/types"
)

// Reflector defines
finish this comment
pkg/neg/readiness/interface.go
Outdated
// Reflector defines
type Reflector interface {
	// Run starts up the readiness reflector
Run starts the reflector.
Closing stopCh will signal the reflector to stop running.
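For reference, the contract in that suggested comment can be sketched as follows. This is only an illustration of the stop-channel pattern, not the reflector in this PR: `pollingReflector`, its fields, and the `demo` helper are hypothetical names, and the polling work is replaced by a counter.

```go
package main

import (
	"fmt"
	"time"
)

// pollingReflector is a hypothetical reflector that polls on a fixed interval.
type pollingReflector struct {
	interval time.Duration
	polls    int // completed polls; only read after Run has returned
}

// Run starts the reflector.
// Closing stopCh will signal the reflector to stop running.
func (r *pollingReflector) Run(stopCh <-chan struct{}) {
	ticker := time.NewTicker(r.interval)
	defer ticker.Stop()
	for {
		select {
		case <-stopCh:
			return
		case <-ticker.C:
			r.polls++ // placeholder for polling NEG health status
		}
	}
}

// demo runs the reflector briefly, closes stopCh, and reports whether Run
// returned within one second.
func demo() bool {
	r := &pollingReflector{interval: 5 * time.Millisecond}
	stopCh := make(chan struct{})
	done := make(chan struct{})
	go func() {
		r.Run(stopCh)
		close(done)
	}()
	time.Sleep(30 * time.Millisecond)
	close(stopCh)
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("stopped after close:", demo())
}
```

Because a closed channel is always ready to receive, Run returns on a later pass through the select even when a tick is pending at the same time.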
pkg/neg/readiness/interface.go
Outdated
type Reflector interface {
	// Run starts up the readiness reflector
	Run(stopCh <-chan struct{})
	// SyncPod registers the pod with reflector
Why does this take an obj interface{}? Also, document what type obj is supposed to be if you are pushing the type checking into the routine.
Also, SyncPod does not return an error, so what happens if an object of an invalid type is passed?
The main reason I use obj interface{} is that the informer can pass its objects into the interface directly.
I can also take *v1.Pod instead.
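The two options discussed here can be compared in a small sketch. `Pod` is a hypothetical stand-in for `*v1.Pod`, and `SyncPodObj` and the `queue` field are made-up names; the point is only where the type check lives and what happens on a bad input.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for *v1.Pod.
type Pod struct {
	Namespace string
	Name      string
}

// reflector is a hypothetical reflector that records the pods it was given.
type reflector struct {
	queue []string
}

// SyncPodObj takes whatever an informer event handler receives.
// obj must be a non-nil *Pod; anything else is logged and dropped, because
// there is no error return to report it through.
func (r *reflector) SyncPodObj(obj interface{}) {
	pod, ok := obj.(*Pod)
	if !ok || pod == nil {
		fmt.Printf("SyncPodObj: expected non-nil *Pod, got %T; ignoring\n", obj)
		return
	}
	r.SyncPod(pod)
}

// SyncPod is the typed variant: an invalid type is a compile error at the
// call site, so no runtime check is needed here.
func (r *reflector) SyncPod(pod *Pod) {
	r.queue = append(r.queue, pod.Namespace+"/"+pod.Name)
}

func main() {
	r := &reflector{}
	r.SyncPodObj("not a pod")
	r.SyncPodObj(&Pod{Namespace: "default", Name: "web-0"})
	fmt.Println(r.queue)
}
```

With the typed signature, the type assertion moves into the informer event handler, which is the one place that actually receives an untyped object.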
pkg/neg/readiness/interface.go
Outdated
	// SyncPod registers the pod with reflector
	SyncPod(obj interface{})
	// CommitPods signals the reflector that pods has been added to one NEG
	CommitPods(syncerKey negtypes.NegSyncerKey, negName string, zone string, endpointMap map[negtypes.NetworkEndpoint]types.NamespacedName)
document the parameters.
negName is the name of the network endpoint group in the zone (e.g. xxx)
..
endpointMap seems like it should be typedef'd
type endpointMap map[...]...
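A sketch of what the typedef and the documented parameters could look like. Everything here is illustrative: `NetworkEndpoint`, `NamespacedName`, and `NegSyncerKey` are local stand-ins for the real types (their fields are assumptions), `EndpointPodMap` is a hypothetical name for the typedef, and the parameter descriptions are my reading of the signature rather than the author's wording.

```go
package main

import "fmt"

// NetworkEndpoint is a local stand-in for negtypes.NetworkEndpoint (fields assumed).
type NetworkEndpoint struct {
	IP   string
	Port string
	Node string
}

// NamespacedName is a local stand-in for types.NamespacedName.
type NamespacedName struct {
	Namespace string
	Name      string
}

// NegSyncerKey is a local stand-in for negtypes.NegSyncerKey (fields assumed).
type NegSyncerKey struct {
	Namespace string
	Name      string
}

// EndpointPodMap maps a network endpoint to the pod that backs it.
type EndpointPodMap map[NetworkEndpoint]NamespacedName

// Reflector is a trimmed-down version of the interface under review.
type Reflector interface {
	// CommitPods signals the reflector that pods have been added to one NEG.
	// syncerKey identifies the syncer that committed the endpoints.
	// negName is the name of the network endpoint group in the zone.
	// zone is the zone the NEG belongs to.
	// endpointMap contains the committed endpoints and their pods.
	CommitPods(syncerKey NegSyncerKey, negName string, zone string, endpointMap EndpointPodMap)
}

// recorder is a hypothetical Reflector that counts committed endpoints per NEG.
type recorder struct {
	committed map[string]int
}

func (r *recorder) CommitPods(syncerKey NegSyncerKey, negName string, zone string, endpointMap EndpointPodMap) {
	r.committed[zone+"/"+negName] += len(endpointMap)
}

func main() {
	var refl Reflector = &recorder{committed: map[string]int{}}
	refl.CommitPods(
		NegSyncerKey{Namespace: "default", Name: "svc"},
		"example-neg", "example-zone",
		EndpointPodMap{
			{IP: "10.0.0.1", Port: "80", Node: "node-a"}: {Namespace: "default", Name: "web-0"},
		},
	)
	fmt.Println(refl.(*recorder).committed)
}
```

Naming the map type once keeps the interface signature short and gives the key and value meanings a single place to be documented.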
pkg/neg/readiness/interface.go
Outdated
	CommitPods(syncerKey negtypes.NegSyncerKey, negName string, zone string, endpointMap map[negtypes.NetworkEndpoint]types.NamespacedName)
}

// NegLookup defines an interface for looking up pod membership
NegGetter?
pkg/neg/readiness/utils.go
Outdated
	return map[negtypes.NetworkEndpoint]types.NamespacedName{}
}

// filterEndpoint will filter out the endpoints that does not need health polling from the input endpoint map
You should call this removeIrrelevantEndpoints, as "filter" usually means no side effects: f(set) => newSet.
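The naming distinction can be illustrated with two small functions. This is a sketch with simplified, hypothetical types (`endpointMap` here is a plain string map and the `keep` predicate is made up), not the helper in utils.go.

```go
package main

import "fmt"

// endpointMap is a simplified, hypothetical endpoint -> pod name map.
type endpointMap map[string]string

// filterEndpoints has no side effects: it returns a new map and leaves the
// input untouched, which is what "filter" usually implies.
func filterEndpoints(in endpointMap, keep func(pod string) bool) endpointMap {
	out := make(endpointMap, len(in))
	for ep, pod := range in {
		if keep(pod) {
			out[ep] = pod
		}
	}
	return out
}

// removeIrrelevantEndpoints mutates its argument: entries that fail keep are
// deleted from m in place.
func removeIrrelevantEndpoints(m endpointMap, keep func(pod string) bool) {
	for ep, pod := range m {
		if !keep(pod) {
			delete(m, ep) // deleting during range is safe in Go
		}
	}
}

func main() {
	keep := func(pod string) bool { return pod != "terminating-pod" }

	m := endpointMap{"10.0.0.1:80": "web-0", "10.0.0.2:80": "terminating-pod"}
	filtered := filterEndpoints(m, keep)
	fmt.Println(len(m), len(filtered)) // input keeps both entries

	removeIrrelevantEndpoints(m, keep)
	fmt.Println(len(m)) // input itself has shrunk
}
```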
for endpoint, namespacedName := range endpointMap {
	pod, exists, err := getPodFromStore(podLister, namespacedName.Namespace, namespacedName.Name)
	if err != nil {
		klog.Warningf("Failed to retrieve pod %q from store: %v", namespacedName.String(), err)
is this a warning or something that will happen from time to time?
This only happens if the cache returns an error. I do not think this is a regular occurrence.
pkg/neg/readiness/utils.go
Outdated
// If pod has neg readiness gate and its condition is False, then return true.
func needToProcess(pod *v1.Pod) bool {
	negConditionReady, readinessGateExists := evalNegReadinessGate(pod)
	if !readinessGateExists {
return readinessGateExists && negConditionReady
push logging up one layer (don't log in functional predicate if possible, easy way to get tons of log spam)
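A sketch of the "pure predicate, log in the caller" shape. The `pod` type and `podsToPoll` are hypothetical, and the gate name is an assumption about the condition type the controller uses. The combined expression follows the doc comment in the hunk (gate present and condition not yet True), so it uses the negated condition; a missing condition is treated as not ready, which is also an assumption.

```go
package main

import "fmt"

// negReadinessGate is assumed to be the condition type of the NEG readiness gate.
const negReadinessGate = "cloud.google.com/load-balancer-neg-ready"

// podCondition and pod are simplified stand-ins for the v1.Pod fields involved.
type podCondition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

type pod struct {
	Name           string
	ReadinessGates []string
	Conditions     []podCondition
}

// evalNegReadinessGate reports whether the NEG condition is True and whether
// the pod declares the NEG readiness gate at all.
func evalNegReadinessGate(p *pod) (negReady bool, gateExists bool) {
	for _, gate := range p.ReadinessGates {
		if gate == negReadinessGate {
			gateExists = true
			break
		}
	}
	for _, c := range p.Conditions {
		if c.Type == negReadinessGate && c.Status == "True" {
			negReady = true
			break
		}
	}
	return negReady, gateExists
}

// needToProcess is a pure predicate with no logging: the pod has the NEG
// readiness gate and its condition is not yet True.
func needToProcess(p *pod) bool {
	negReady, gateExists := evalNegReadinessGate(p)
	return gateExists && !negReady
}

// podsToPoll is the layer above the predicate; any logging lives here, once
// per decision, instead of inside the predicate.
func podsToPoll(pods []*pod) []string {
	var names []string
	for _, p := range pods {
		if !needToProcess(p) {
			fmt.Printf("skipping pod %q: no NEG readiness gate or already ready\n", p.Name)
			continue
		}
		names = append(names, p.Name)
	}
	return names
}

func main() {
	pods := []*pod{
		{Name: "no-gate"},
		{Name: "pending", ReadinessGates: []string{negReadinessGate}},
		{
			Name:           "ready",
			ReadinessGates: []string{negReadinessGate},
			Conditions:     []podCondition{{Type: negReadinessGate, Status: "True"}},
		},
	}
	fmt.Println(podsToPoll(pods))
}
```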
@@ -286,31 +298,43 @@ func (s *transactionSyncer) commitTransaction(err error, networkEndpointMap map[

	for networkEndpoint := range networkEndpointMap {
		entry, ok := s.transactions.Get(networkEndpoint)
		// clear transaction
TODO: review this closer
I can go over this with you.
@@ -265,3 +277,31 @@ func makeEndpointBatch(endpoints negtypes.NetworkEndpointSet) (map[negtypes.Netw
	}
	return endpointBatch, nil
}

func keyFunc(namespace, name string) string {
duplicated code
d9b6c92 to d677bba
d677bba to 5de93fd
I have adjusted the commits. PTAL
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bowei, freehan