Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question on setting pod status #5474

Closed
zenbones opened this issue Sep 24, 2023 · 15 comments
Closed

Question on setting pod status #5474

zenbones opened this issue Sep 24, 2023 · 15 comments
Labels

Comments

@zenbones
Copy link

zenbones commented Sep 24, 2023

Given my attempt to update pod status...

      for (Pod pod : client.pods().inNamespace(component.getNamespace()).withLabels(Labels.matchLabels(component)).list().getItems()) {
        pod.setStatus(ReadinessGateUtiltity.updatePodStatus(readinessMap));
        client.pods().inNamespace(component.getNamespace()).resource(pod).patch();
      }

I assume patching the change is necessary, but doing that yields...

Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://172.20.0.1:443/api/v1/namespaces/prod3/pods/prod3-epicenter-arch-pv5b4. Message: Operation cannot be fulfilled on pods "prod3-epicenter-arch-pv5b4": the object has been modified; please apply your changes to the latest version and try again.

Am I wrong and merely setting the status updates the pod, or should I be looking for another asynchronous operation altering the pod while I'm trying to update the status?

Any help appreciated

@shawkins
Copy link
Contributor

shawkins commented Sep 24, 2023

Generally you do not want to modify pod status, or any status of a resource, for which you are not acting as the controller. The actual controller will likely undo whatever you try to change.

The exception is telling you that the resource has already been modified by the time you are attempting the operation. Generally you will want to retry, or perform the operation unlocked (removing the resourceVersion, or in newer client versions using the unlock method).

Finally there are separate methods for patching / updating status - see patchStatus and updateStatus - because status is a separate subresource. Just calling patch or update will only affect the main resource fields under metadata, spec, etc.

@manusa manusa added question Waiting on feedback Issues that require feedback from User/Other community members labels Sep 25, 2023
@zenbones
Copy link
Author

zenbones commented Sep 28, 2023

I am the controller, but thank you for that advice. I'll look into patchStatus/updateStatus, which is maybe what I should be calling. I suspected that the exception was telling me the resource had been modified (that was clear from the error). I was surprised that this was the case as I am getting the pod and immediately updating, and I assume the get would return the latest version. I was wondering if setStatus() on the pod was itself causing the version change.

Would this be more correct?

      for (Pod pod : client.pods().inNamespace(component.getNamespace()).withLabels(Labels.matchLabels(component)).list().getItems()) {
        pod.setStatus(new PodStatusBuilder().withConditions(conditions).build());
        client.pods().inNamespace(component.getNamespace()).resource(pod).patchStatus();
      }

@shawkins
Copy link
Contributor

shawkins commented Sep 28, 2023

I am the controller, but thank you for that advice.

You are the controller for pods, not kubernetes? Or do you mean that you are creating the pod resource?

I assume the get would return the latest version

At that point in time yes, but if something else is acting upon the resource (for example kubernetes itself will act upon pods), then there's no guarentee it won't change by the time your next call goes through.

I was wondering if setStatus() on the pod was itself causing the version change.

It should not. In fact, if all you are doing is setting the status and calling patch, the api server will see that as a no-op because nothing about the spec, metadata, etc. is changing. But it still has to perform the version check first.

@zenbones
Copy link
Author

zenbones commented Sep 28, 2023

  1. This code is in an operator which is responsible for creating the pods, so I am creating the pod resource, yes.
  2. I understand the asynchronous nature of what I'm doing, but the error was very consistent and the time window small, so I came to assume it was me.
  3. But that assumption may be wrong given your last comment.

Would this be more correct in terms of updating the status?

      for (Pod pod : 
client.pods().inNamespace(component.getNamespace()).withLabels(Labels.matchLabels(component)).list().getItems()) {
        pod.setStatus(new PodStatusBuilder().withConditions(conditions).build());
        client.pods().inNamespace(component.getNamespace()).resource(pod).patchStatus();
      }

@shawkins
Copy link
Contributor

This code is in an operator which is responsible for creating the pods, so I am creating the pod resource, yes.

You are a controller then, but you are not a pod controller - that is the responsibility of kubernetes. It is not expected that you should set the status of a pod. Why are you trying to do this?

I understand the asynchronous nature of what I'm doing, but the error was very consistent and the time window small, so I came to assume it was me.

A freshly created pod will go through a series of status updates as it goes through initialization.

Would this be more correct in terms of updating the status?

Yes - but not for pods, kubernetes will simply overwrite whatever you put onto it. While you may be the owner, you are not the controller.

When you have a resource with complete metadata, you can also just do:

client.resource(resource).patchStatus();

@zenbones
Copy link
Author

I am attempting to use the pod readiness gate feature. I'm constructing the pod with the PodBuilder() using .withReadinessGates() and .withStatus(). That status is initially "False", at some point I need to update it to "True" and I'm really just trying to do that. I may have misinterpreted the Api and the setStatus() method on Pod. I apologize for not starting off with what I was trying to accomplish. What's the recommended procedure for that status update?

@shawkins
Copy link
Contributor

That status is initially "False", at some point I need to update it to "True" and I'm really just trying to do that.

My appologies - I was giving you general advice and you are very specifically trying to follow: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-status

That is an odd feature which seems to violate several design principles of status and is something I haven't seen anyone ask about wrt to the kubernetes client, so it wasn't something that jumped out from your initial question. The docs from https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/580-pod-readiness-gates/README.md imply the need to coordinate across other entities besides the pod / container themselves for readiness - is that the case you are dealing with?

Out of what I put into the previous comments, there are a couple of take aways. The first is that status is still a subresource - anything you set on it in a create or on regular update / patch will simply be thrown away. So you need to use patchStatus. Testing locally wiht an x readinessgate this works:

client.pods().withName("foo").editStatus(pod -> new PodBuilder(pod).editStatus()
        .addToConditions(new PodConditionBuilder().withStatus("True").withType("x").build()).endStatus().build());

while patch / edit does not (which is what the Kubernetes docs are saying as well).

And if you are seeing resourceVersion conflicts you either have to retry your operation or do the operation unlocked, for example without the unlock method:

client.pods().withName("foo").editStatus(pod -> new PodBuilder(pod).editMetadata().withResourceVersion(null).endMetadata().editStatus()
        .addToConditions(new PodConditionBuilder().withStatus("True").withType("x").build()).endStatus().build());

@zenbones
Copy link
Author

Thank you so much. I'll try editStatus() and be aware of the possibility I may need to retry or unlock.

To answer your question I have one set of pods that needs to wait for at least one of another set of pods to complete start up and respond to health check before completing its own start up. I have been using an init container that fails its own start up until this condition is met. I was thinking that readiness gates fit this use case well and would be a 'nicer' implementation of the same behavior. All of this, whether init container or readiness gate, is within our custom operator, written in Java (because I really dislike Go), and on top of your k8s client (and thank you for making this available).

@zenbones
Copy link
Author

I came up with this...

      client.pods().inNamespace(component.getNamespace()).withLabels(Labels.matchLabels(component)).resources().forEach(
        podResource -> podResource.editStatus(pod -> new PodBuilder(pod).editStatus().addToConditions(ReadinessGateUtiltity.updatePodStatus(readinessMap)).endStatus().build())
      );

...which is obtuse. I could get the Pods, but instead I get PodResources so I can operate on the Pods calling editStatus(), which takes the Pod, from which I build a Copy of the Pod with new PodBuilder(pod), then edit its status by adding to conditions, when I really want to directly replace an existing condition.

This seems a little more direct. Would it work?

      client.pods().inNamespace(component.getNamespace()).withLabels(Labels.matchLabels(component)).resources().forEach(
        podResource -> podResource.patchStatus().setStatus(new PodStatusBuilder().withConditions(...).build())
    );

@shawkins
Copy link
Contributor

shawkins commented Oct 1, 2023

...which is obtuse.

You mean that to edit the status you still have to work with the Pod? That's due to several factors - the primary is that there's no common base class for the various Status objects - PodStatus, DeploymentStatus, etc. are all independent classes. So rather than trying to come up with a marker interface or have a method expecting something an Object, the general support around status just works with the parent resource. Related to this there's no common setStatus method either (#3586 is also related) although that can be worked around using the generic representation.

A slightly more direct form of the lambda would be an enhancement to add a status form of the accept method:

p = client.pods().withName("foo").acceptStatus(pod -> pod.setStatus(new PodStatusBuilder(pod.getStatus())...));

Another possibility is to just generalize the subresource context. Under the covers we are simply tracking what subresource we're working with and putting opinionated methods on top of that. So resource.editStatus(...) is really like doing resource.subresource("status").edit(...). But we've been reluctant to expose direct manipulation of the subresource because there's a lot that's not possible that way, so the dsl would need several more interfaces to offer proper refinements. Then the other standard subresource, scale, has nuanced response types. We'd probably need to expose resource.status() and resource.scale() separately.

This seems a little more direct. Would it work?

You mean as written? No that won't work. Or are you asking for an enhancement that looks like the method PodResource.patchStatus(PodStatus status)? The dsl interface and implementation classes are not generated so anything that is type specific will need to be added by hand. Given this specific feature it makes sense to consider adding a method to just PodResource - but you could be even more opinionated: PodResource.setReadinessGatesStatus(Map<String, Boolean>).

One final thought here is around always adding the conditions - unless you are somehow guarenteeing this is happening only a single time, you'll want to test what happens if a status condition is repeated. While status conditions are supposed to function as a map (keyed by type), the schema is a list so that's the functionality that is offered by our generated object model / builders.

@zenbones
Copy link
Author

zenbones commented Oct 3, 2023

Yes, I was asking for some more direct approach vector in the API, and the suggestions you returned with are as valid as any I made.

Always adding, as you noted, could be bad. Possibly awful. Yes, the conditions are supposed to function like a map. Is it fabric8's schema that's a list, or the k8s rest api schema? I assume the latter so there's nothing you can do about it. I'm going to need to replace rather than add. I don't mind replacing the status as a whole if that's possible. My current rendition of the methodology you said world work (as opposed to any suggestion I've made), is...

      client.pods().inNamespace(component.getNamespace()).withLabels(Labels.matchLabels(component)).resources().forEach(
        podResource -> podResource.editStatus(pod -> new PodBuilder(pod).editStatus().addToConditions(...).endStatus().build())
      );

I'll ask for a quick check of the above that it's an accurate interpretation. If so, would...

podResource -> podResource.editStatus(pod -> new PodBuilder(pod).withStatus(...).build())

...or...

podResource -> podResource.editStatus(pod -> new PodBuilder(pod).editStatus().withConditions(...).endStatus().build())

...result in a wholesale replacement of conditions or status and conditions? I need to avoid simply adding to a list and guessing how that gets interpreted.

@shawkins
Copy link
Contributor

shawkins commented Oct 3, 2023

Is it fabric8's schema that's a list, or the k8s rest api schema?

It's part of the kubernetes schema: https://github.com/kubernetes/api/blob/master/core/v1/types.go#L4242

I'm not sure if by design they wanted an ordered map, or if they started with an array and decided it should act more like a map / object. In any case there are several places in the kubernetes api that act like this and are thereful painful to work with in the java object form.

I assume the latter so there's nothing you can do about it.

Notice the additional metadata about type acting as a key in the go code. At some point in the future we may be able to make more opinionated assumption about the handling based upon that.

result in a wholesale replacement of conditions or status and conditions? I need to avoid simply adding to a list and guessing how that gets interpreted.

It is possible that everything works well if you just set the conditions you are interested in - in a simple test the returned pod did have the rest of the expected status populated.

The verbose, but fully safe method for generating the patch:

Pod updateReadinessGateStatus(Pod pod, Map<String, Boolean> readiness) {
    Map<String, PodCondition> conditions = new LinkedHashMap<>();
    if (pod.getStatus() == null) {
      pod.setStatus(new PodStatus());
    }
    pod.getStatus().getConditions().forEach(pc -> conditions.put(pc.getType(), pc));
    for (Map.Entry<String, Boolean> entry : readiness.entrySet()) {
      if (entry.getValue() == null) { // effectively false, may not even need to account for this case
        conditions.remove(entry.getKey());
      } else {
        PodCondition condition = conditions.get(entry.getKey());
        String valueString = entry.getValue() ? "True" : "False";
        if (condition == null) {
          conditions.put(entry.getKey(), new PodConditionBuilder().withStatus(valueString).withType(entry.getKey()).build());
        } else {
          condition.setStatus(valueString);
        }
      }
    }
    pod.getStatus().setConditions(conditions.values().stream().collect(Collectors.toList()));
    return pod;
  }

Which you would use as podResource.editStatus(pod -> updateReadinessGateStatus(pod, readiness))

Another thought is that this would be fairly concise to express directly as a merge patch, but there again a patchStatus method exposing the ability to directly supply the patch string isn't exposed.

@shawkins
Copy link
Contributor

shawkins commented Oct 3, 2023

Checking kubectl they do now support --subresource (seem like that should be mentioned in the readiness gates docs), so there is more of a case for also supporting something like resource.subresource("status") or resource.status().

@zenbones
Copy link
Author

zenbones commented Oct 3, 2023

Thank you for your help. I think that covers everything. Maybe some of this work will find its way into documentation. I assume there's other people with an interest somewhere. I'll integrate your latest 'safe' methodology and get around to testing it. Laying it our like that definitely helps me see how it's put together, which I would not have managed on my own from the current docs. I'm going to close this ticket and I'll open another if I encounter something unexpected in my implementation. Thank you so much.

@zenbones
Copy link
Author

The suggested code did the trick and is now working. Thanks!

@manusa manusa removed the Waiting on feedback Issues that require feedback from User/Other community members label Oct 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants