Consolidate Kubernetes Watch and Stream calls at the Pod-level over Container-level. #519
Comments
go-vela/worker#303 is not panning out for fixing log streaming, but several other issues have been (or will soon be) fixed. And, the log stream has to be per-container, so managing the log streams at the pod level has not been helpful so far. For now, I'm going to step away from the log streaming part of this and see how the rest of the fixes affect things over time. If I notice any more logging issues, I will revisit this. Here are some of the logging changes that will hopefully improve things:
However, go-vela/worker#302 has been working out very well. I'm very happy with watching the pod at the pod-level (instead of per container) using the k8s-controller primitives. So far, go-vela/worker#302 allows us to watch for Pod changes (add/update/delete) for the pod itself. To get information about pull errors, however, we need to start watching the stream of events. These events do not affect the fields of the pod, so we need a separate stream. In go-vela/worker#279 I added plain watches for the Kubernetes events. I will create a new PR that re-implements (reuses?) that, possibly via the SharedInformer on the PodTracker.
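For anyone following along, here is a minimal sketch (not the actual PodTracker code) of what pod-level watching through client-go SharedInformers could look like: one informer for the pod's own add/update/delete and one for the Event stream that carries things like pull errors. All function names and handler bodies are illustrative assumptions, not the worker's implementation.

```go
// Sketch only: pod-level watching with client-go SharedInformers.
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// trackPod is a hypothetical helper, not the worker's PodTracker.
func trackPod(clientset kubernetes.Interface, namespace, podName string, stopCh <-chan struct{}) {
	factory := informers.NewSharedInformerFactoryWithOptions(
		clientset, 30*time.Second, informers.WithNamespace(namespace),
	)

	// pod changes (add/update/delete) for the pipeline pod itself
	factory.Core().V1().Pods().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(_, newObj interface{}) {
			pod := newObj.(*corev1.Pod)
			if pod.Name == podName {
				fmt.Printf("pod %s is now %s\n", pod.Name, pod.Status.Phase)
			}
		},
	})

	// Events do not change the pod's fields, so they need their own informer
	factory.Core().V1().Events().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			event := obj.(*corev1.Event)
			if event.InvolvedObject.Name == podName {
				fmt.Printf("event: %s %s\n", event.Reason, event.Message)
			}
		},
	})

	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
}
```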
go-vela/worker#390 massively improves log streaming reliability by allowing the streaming to take longer than the execution, using a configurable timer. I have one more WIP fix that will make sure the pod doesn't get deleted until streaming is "done" (finished or timed out).
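A rough sketch of the idea of letting streaming outlive execution, assuming a hypothetical helper rather than the actual go-vela/worker#390 code:

```go
// Sketch only: give the log stream its own configurable deadline so it can
// keep draining logs after the build's context is already done.
package main

import (
	"context"
	"time"
)

// streamWithGrace runs stream() under a deadline that is independent of the
// build context, so cancelling the build does not immediately cut the stream.
func streamWithGrace(streamTimeout time.Duration, stream func(context.Context) error) error {
	streamCtx, cancel := context.WithTimeout(context.Background(), streamTimeout)
	defer cancel()

	done := make(chan error, 1)
	go func() { done <- stream(streamCtx) }()

	select {
	case err := <-done:
		return err // streaming finished (or failed) on its own
	case <-streamCtx.Done():
		return streamCtx.Err() // the configurable timer expired first
	}
}
```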
@cognifloyd Just checking in on some older issues. What do you think about this? Still an issue? Something we should prioritize for the k8s runtime?
Description
Consolidate Kubernetes Watch and Stream calls at the Pod-level instead of the Container-level.

There is prior work we can use to inform the implementation as well:
- stern

If we do something similar, then we could:
- In runtime.AssembleBuild(), we start streaming pod changes + events (similar to how WaitContainer currently watches for pod changes, but only one stream of changes instead of one per step).
- runtime.AssembleBuild() also starts a goroutine that consumes the stream of events to do a number of things:
  - … (which runtime.TailContainer can access once Vela catches up recording state in the database)
  - … for the runtime.*Container methods to access.
- RunContainer can wait until image pull is successful before returning, or return any image pull errors identified in the container's event stream.
- WaitContainer can wait for the events to say that the container has completed.
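A rough sketch of the shape described in the list above, with all names illustrative rather than the worker's real API: one pod-level tracker fans pod updates out to per-container signals that RunContainer/WaitContainer-style methods can wait on.

```go
// Sketch only: a single pod-level tracker with per-container signal channels.
package main

import (
	"context"
	"fmt"
	"sync"

	corev1 "k8s.io/api/core/v1"
)

type podTracker struct {
	mu       sync.Mutex
	running  map[string]chan struct{} // closed once a container is running
	finished map[string]chan struct{} // closed once a container has terminated
}

// handlePodUpdate is driven by the single pod-level watch (e.g. an informer handler).
func (t *podTracker) handlePodUpdate(pod *corev1.Pod) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for _, cs := range pod.Status.ContainerStatuses {
		if cs.State.Running != nil || cs.State.Terminated != nil {
			closeOnce(t.running[cs.Name])
		}
		if cs.State.Terminated != nil {
			closeOnce(t.finished[cs.Name])
		}
	}
}

// waitContainer blocks until the tracker has seen the container terminate,
// instead of each WaitContainer call opening its own per-container watch.
func (t *podTracker) waitContainer(ctx context.Context, name string) error {
	select {
	case <-t.finished[name]:
		return nil
	case <-ctx.Done():
		return fmt.Errorf("waiting for container %s: %w", name, ctx.Err())
	}
}

// closeOnce closes a signal channel exactly once (callers hold t.mu).
func closeOnce(ch chan struct{}) {
	if ch == nil {
		return
	}
	select {
	case <-ch:
		// already closed
	default:
		close(ch)
	}
}
```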
Value

Improve the reliability of the Kubernetes runtime, especially:
Hopefully, starting the container log stream as quickly as possible will mean that a log stream starts for each container before we modify it to add the step image (i.e. start streaming while the step still has the kubernetes/pause:latest image instead of the step image). The TailContainer can start reading from the stream for a given step once it is ready to do so.
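As a sketch of what opening the stream early could look like with client-go (the function name is illustrative, and this assumes the follow stream survives until TailContainer is ready to read from it):

```go
// Sketch only: open a follow-mode log stream for a step container as soon as
// the pod is running, so the stream is already attached when the real step
// image starts. TailContainer would later read from the returned ReadCloser.
package main

import (
	"context"
	"io"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
)

func openContainerLogStream(ctx context.Context, clientset kubernetes.Interface, namespace, pod, container string) (io.ReadCloser, error) {
	opts := &corev1.PodLogOptions{
		Container: container,
		Follow:    true, // keep the stream open instead of returning a snapshot
	}
	// GetLogs returns a *rest.Request; Stream opens the HTTP response body.
	return clientset.CoreV1().Pods(namespace).GetLogs(pod, opts).Stream(ctx)
}
```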
Another benefit is reducing the Vela-generated load on Kubernetes API servers. The Vela Worker is effectively a "Kubernetes Controller" (e.g. the software that watches for ReplicaSet changes to create Pods) because it watches for events (albeit not sourced from Kubernetes resource changes) and converts those events into Pods, managing the lifecycle of those Pods. The libraries involved in building such a controller go to great lengths to avoid polling the Kubernetes API servers; they watch and then locally cache Kubernetes resources, allowing Kubernetes to push changes through the watch. Similar to a Kubernetes Controller, the Vela Worker's Kubernetes Runtime should minimize the API calls required to manage its Pipeline Pods (and any other resources it needs to retrieve in the course of creating the Pod).
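For illustration, a small client-go sketch of serving pod reads from an informer's watch-backed cache (a Lister) instead of issuing a GET against the API server per lookup; names are illustrative:

```go
// Sketch only: a cached pod lookup, the way a typical controller avoids polling.
package main

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

func cachedPodLookup(clientset kubernetes.Interface, namespace, podName string, stopCh <-chan struct{}) (*corev1.Pod, error) {
	factory := informers.NewSharedInformerFactoryWithOptions(
		clientset, 30*time.Second, informers.WithNamespace(namespace),
	)
	podLister := factory.Core().V1().Pods().Lister()

	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)

	// no API round trip here: the object comes from the watch-backed local cache
	return podLister.Pods(namespace).Get(podName)
}
```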
Definition of Done
Capture the build logs independently (e.g. with stern) and compare the captured logs with what Vela shows in its UI or CLI. There should be no difference in the log contents.

Effort (Optional)
Good question. I don't know, but I need it to work, so I'll be working on this.
Previous attempts at fixing these issues include:
Impacted Personas (Optional)
Anyone who uses the Kubernetes Runtime.
Depending on the implementation, some API changes may be needed between the Executor and Runtime interfaces (which are much easier to coordinate now that the Executor and Runtime packages are both in the Worker repo).