[GSoC] Persistent Device Claims for KubeVirt #254
Comments
Hi Alice, nice!
This is a good reference to understand what is more or less expected. I'd add links to it, like k8s pvc and kubevirt user-guide. At the end of the description, we could make a reference to kubernetes/enhancements#3063 to highlight the ongoing discussion about DRA.
I think this is very reasonable indeed! On How to start, I'd recommend this kubecon talk as it provides a lot of insight on how it works, how to implement it, etc.
PDCs are part of DRA. If you have a look at DRA, you can see that you can define ClaimTemplates. The goal of the project is to have a POC where we can use this new API with one of the already supported device types, for example PCI devices for passthrough. As I mentioned in the description, you could use emulated NVMe devices. In this way, users who want to use a PCI device could also create a PDC based on the new template. Is this clearer?
Hi @alicefr, I'm a master's student at Georgia Tech and I'm interested in potentially joining KubeVirt on this particular project for GSoC 2024. I am building on the first potential solution for this issue:
I have been reading up on this bit of documentation that allows us to specify a CRD for our ResourceClaim template and store resource definitions and properties inside of a .yaml file for persistence. From my understanding, this CRD will allow us to cache the desired device configuration using relevant metadata. I also think that utilizing custom controllers will allow us to use the metadata within the CRD. Additionally, the potential use of ControllerRevision may also be helpful in speeding up the initialization process for devices, as this might allow us to embed and serialize/deserialize objects that contain their internal state. Please let me know if my understanding or logic is flawed somewhere. I would appreciate any feedback on this expanded potential solution. :)
The CRD can allow you to model the device parameters.
Please have a look at the DRA documentation. The goal is to implement a DRA driver. This, for sure, also uses Kubernetes controllers to watch the custom resources.
Once you have a draft of your proposal, you can share it with us and we can review your solution in more detail.
I see, that makes more sense. I watched the KubeCon talk that @victortoso linked a few comments above, and I now have a basic understanding of what DRA is and how to start creating a DRA driver. The talk mentions the use of CDIs; however, after looking through the KubeVirt documentation on Host Devices, I noticed that KubeVirt uses VFIO for Mediated Devices. Are CDIs something I should be looking into for this particular project, or should I strictly stick to understanding how the VFIO interface prepares a specific device for device assignment?
For drafting my proposal, should I directly add your email to the Google Doc itself?
CDI is another kind of interface, but you should rely on DRA. The goal of the project is to model a device, and this should serve as an example for more complex ones.
As you prefer. We need to be able to comment in the doc.
Hi Alice!
I've gone ahead and added you to my Google Doc for my draft proposal; it should be under my email: [email protected] Please let me know what you think. 😄
Reminder: don't forget to submit a proposal through GSoC by 2nd April - 18:00 UTC.
Dear mods, is this still open to work on, or has someone already been assigned to it?
The project deadline has already passed; unfortunately, you can no longer participate.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
By any chance, is this available to work on?
No, this project is part of Google Summer of Code; the deadline is already over and it is already assigned.
Title: Persistent Device Claims for KubeVirt
Description
KubeVirt [1] is a Kubernetes extension to deploy Virtual Machines like pods and integrate with the Kubernetes ecosystem.
For handling host devices, KubeVirt depends on the Kubernetes device plugin framework [2]. It is used for scheduling, allocating, and attaching a desired device and resources to a running pod.
One of the limitations of this framework is that the device allocation does not persist when the pod isn't running. This becomes especially problematic for devices that require a significant initialization time, such as FPGAs, or for storage devices, such as NVMe or USB devices, where users may have saved data. Devices assigned to the same resource name might be randomly allocated, with no way to identify a specific device within the set.
For KubeVirt, the device is released when the VM shuts down or restarts, because the VM pod is deleted or recreated. Hence, when the VM is restarted, it might be assigned a different device than the previous one.
The Dynamic Resource Allocation (DRA) API [3] provides a solution by introducing resource claims. The claims are independent of the pod lifetime, and they persist until the user deletes them. In this way, we are able to recognize the device that was previously assigned to a VM and preserve its state across restarts.
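To make the lifetime difference concrete, here is a minimal Go sketch of the idea, not of the DRA API itself. `ClaimAllocator`, `Allocate`, and `DeleteClaim` are hypothetical names invented for this illustration; the only behavior taken from the description above is that a device stays bound to a claim until the claim is deleted, regardless of how often the VM pod is recreated.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoDevice is returned when every device is already bound to a claim.
var ErrNoDevice = errors.New("no free device")

// ClaimAllocator is a hypothetical in-memory model, NOT the DRA API.
// It binds a device to a claim name and keeps that binding until the
// claim is deleted, independently of any pod that uses the claim.
type ClaimAllocator struct {
	free    []string          // device IDs not bound to any claim
	byClaim map[string]string // claim name -> bound device ID
}

// NewClaimAllocator creates an allocator that owns the given device IDs.
func NewClaimAllocator(devices ...string) *ClaimAllocator {
	return &ClaimAllocator{
		free:    append([]string(nil), devices...),
		byClaim: make(map[string]string),
	}
}

// Allocate returns the device bound to the claim. On first use it binds
// the next free device; on later calls it returns the same device.
func (a *ClaimAllocator) Allocate(claim string) (string, error) {
	if dev, ok := a.byClaim[claim]; ok {
		return dev, nil
	}
	if len(a.free) == 0 {
		return "", ErrNoDevice
	}
	dev := a.free[0]
	a.free = a.free[1:]
	a.byClaim[claim] = dev
	return dev, nil
}

// DeleteClaim removes the binding and returns the device to the free pool.
func (a *ClaimAllocator) DeleteClaim(claim string) {
	if dev, ok := a.byClaim[claim]; ok {
		delete(a.byClaim, claim)
		a.free = append(a.free, dev)
	}
}

func main() {
	alloc := NewClaimAllocator("nvme0", "nvme1")

	first, _ := alloc.Allocate("vm-a-claim")
	// The VM pod is deleted and recreated here; the claim is untouched.
	other, _ := alloc.Allocate("vm-b-claim")
	second, _ := alloc.Allocate("vm-a-claim")

	fmt.Println(first, other, second)
}
```

In this model the restarted VM gets its original device back because the lookup key is the claim, not the pod. A real DRA driver would keep this state in Kubernetes objects rather than in process memory.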
Expected Outcome
The project goal is to design, develop and integrate Resource Claims in KubeVirt for host device allocation. As is the case with PVCs today, the user should be able to declare a resource claim and/or template and assign it to a KubeVirt virtual machine.
To support this new Kubernetes API, the project must research and suggest ways to expand KubeVirt.
A successful project will implement a POC for an example device that is already available in the KubeVirt infrastructure. The outcome of this project will provide a base for future integration of DRA, once the API reaches maturity.
Project requirements
Project size: 350 hours
Difficulty: Hard
Required skills: Kubernetes knowledge and Go programming skills
Desirable skills: Virtualization
Mentors: Alice Frosi [email protected], Victor Toso de Carvalho [email protected], Luboslav Pivarc [email protected]
How and where to search for help
First, check the KubeVirt documentation [4]; it covers many topics and you might already find some of the answers there. If something is unclear, feel free to open an issue and a PR. This is already a great way to get in touch with the process.
For questions related to KubeVirt and not strictly to the GSoC program, try to use the Slack channel [5] and the issues [6] as much as possible. Your question can be useful for other people, and the mentors might have a limited amount of time. It is also important to interact with the community as much as possible.
If something doesn't work, try to document the steps and how to reproduce the issue as clearly as possible. The more information you provide, the easier it is for us to help you. If you open an issue in KubeVirt, the issue template already guides you through the kind of information we generally need.
How to start
How to submit the proposal
The preferred way is to create a Google Doc and share it with the mentors (Slack or email both work). If, for any reason, Google Docs doesn't work for you, please share your proposal by email. Early submissions have higher chances, as they will be reviewed over multiple iterations and can be further improved.
What the proposal should contain
The proposal should concisely explain the design and your strategy for solving the challenge. Good starting points are the components you anticipate touching and an example of an API. The updates or APIs are not final; they are merely a draft of what the candidate hopes to expand and change. The details and possible issues can be discussed during the project with the mentors, who can help refine the proposal.
It is not necessary to provide an introduction to Kubernetes or KubeVirt; instead, candidates should demonstrate their familiarity with KubeVirt by describing in detail how they intend to approach the task.
Mentors may find it helpful to have a schematic drawing of the flows and examples to better grasp the solution. They will select a couple of good proposals at the end of the selection period and this will be followed by an interview with the candidate.
The proposal can be free-form, or you can take inspiration from the KubeVirt design proposals [14] and template [15]. However, it should contain a draft schedule of the project phases, with some extra time planned to overcome possible difficulties.
Links
[1] https://github.com/kubevirt/kubevirt
[2] https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/
[3] https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/
[4] https://github.com/kubevirt/kubevirt/tree/main/docs
[5] https://kubernetes.slack.com/archives/C0163DT0R8X
[6] https://github.com/kubevirt/kubevirt/issues
[7] https://github.com/kubevirt/kubevirt/blob/main/docs/getting-started.md
[8] https://github.com/kubevirt/kubevirt/issues?q=is%3Aopen+is%3Aissue+label%3Agood-first-issue
[9] https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#example-pod
[10] https://gist.github.com/alicefr/592591b18a99cf126dd82110d8fa74ea
[11] https://kubevirt.io/user-guide/virtual_machines/host-devices/
[12] https://github.com/kubevirt/kubevirtci
[13] https://kubevirt.io/user-guide/virtual_machines/host-devices/#nvme-pci-passthrough
[14] https://github.com/kubevirt/community/tree/main/design-proposals
[15] https://github.com/kubevirt/community/blob/main/design-proposals/proposal-template.md
Other good resources to check: