-
Hey @andife, great to see you back! Hope you're doing well.
Fractional GPU support in Flyte is not determined by the GPU model alone; it depends on how your Kubernetes cluster and the NVIDIA device plugin for Kubernetes are configured. The Flyte codebase references a GPU resource name. Fractional GPU usage in Kubernetes, and by extension in Flyte when it is configured to run tasks on Kubernetes, typically relies on the support provided by the NVIDIA device plugin. The plugin lets Kubernetes schedule GPU resources, but whether you can request a portion of a GPU's capabilities depends on the plugin's features and the underlying hardware.
If a container exceeds the specified GPU memory while a process is running, the behavior depends on how the NVIDIA device plugin and the cluster are configured to handle resource limits and requests. Kubernetes does not enforce GPU memory limits the way it does for CPU and RAM, and it does not monitor GPU memory usage directly. If a process tries to use more GPU memory than is available, the CUDA runtime or other GPU libraries can return allocation errors, which may cause the process to fail. Any enforcement would have to happen at the application level or through GPU monitoring tools.
As for specific models such as the GTX6000 or TRX2080TI, support again depends on the NVIDIA device plugin for Kubernetes and the features of those GPUs. The Flyte codebase does not provide explicit details on supported GPU models.
I hope this helps clarify your questions. If you have further queries, feel free to ask.
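To make the point about application-level enforcement a bit more concrete, here is a minimal sketch of a per-process GPU memory check. It assumes you collect the output of `nvidia-smi --query-compute-apps=pid,used_memory --format=csv,noheader,nounits` yourself; the function names and the 8192 MiB budget are hypothetical, chosen only for illustration, and nothing here is part of Flyte or Kubernetes.

```python
import csv
import io

# This sketch parses the text produced by:
#   nvidia-smi --query-compute-apps=pid,used_memory --format=csv,noheader,nounits
# Each output line is expected to look like "12345, 2048" (PID, MiB used).


def parse_compute_apps(output: str) -> dict[int, int]:
    """Return {pid: MiB used}, summing a PID's usage across all GPUs."""
    usage: dict[int, int] = {}
    for row in csv.reader(io.StringIO(output), skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        pid_text, mib_text = row[0].strip(), row[1].strip()
        if not (pid_text.isdigit() and mib_text.isdigit()):
            continue  # skip rows where usage is a non-numeric placeholder
        pid = int(pid_text)
        usage[pid] = usage.get(pid, 0) + int(mib_text)
    return usage


def over_budget(usage: dict[int, int], budget_mib: int) -> list[int]:
    """Return the PIDs whose GPU memory use exceeds the given budget."""
    return sorted(pid for pid, mib in usage.items() if mib > budget_mib)


# Hand-written sample standing in for real nvidia-smi output.
sample = "1111, 2048\n2222, 9000\n2222, 500\n3333, [N/A]\n"
usage = parse_compute_apps(sample)
print(over_budget(usage, budget_mib=8192))
```

What to do with an offending PID (log it, alert, or terminate it) is a policy decision for your own tooling. A check like this only observes usage after the fact; it does not give the memory isolation that hardware partitioning provides.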
-
@andife this is a great question and probably needs clarification. Flyte only supports what the NVIDIA device plugin supports, which is MIG-based GPU partitioning. This is memory safe, but it is only available for certain GPU devices, IIRC starting from the A100 family. Are you asking whether you need to use accelerators for the GTX devices, or do you want to fractionalize them? I do not think the GTX family supports MIG. On the other hand, if you want to use the accelerator tag to target the GTX devices, this absolutely can be done, but it will need adding these new constants and providing the relevant configuration to propeller.
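As a rough illustration of the distinction above (targeting a device versus partitioning it), here is a minimal sketch of how a device choice could translate into pod scheduling constraints. This is not Flyte's implementation: the node-label keys, the function name, and the GTX label value are hypothetical placeholders, and it assumes the partitioned GPU is still exposed under the `nvidia.com/gpu` resource name. The partition sizes are the MIG profiles NVIDIA documents for the A100 40GB.

```python
# MIG profiles by device label value. Only the A100 40GB is listed here;
# devices missing from this table are treated as not supporting MIG.
MIG_PROFILES = {
    "nvidia-tesla-a100": {"1g.5gb", "2g.10gb", "3g.20gb", "4g.20gb", "7g.40gb"},
}

# Hypothetical node-label keys; real keys depend on how the cluster labels GPU nodes.
DEVICE_LABEL = "example.com/gpu-device"
PARTITION_LABEL = "example.com/gpu-partition-size"


def gpu_scheduling_constraints(device, partition=None, gpu_count=1):
    """Build a node selector and GPU limit for a device, optionally a MIG slice."""
    if partition is not None and partition not in MIG_PROFILES.get(device, set()):
        raise ValueError(f"{device} has no MIG partition {partition!r}")
    node_selector = {DEVICE_LABEL: device}
    if partition is not None:
        node_selector[PARTITION_LABEL] = partition
    return {
        "nodeSelector": node_selector,
        # Assumption: the device plugin advertises GPUs as nvidia.com/gpu.
        "resources": {"limits": {"nvidia.com/gpu": str(gpu_count)}},
    }


# Targeting a whole (hypothetically labelled) GTX card works...
print(gpu_scheduling_constraints("hypothetical-gtx-card"))
# ...and so does asking for an A100 slice.
print(gpu_scheduling_constraints("nvidia-tesla-a100", partition="1g.5gb"))
```

Asking for a partition on the GTX placeholder raises `ValueError`, which mirrors the point that such cards can be selected as whole devices but not fractionalized through MIG.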
-
Hi,
I wonder which GPU models NVIDIA supports for "fractional GPU" use. (Only a few NVIDIA GPUs are listed at https://docs.flyte.org/en/latest/api/flytekit/extras.accelerators.html#predefined-accelerator-constants.) Or what does fractional GPU mean in this context? Does it mean the container cannot consume more GPU memory than specified in the config? Will or could the process be killed if it starts to need more while running, so that it does not jeopardize the other processes on the system?
What determines which GPU models can be used? What about the GTX6000 or a TRX2080TI, for example?
Thank you