GPU support on GKE not available #1246
Comments
Others can correct me if I'm wrong, but I think you need to have the NVIDIA driver DaemonSet installed in your cluster.
Source: https://cloud.google.com/kubernetes-engine/docs/concepts/gpus
@swiftdiaries Unfortunately not. The DaemonSet is already installed by deploy.sh, and to make sure I also applied it manually, with no success.
It doesn't look like you requested GPUs for your notebook.
In the JupyterHub spawner, did you supply extra resource limits?
The docs are a bit buried here.
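To make the suggestion above concrete, here is a minimal sketch of what a GPU resource limit could look like. It assumes the spawner form takes the extra resource limits as a JSON object mapping Kubernetes resource names to quantities; the helper name `gpu_resource_limits` is hypothetical, and `nvidia.com/gpu` is the resource name that appears in the taint from the issue.

```python
import json


def gpu_resource_limits(gpu_count=1):
    """Return a JSON string asking for `gpu_count` NVIDIA GPUs.

    Assumption: the spawner accepts a JSON object of resource limits.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    # "nvidia.com/gpu" is the extended resource name used for NVIDIA GPUs.
    return json.dumps({"nvidia.com/gpu": gpu_count})


# Paste the printed value into the spawner's extra resource limits field.
print(gpu_resource_limits(1))
```

Without a limit like this the notebook pod never asks the scheduler for a GPU, so it can land on a GPU node and still see no device.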
KUBEFLOW_VERSION=0.2.2
After setting up a fairly default cluster on GKE using the "getting-started-gke" deploy.sh with
gpu-pool-initialNodeCount: 1
I could not spawn GPU images on JupyterHub. Removing the taint on the GPU node with
kubectl taint nodes gke-hub-gpu-pool-9d1db964-9gqn nvidia.com/gpu:NoSchedule-
allows me to spawn the image
gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-gpu:v0.2.1
I then created a Jupyter notebook and executed the following:
but I get this error:
more information:
Spawning image
gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-gpu:v0.2.1
via JupyterHub.
Removing the taint:
then of course I still cannot use the GPU, because the container image does not seem to have the right drivers for cloud.google.com/gke-accelerator=nvidia-tesla-k80.
Any help appreciated.
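For anyone hitting the same taint problem: a pod that both requests nvidia.com/gpu and tolerates the nvidia.com/gpu:NoSchedule taint can be scheduled on the GPU node without removing the taint by hand. Below is a minimal sketch of such a pod manifest, built as a plain Python dict. The function name `gpu_notebook_pod`, the pod name, and the container name are hypothetical; the image and the taint key are the ones quoted in the issue.

```python
import json


def gpu_notebook_pod(name, image, gpu_count=1):
    """Build a pod manifest that requests GPUs and tolerates the GPU taint."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "notebook",  # hypothetical container name
                "image": image,
                # Requesting the GPU resource is what exposes the device.
                "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
            }],
            # Matches the nvidia.com/gpu:NoSchedule taint on the GPU node.
            "tolerations": [{
                "key": "nvidia.com/gpu",
                "operator": "Exists",
                "effect": "NoSchedule",
            }],
        },
    }


manifest = gpu_notebook_pod(
    "gpu-notebook-test",  # hypothetical pod name
    "gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-gpu:v0.2.1",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f to check whether a GPU pod schedules and sees the device independently of JupyterHub.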