E2E Test for TFServing with GPUs #291
Comments
I have some cycles to work on this.
…GPUs.
* This is the first step to creating an E2E for the GPU serving kubeflow#291.
* This deployment is suitable for testing that we can deploy the GPU container and not have it crash because of linking errors.
* This caught a bug in the Dockerfile.
* Fix the Dockerfile for the GPU image; we need to remove the symbolic links from /usr/local/nvidia to /usr/local/cuda
* On GKE the device plugin will make drivers available at /usr/local/nvidia and we don't want this to override /usr/local/cuda

Related to kubeflow#291
Bump to P1 since we want to have GPU serving in our 0.1 release. @lluunn How can we serve a model on GPUs and verify that GPUs were actually used?
…GPUs. (#362)
* This is the first step to creating an E2E for the GPU serving #291.
* This deployment is suitable for testing that we can deploy the GPU container and not have it crash because of linking errors.
* This caught a bug in the Dockerfile.
* Fix the Dockerfile for the GPU image; we need to remove the symbolic links from /usr/local/nvidia to /usr/local/cuda
* On GKE the device plugin will make drivers available at /usr/local/nvidia and we don't want this to override /usr/local/cuda

Related to #291
https://stackoverflow.com/questions/42630762/how-to-verify-tensorflow-serving-is-using-gpus-on-a-gpu-instance
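One possible way to automate that check in a test is to ask `nvidia-smi` which processes currently hold GPU memory and look for the model server among them. The snippet below is only a rough sketch under several assumptions: that `nvidia-smi` is on the PATH where the check runs, that the serving binary is named `tensorflow_model_server`, and that process names are visible from that location (inside a container they may not be). The helper names `parse_compute_apps`, `serving_uses_gpu`, and `check_gpu_usage` are made up for illustration and are not part of Kubeflow or TF Serving.

```python
import csv
import io
import subprocess

# nvidia-smi query for processes with a compute context on any GPU.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(output):
    """Parse CSV rows of the form: pid, process_name, used_memory (MiB)."""
    apps = []
    for row in csv.reader(io.StringIO(output), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        pid, name, mem = (field.strip() for field in row)
        # Memory can be reported as "[N/A]"; record it as None in that case.
        used = int(mem) if mem.isdigit() else None
        apps.append({"pid": int(pid), "name": name, "used_memory_mib": used})
    return apps


def serving_uses_gpu(apps, process_hint="tensorflow_model_server"):
    """Return True if a process matching process_hint appears in the list.

    process_hint is an assumption about the serving binary's name.
    """
    return any(process_hint in app["name"] for app in apps)


def check_gpu_usage():
    """Run nvidia-smi and report whether the model server holds a GPU context."""
    result = subprocess.run(QUERY, check=True, capture_output=True, text=True)
    return serving_uses_gpu(parse_compute_apps(result.stdout))
```

In an E2E test, `check_gpu_usage()` would be called after sending at least one prediction request, so the server has had a chance to load the model onto the device. Parsing is kept separate from the subprocess call so it can be tested without a GPU.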
@lluunn This is blocked on #292 and the changes I requested in #383 to pass in a list of parameters to set on the ksonnet component. Once those are fixed, do you want to pick this up? I think the next step would be adding appropriate steps to our E2E workflow to run the test using GPUs, just like we do with CPUs.
I am changing the cluster to kubeflow-ci, which has a GPU pool.
@lluunn Any update on the E2E test?
WIP #442
We need an E2E test for TF Serving with GPUs.
As part of this, we should build it continuously with prow.