Error with CUDA image #363

Open
luissimoesneom opened this issue Nov 27, 2023 · 1 comment

Comments

@luissimoesneom

We tried to create a new Docker container based on the Docker image used in the vLLM example in this repo, and we got the error below:

Errors occurred while bootstrapping the Model Deployment: Start Container Error: unable to start container: Error response from daemon: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.8, please update your driver to a newer version, or use an earlier cuda container: unknown

Image used:
FROM nvidia/cuda:11.8.0-base-ubuntu20.04 as base

What exactly is the issue? How is the Model Deployment example using vLLM supposed to work?

Thank you

@RodrigoDiasDeOliveira

I thought this might help.

The error means the NVIDIA driver on the host is older than the image requires: the NVIDIA container runtime checks the image's cuda>=11.8 condition against the driver and refuses to start the container. To resolve this, you have two main options:

1. Update your NVIDIA driver: install a driver on the host system that supports CUDA 11.8 or later.
2. Use an earlier CUDA container: if updating the driver is not possible, change your Dockerfile to a CUDA base image that your current driver supports.
Steps to Resolve

1. Check your current NVIDIA driver version:
   nvidia-smi
2. If updating the driver, visit the NVIDIA driver download page and install the latest version for your GPU.
3. If using an earlier CUDA version, modify your Dockerfile:
   FROM nvidia/cuda:11.7.0-base-ubuntu20.04 as base
   (Or an even earlier version if needed)
4. Ensure you have the NVIDIA Container Toolkit installed (this assumes NVIDIA's apt repository is already configured on the host):
   sudo apt-get update
   sudo apt-get install -y nvidia-container-toolkit
   sudo systemctl restart docker
5. Rebuild your Docker image and try running the container again.
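
The driver check in the steps above can be sketched as a small script. This is only an illustration, not something from the repo: the helper names (`driver_cuda_version`, `version_ge`, `check_cuda`) are made up here. It assumes the `nvidia-smi` banner includes a "CUDA Version:" field (recent drivers print it) and that `sort -V` is available (GNU coreutils).

```shell
#!/bin/sh
# Sketch: compare the CUDA version the host driver supports against the
# version an image requires. All helper names are hypothetical.

# Read the nvidia-smi banner on stdin and print the "X.Y" that follows
# "CUDA Version:" (assumes the driver prints that field).
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed when version $1 >= version $2.
# Relies on `sort -V` (version sort), a GNU coreutils extension.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print a verdict; $1 = CUDA version the driver supports, $2 = required minimum.
check_cuda() {
    if version_ge "$1" "$2"; then
        echo "ok: driver supports CUDA $1, image needs >= $2"
    else
        echo "too old: driver supports CUDA $1, image needs >= $2"
        return 1
    fi
}

# Usage on a GPU host (11.8 is the requirement named in the error message):
#   check_cuda "$(nvidia-smi | driver_cuda_version)" 11.8
```

If the check reports "too old", that matches the two options given earlier: update the host driver, or pick a base image whose CUDA version is at or below what the driver reports.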
Additional Troubleshooting
If you continue to face issues:

Check the compatibility matrix between CUDA versions and NVIDIA driver versions.
Verify that your GPU supports the CUDA version you're trying to use.
Ensure that the NVIDIA Container Toolkit is correctly installed and configured.
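Try running a simple CUDA container to isolate whether the issue is specific to your vLLM setup or a general CUDA/Docker configuration problem. If this fails with the same requirement error, the cause is the host driver rather than the vLLM setup: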
docker run --gpus all nvidia/cuda:11.8.0-base-ubuntu20.04 nvidia-smi
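
To see where the `cuda>=11.8` condition in the error comes from, you can inspect the image itself. As far as I know, the `nvidia/cuda` images declare the requirement in an `NVIDIA_REQUIRE_CUDA` environment variable, which the NVIDIA container runtime checks at startup. The sketch below extracts the minimum version from an image's environment; the helper name `required_cuda` is made up for illustration, and the usage line assumes the image has already been pulled locally.

```shell
#!/bin/sh
# Sketch: find the minimum CUDA version an image asks for.
# The helper name is hypothetical.

# Read "KEY=value" environment lines on stdin and print the version that
# follows "cuda>=" in NVIDIA_REQUIRE_CUDA; prints nothing if it is absent.
required_cuda() {
    sed -n 's/^NVIDIA_REQUIRE_CUDA=.*cuda>=\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage (requires Docker and a locally available image):
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' \
#       nvidia/cuda:11.8.0-base-ubuntu20.04 | required_cuda
```

Comparing this value with the CUDA version shown in the `nvidia-smi` banner on the host tells you whether the image can start on that driver.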
