How to run with -ngl parameter? #268
With a 7B model and an 8K context I can fit all the layers on the GPU in 6GB of VRAM. Similarly, the 13B model will fit in 11GB of VRAM.
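In the Python binding, the counterpart of llama.cpp's `-ngl` flag is the `n_gpu_layers` argument to `Llama`. A minimal sketch, assuming llama-cpp-python was installed with GPU (e.g. cuBLAS) support; the model path is a placeholder:

```python
# Keyword arguments mirroring the llama.cpp CLI flags; all values are examples.
llama_kwargs = dict(
    model_path="./models/7B/ggml-model-q4_0.bin",  # placeholder path
    n_gpu_layers=32,  # counterpart of -ngl: number of layers to offload to the GPU
    n_ctx=8192,       # counterpart of -c: context size
)

# With llama-cpp-python installed, the model would be loaded like this:
# from llama_cpp import Llama
# llm = Llama(**llama_kwargs)
```

Setting `n_gpu_layers` to a value at least as large as the model's layer count offloads everything to the GPU.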
I get this error when setting that parameter:
In particular it hangs on line 259 of
It looks like your model file is corrupt. Does it work with
Yes, it does. Could you give me a hint on how to better debug the Python binding?
That's really strange. Maybe try DebuggingWithGdb?
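Since the crash is likely in the native llama.cpp code rather than in Python itself, running the interpreter under gdb can surface a C-level backtrace. A sketch of such a session, assuming gdb is installed and `repro.py` is a placeholder name for a minimal script that triggers the crash:

```
$ gdb --args python3 repro.py
(gdb) run
...            # wait for the segmentation fault
(gdb) bt
```

The `bt` output should show which native function the segfault occurs in, which narrows the problem down to the binding or to the model file.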
I think this could be an error due to the encoding of the file, because I tried downloading a pre-quantized model from https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main and running it in Docker, but I get a segmentation fault there as well:
But this is strange: I followed the steps from the link in my first post, and they work just fine for llama.cpp. I don't see why they wouldn't work for the Python binding. How did you prepare the data?
Try cloning
Thanks a lot, that was the issue: I was quantizing with a different version of llama.cpp.
I struggled with the same problem last week. It is caused by the breaking change in ggerganov/llama.cpp#1305 that was rolled out recently, right?
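For anyone hitting the same mismatch, the usual fix is to re-convert and re-quantize the model with the same llama.cpp revision the binding was built against. A rough sketch of the steps; the paths, model layout, and exact script name depend on the llama.cpp revision, so treat these as assumptions:

```
$ git clone https://github.com/ggerganov/llama.cpp
$ cd llama.cpp && make
$ python3 convert.py models/7B/            # produces an f16 ggml file
$ ./quantize models/7B/ggml-model-f16.bin models/7B/ggml-model-q4_0.bin q4_0
```

A model quantized with an older revision uses an incompatible ggml file format and typically fails to load or segfaults, which is what the errors above show.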
Is your feature request related to a problem? Please describe.
I have a low-VRAM GPU and would like to use the Python binding. I can run LLaMA, thanks to https://gist.github.com/rain-1/8cc12b4b334052a21af8029aa9c4fafc , but I couldn't tell whether this is possible with this binding.
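As a back-of-envelope check of whether partial offload would help, one can estimate how many layers fit in VRAM. The per-layer size and overhead figures below are rough assumptions for illustration, not measured values:

```python
def layers_that_fit(vram_gib, per_layer_gib, overhead_gib, n_layers):
    """Rough estimate of how many transformer layers fit in VRAM
    after reserving a fixed overhead for scratch buffers and KV cache."""
    usable = vram_gib - overhead_gib
    return min(n_layers, max(0, int(usable / per_layer_gib)))

# Assumed figures for a q4_0 13B model on a 12 GiB RTX 3060:
# ~40 layers at roughly 0.18 GiB each, ~1.5 GiB overhead.
print(layers_that_fit(12.0, 0.18, 1.5, 40))  # → 40 (all layers fit)
```

Under these assumptions the whole model fits, consistent with the 11GB figure quoted earlier in the thread; with less VRAM, the same arithmetic gives a partial layer count to pass as `n_gpu_layers`.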
Describe the solution you'd like
I want to run a 13B model on my 3060.
Describe alternatives you've considered
https://gist.github.com/rain-1/8cc12b4b334052a21af8029aa9c4fafc
Additional context