Bug: Cannot load DeepSeek-Coder-V2-Instruct #8174
Comments
There is no problem running other models like qwen2_q8_0 or mixtral-8x-7b, and I already tried other quantized variants of the same model with the same result.
Try a lower context size with -c.
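For example, a minimal sketch of such an invocation (the model path and prompt are placeholders):

    # Hypothetical run with the context window reduced to 512 tokens
    ./llama-cli -m DeepSeek-Coder-V2-Instruct-Q2_K.gguf -c 512 -p "test"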
Yes, that let me get a bit further, but not very far. This is the extra log I got even at -c 512:
Can you link to where you downloaded this model?
This particular variant was made by me using llama-quantize on the original model, but there are other ready-to-download quantized variants on Hugging Face, e.g. here (which I actually tried).
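A minimal sketch of that quantization step, assuming an f16 GGUF as input (both file names are placeholders):

    # Quantize an f16 GGUF down to Q2_K with llama.cpp's quantize tool
    ./llama-quantize DeepSeek-Coder-V2-Instruct-f16.gguf DeepSeek-Coder-V2-Instruct-Q2_K.gguf Q2_K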
Can you run it with a debugger, using a build with debug symbols, and see where the exception is being thrown?
Could you help me with the command line to achieve a build with debug symbols?
I assume this should be enough:
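A minimal sketch, assuming the standard CMake workflow for llama.cpp; the model path is a placeholder:

    # Configure a build with debug symbols and compile it
    cmake -B build -DCMAKE_BUILD_TYPE=Debug
    cmake --build build --config Debug

    # On macOS, run llama-cli under lldb to see where the exception is thrown
    lldb -- ./build/bin/llama-cli -m DeepSeek-Coder-V2-Instruct-Q2_K.gguf -c 512 -p "test"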
This should have been fixed in #8160; try updating to master.
Fixed by #8160 and a context size adjustment.
Can someone please explain why this implementation runs significantly slower than a dense model with the same active parameter count?
What happened?
I am trying to use a quantized (q2_k) version of DeepSeek-Coder-V2-Instruct and it fails to load the model completely: the process was killed after some time, every time I tried to run it.
Name and Version
./llama-cli --version
version: 3253 (ab36791)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.5.0
What operating system are you seeing the problem on?
Mac
Relevant log output