When I try to load the 'qwen' gguf file, it shows the error message: "error loading model: unknown model architecture: 'qwen'". Does this mean that the 'qwen' model is not supported? #992
Comments
+1
llama.cpp added support for it last week: ggerganov/llama.cpp#4281
@dgo2dance Sorry, I've been away the past week, but I just had a chance to push this update. Let me know if it works!
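For anyone landing here later, a minimal sketch of how to pick up the fix, assuming the update has been published to PyPI (upgrade with `pip install --upgrade llama-cpp-python`). The model filename below is only a placeholder for your own Qwen GGUF file:

```python
from llama_cpp import Llama

# Hypothetical path to a Qwen GGUF file; substitute your own.
llm = Llama(model_path="./qwen-7b-chat.Q8_0.gguf")

# If the architecture is supported, loading succeeds and a short completion works.
out = llm("Hello", max_tokens=16)
print(out["choices"][0]["text"])
```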
Not sure if this is the same issue, but when I try to load the qwen model (either f16 or q8_0) from an IPython notebook in VS Code, the kernel dies. llama_model = Llama(model_path=path, n_gpu_layers=-1, n_ctx=0) There aren't many errors in the logs. Running the same thing from the command line works fine.
The issue seems to come from the n_ctx parameter. If I set it to 0, initializing the qwen model crashes; initializing the original Llama 2 7B is OK but throws a ValueError at create_completion (seems to be n_batch?). If I don't set n_ctx at all, both models load and complete fine. File ~/anaconda3/envs/hfhub/lib/python3.11/site-packages/llama_cpp/llama.py:1057, in Llama.eval(self, tokens) ValueError: range() arg 3 must not be zero
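In case it helps with debugging, here is a minimal standalone sketch of what I suspect is going on. It assumes the wrapper clamps n_batch against n_ctx somewhere before eval steps through the prompt tokens in batches; this is an illustration of that guess, not llama-cpp-python code:

```python
n_ctx = 0                  # mirrors Llama(..., n_ctx=0)
n_batch = min(n_ctx, 512)  # assumed clamping; with n_ctx=0 this evaluates to 0
tokens = [1, 2, 3, 4]      # stand-in for prompt tokens

# This loop shape mimics batched eval; with n_batch == 0 the range() step is zero.
try:
    for i in range(0, len(tokens), n_batch):
        batch = tokens[i : i + n_batch]
except ValueError as e:
    print(e)  # "range() arg 3 must not be zero"
```

If that's what is happening, passing an explicit non-zero n_ctx (or leaving it unset) would sidestep the error, which matches what I'm seeing.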
Closing this as I believe it was just related to the model not being supported in llama-cpp-python yet. I'll reopen if anyone is still experiencing this issue with the latest version.