Bug: ggml_metal_init error: zero-length arrays are not permitted in C++ float4x4 lo[D16/NW4]; #10208
According to the llama.cpp pull requests in Homebrew, the problem started appearing in Homebrew/homebrew-core#196827, between tags b4034 and b4038 (diff: b4034...b4038).
I can confirm that b4034 works, but b4036 throws the same error.
That narrows the problematic diff down to b4034...b4036.
Commit a1eaf6a came from the merge. cc @ggerganov, any clues?
@stefanb Should be fixed now. Let me know if the issue persists. |
Thanks @ggerganov, waiting for the next tag (>b4056) containing the fix.
@ggerganov, thanks, seems to be fixed 🎉 in
Can confirm, works on my M2 Air! Thank you so much! Still impressed how fast Metal is even for reasonably sized models on a rather low-end laptop.
What happened?
Trying to run llama-server on Apple Silicon (M2) running Ventura. The same error occurs whether I use the latest release or build from source. I'm trying to load Llama-3.2-3B-Instruct F16 from Meta; I created the GGUF using convert_hf_to_gguf.py.
Name and Version
From source
./llama-cli --version
version: 4048 (a71d81c)
built with Apple clang version 15.0.0 (clang-1500.1.0.2.5) for arm64-apple-darwin22.6.0
From the release
$ ./llama-cli --version
version: 4044 (97404c4)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.6.0
What operating system are you seeing the problem on?
Mac
Relevant log output