Building llama.cpp or building libllama.so on a virtualized Linux on Apple silicon does not work. #2344
I'm also encountering this issue when trying to dockerize a project that has llama-cpp-python as a dependency. Downloading and using it locally worked like a charm, but building the Docker container failed with this same error.
I found a solution for using llama.cpp in Apple Silicon Linux VMs (and probably also Docker on Apple Silicon) without changing anything! Just build with the following command for Apple Silicon Linux VMs:
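Based on the variables discussed further down in this thread, the command was presumably:

```sh
# Force the arm64 code path and skip Metal, which isn't
# available inside VMs or Docker on Apple Silicon.
UNAME_M=arm64 UNAME_P=arm LLAMA_NO_METAL=1 make
```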
@AndreasKunar but doesn't your fix disable SIMD on ARM, reducing performance?
@redthing1 - my understanding is that it just force-disables the Metal-framework/MPS support, which is not available in VMs or Docker on Apple Silicon anyway. As I understand it, the Apple Virtualization Framework, which VMs/Docker on Apple Silicon have to use, only exposes a limited subset of the hardware's functionality. I think the CPU code generated by these switches still uses the available CPU SIMD instructions; at least in my tests, it had similar token/s performance in VMs to running without GPU enablement on a macOS host. I'm disappointed that VMs/Docker only seem to support GPU acceleration via CUDA, and probably Nvidia/AMD via CLBlast+OpenCL, on x64 CPUs. I'm still hopeful that Apple might someday extend their obligatory Virtualization Framework to cover its two key shortcomings: a) missing GPU/NPU support (even if only via intermediaries like OpenCL, etc.) and b) missing nested-hypervisor support (which somewhat cripples Linux/Windows in VMs). But they did neither when going from M1 to M2, nor in macOS Sonoma.
Thank you for the explanation. I hope so as well.
I encountered the same issue when switching my Docker image from Debian Bullseye to Bookworm. The problem arises from `UNAME_M` being detected as `aarch64`, which causes build failures on Apple Silicon; setting `ENV UNAME_M=arm64` in the Dockerfile fixes it. @AndreasKunar, please confirm that `UNAME_M=arm64 make` is enough to make it compile correctly.
Sorry, I'm no expert in the llama.cpp Makefile design. To be safe, I would also set `UNAME_P=arm LLAMA_NO_METAL=1`. At least this worked for me.
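For reference, a minimal Dockerfile sketch combining both suggestions; the base image and build steps are assumptions, only the three environment variables come from this thread:

```dockerfile
# Assumed base image; the thread only mentions Bullseye/Bookworm.
FROM debian:bookworm

RUN apt-get update && apt-get install -y build-essential git

# Override the auto-detected values so the Makefile takes the
# arm64 code path, and disable Metal, which is not available
# inside VMs/Docker on Apple Silicon.
ENV UNAME_M=arm64
ENV UNAME_P=arm
ENV LLAMA_NO_METAL=1

RUN git clone https://github.com/ggerganov/llama.cpp /llama.cpp && \
    make -C /llama.cpp
```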
Good find, thanks for sharing.
This issue was closed because it has been inactive for 14 days since being marked as stale.
I'm using a MacBook Air M2 24GB/1TB with Ubuntu 23.04 in a Parallels VM.

This is also an issue for downstream llama-cpp-python, which uses/builds libllama.so.

The compiler flag `-mcpu=native` seems to be the culprit, generating inlining errors. Without it, the build succeeds without any source-code changes, though it produces warnings: I'm not quite sure about "missing braces around initializer"; "unused variable" can probably be ignored. The code executes and works for me, but I could not fully test it.

A good fix is a bit difficult, since an M1 etc. in a VM is not easily detectable. The difference I found is in `uname -p`, which returns "unknown" on Raspberry Pi OS (64-bit) and "aarch64" on virtual Ubuntu on an M2.
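A quick shell sketch of that detection, using the output strings reported above:

```sh
# uname -p reports "aarch64" on virtualized Ubuntu on Apple silicon,
# but "unknown" on Raspberry Pi OS (64-bit), so it can tell the two
# apart where uname -m (aarch64 on both) cannot.
if [ "$(uname -p)" = "aarch64" ]; then
    echo "likely a VM on Apple silicon: skip -mcpu=native"
else
    echo "bare metal (e.g. Raspberry Pi): keep -mcpu=native"
fi
```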
An insert/edit of the Makefile starting at line 262 (right after the comments `# Apple M1, M2, etc.` and `# Raspberry Pi 3, 4, Zero 2 (64-bit)`) works for me. The `filter` result is empty when `UNAME_P` does not start with `aarch64`, so `-mcpu=native` is only set on bare metal like the Pi and skipped in Apple Silicon VMs:

```makefile
ifeq ($(filter aarch64%,$(UNAME_P)),)
    # do not set for Apple running in VMs
    CFLAGS   += -mcpu=native
    CXXFLAGS += -mcpu=native
endif
```
This is a similar problem to closed issue #1655.
I'm new to GitHub and don't know how to write pull requests; I also don't know whether the fix produces undesired side effects on non-Raspbian Linux on Pis.