llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD #5240
Conversation
I assume this is to be able to tell this value at run time, which is useful for example with dynamic linking. In this case, shouldn't …
Yes, my main motivation is to reduce GPU-related conditionals in the header files. Nothing specific in mind; it just seems better like this.
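For illustration, here is a minimal sketch (not from this PR) of why a runtime query is friendlier to dynamic linking: a host program can dlopen() whatever libllama build is installed and ask it directly, instead of baking a compile-time LLAMA_MAX_DEVICES into its own binary. The library filename and the dlopen-based setup are assumptions for the example; link with -ldl.

```c
// Sketch only: query llama_max_devices() from a dynamically loaded libllama.
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

int main(void) {
    void *handle = dlopen("libllama.so", RTLD_NOW);   // library name is an assumption
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    // Resolve the runtime query instead of relying on a compile-time macro.
    size_t (*max_devices)(void) = (size_t (*)(void)) dlsym(handle, "llama_max_devices");
    if (max_devices) {
        printf("llama_max_devices() = %zu\n", max_devices());
    } else {
        fprintf(stderr, "symbol not found: %s\n", dlerror());
    }

    dlclose(handle);
    return 0;
}
```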
There is still one …
If there are any downstream applications that depend on these defines to enable GPU offloading, they may break after this change, so it may be a good idea to put up a notice.
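For such downstream code, a hedged migration sketch (the helper pick_n_gpu_layers is hypothetical, not part of llama.cpp): the compile-time define is gone, so the same decision now goes through the runtime call.

```c
// Sketch: replace the removed #ifdef with the runtime query.
#include <stdio.h>
#include "llama.h"

static int pick_n_gpu_layers(int requested) {
    // Old pattern, which breaks after this change because the define no longer exists:
    // #ifdef LLAMA_SUPPORTS_GPU_OFFLOAD
    //     return requested;
    // #else
    //     return 0;
    // #endif

    // New pattern: ask the loaded library at runtime.
    return llama_supports_gpu_offload() ? requested : 0;
}

int main(void) {
    printf("n_gpu_layers = %d\n", pick_n_gpu_layers(32));
    return 0;
}
```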
force-pushed from 36ea36c to 3cedb7e
…ganov#5240)
* llama : remove LLAMA_MAX_DEVICES from llama.h ggml-ci
* Update llama.cpp Co-authored-by: slaren <[email protected]>
* server : remove LLAMA_MAX_DEVICES ggml-ci
* llama : remove LLAMA_SUPPORTS_GPU_OFFLOAD ggml-ci
* train : remove LLAMA_SUPPORTS_GPU_OFFLOAD
* readme : add deprecation notice
* readme : change deprecation notice to "remove" and fix url
* llama : remove gpu includes from llama.h ggml-ci
Co-authored-by: slaren <[email protected]>
Hi, in the latest llama-cpp-python release (0.2.39) I get this error: …
Getting the same error.
It is probably related to this change.
llama-cpp-python is calling llama_max_devices() though; after all, it is written in Python and gets this value at runtime. I tested it myself with a CUBLAS build and get llama_cpp.LLAMA_MAX_DEVICES = 16.
It's weird that the error says …
I get …
The current version of llama.cpp will always print the …
The only way I can get it to launch is to roll back llama-cpp-python to v0.2.37.
@slaren @cebtenzzre I have the same issue with the latest llama-cpp-python. Is there any workaround to bypass the LLAMA_MAX_DEVICES=0 issue?
Instead of the defines, use the functions:
llama_max_devices()
llama_supports_gpu_offload()
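A minimal usage sketch, assuming the post-change llama.h API (llama_max_devices(), llama_supports_gpu_offload(), llama_model_default_params() and a pointer-based tensor_split field); the n_gpu_layers value of 99 is only an illustrative choice:

```c
// Sketch: query device limits and offload support at runtime instead of via macros.
#include <stdio.h>
#include <stdlib.h>
#include "llama.h"

int main(void) {
    const size_t n_dev = llama_max_devices();            // runtime value, no LLAMA_MAX_DEVICES macro
    printf("max devices: %zu, gpu offload: %s\n",
           n_dev, llama_supports_gpu_offload() ? "yes" : "no");

    // tensor_split buffers are now sized at runtime instead of
    // float tensor_split[LLAMA_MAX_DEVICES].
    float *tensor_split = calloc(n_dev, sizeof(float));
    if (tensor_split && n_dev > 0) {
        tensor_split[0] = 1.0f;                           // put everything on device 0
    }

    struct llama_model_params mparams = llama_model_default_params();
    mparams.tensor_split = tensor_split;
    mparams.n_gpu_layers = llama_supports_gpu_offload() ? 99 : 0;

    // ... load a model with mparams ...

    free(tensor_split);
    return 0;
}
```

Because both values are resolved at runtime, one downstream binary can work against CPU-only and GPU-enabled builds of the library without recompiling.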