- What does the "☁️" mean?
- So "Parallel decoding" is done by …
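In llama.cpp, "parallel decoding" means evaluating tokens from several independent sequences in a single `llama_decode()` call: each row of the batch is tagged with a sequence id, so all sequences share one context and KV cache. This is what the `parallel` example exercises. A minimal sketch, assuming the `llama_batch` C API from around this period; `decode_parallel` and its parameters are illustrative names, not part of the library:

```cpp
#include "llama.h"

// Sketch: submit one new token for each of n_seq independent sequences
// in a single llama_decode() call. Each batch row carries a seq_id, so
// all sequences share the same context and KV cache.
static bool decode_parallel(llama_context * ctx, const llama_token * next_tok,
                            int n_seq, llama_pos pos) {
    llama_batch batch = llama_batch_init(/*n_tokens*/ n_seq, /*embd*/ 0, /*n_seq_max*/ 1);

    for (int s = 0; s < n_seq; ++s) {
        batch.token   [s]    = next_tok[s]; // next token of sequence s
        batch.pos     [s]    = pos;         // same position in every sequence
        batch.n_seq_id[s]    = 1;
        batch.seq_id  [s][0] = s;           // KV cache entries are keyed by seq_id
        batch.logits  [s]    = true;        // request logits so we can sample
    }
    batch.n_tokens = n_seq;

    const int ret = llama_decode(ctx, batch); // one forward pass for all sequences
    llama_batch_free(batch);
    return ret == 0;
}
```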
- Should beam search be added here? I think it is broken atm, at least with CUDA.
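For context, the `beam-search` example implements standard beam search: keep the `beam_width` highest-scoring partial sequences, extend each by every candidate token, and prune back to `beam_width`. A toy, self-contained sketch of one expansion step (not llama.cpp's actual implementation):

```cpp
#include <algorithm>
#include <vector>

// One beam-search expansion step: every beam is extended by every candidate
// token, then the pool is pruned back to the beam_width best cumulative
// log-probabilities.
struct Beam {
    std::vector<int> tokens;   // token ids chosen so far
    float logprob = 0.0f;      // cumulative log-probability
};

std::vector<Beam> beam_step(const std::vector<Beam> & beams,
                            const std::vector<std::vector<float>> & cand_logprobs,
                            size_t beam_width) {
    std::vector<Beam> pool;
    for (size_t b = 0; b < beams.size(); ++b) {
        for (size_t tok = 0; tok < cand_logprobs[b].size(); ++tok) {
            Beam next = beams[b];
            next.tokens.push_back((int) tok);
            next.logprob += cand_logprobs[b][tok];
            pool.push_back(std::move(next));
        }
    }
    const size_t keep = std::min(beam_width, pool.size());
    std::partial_sort(pool.begin(), pool.begin() + keep, pool.end(),
                      [](const Beam & a, const Beam & b) { return a.logprob > b.logprob; });
    pool.resize(keep);
    return pool;
}
```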
- What would be the criteria for considering the OpenCL back-end to be working correctly? I've fixed all known bugs in ggml-opencl.cpp and am now working on refactoring along the lines of #3669.
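One plausible criterion (an assumption on my part, not an agreed project policy) is numerical agreement with the CPU backend: run the same operations on both backends and require the normalized mean squared error of the outputs to stay under a small threshold. A minimal sketch; `check_backend_output` and the `1e-6` threshold are hypothetical:

```cpp
#include <cstdio>
#include <vector>

// Normalized mean squared error between a reference output and a
// backend-under-test output.
double nmse(const std::vector<float> & ref, const std::vector<float> & out) {
    double err = 0.0, ref2 = 0.0;
    for (size_t i = 0; i < ref.size(); ++i) {
        const double d = (double) out[i] - (double) ref[i];
        err  += d * d;
        ref2 += (double) ref[i] * (double) ref[i];
    }
    return ref2 > 0.0 ? err / ref2 : err;
}

// Treat the CPU backend as ground truth and accept the OpenCL result
// only if the error is below the (illustrative) threshold.
bool check_backend_output(const std::vector<float> & cpu_out,
                          const std::vector<float> & opencl_out,
                          double threshold = 1e-6) {
    const double e = nmse(cpu_out, opencl_out);
    printf("NMSE = %.3e (threshold %.1e)\n", e, threshold);
    return e < threshold;
}
```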
- [NO LONGER UPDATED]
Below is a summary of the functionality provided by the llama.cpp project.
Legend (feel free to update):
✅ - Working correctly
☁️ - Partially working
❌ - Failing
❓ - Status unknown (needs testing)
🔬 - Under investigation
🚧 - Currently in development
[Per-feature status table lost in extraction; only the example and backend names survive:]
Examples: main, simple, batched, parallel, speculative, lookahead, infill, server, embedding, beam-search, llava, finetune
Tokenizer tests: test-tokenizer-0-llama, test-tokenizer-0-falcon
Backends: ggml (CPU), ggml-cuda, ggml-metal, ggml-opencl, ggml-vulkan