examples/server: "New UI" chat becomes slower with each subsequent message #7944
Labels
bug-unconfirmed
medium severity
stale
What happened?
When using examples/server's "New UI", parts of the chat history appear to be re-evaluated (bypassing the KV cache?) on each new message from the user. This does not happen with `llama-cli`, or with examples/server in the old UI mode with default settings/prompt.

This seems to be a common failure mode for third-party frontends to llama.cpp; perhaps there is an issue with the API layer that makes this problem difficult for frontends to solve? #7185
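To illustrate the suspected mechanism (a sketch, not llama.cpp code): prompt/KV-cache reuse can only cover the leading tokens that exactly match the previous request, so if a frontend re-serializes earlier turns differently between requests (e.g. re-applies the chat template), the shared prefix shrinks and everything after the first mismatch must be re-evaluated. The token IDs below are toy values:

```python
# Hypothetical sketch: KV-cache reuse depends on the new request's token
# sequence sharing an exact prefix with the previous one.

def common_prefix_len(prev_tokens, new_tokens):
    """Number of leading tokens that match; only these can be reused."""
    n = 0
    for a, b in zip(prev_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

prev = [1, 10, 11, 12, 2]            # turn 1 as tokenized (toy IDs)
good = [1, 10, 11, 12, 2, 13, 14]    # turn 2: history unchanged, new msg appended
bad  = [1, 10, 99, 12, 2, 13, 14]    # turn 2: UI rewrote history, one early token differs

print(common_prefix_len(prev, good))  # 5: entire previous context reusable
print(common_prefix_len(prev, bad))   # 2: everything after token 2 re-evaluated
```

If the "New UI" is sending a prompt whose earlier portion does not byte-for-byte match the previous request, that alone would explain the per-message slowdown.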
Name and Version
version: 3151 (f8ec887)
built with cc (Debian 13.2.0-25) 13.2.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output