Releases: chottolabs/kznllm.nvim
v0.2.3
Major QOL improvements for actually debugging + contributing to the plugin, huge thanks to @dceluis
At a high level:
- stronger reference to default template path (avoids differences in user config - see #6 (comment))
- templates are less error-prone and much easier to read
- jobs throw errors instead of silently crashing Neovim entirely
- added custom deepseek template that respects prompt caching strategy
- various bug fixes
What's Changed
- Enable jinja whitespace control by @dceluis in #18
- fix off by one error when setting extmark by @chottolabs in #20
- Fix anthropic template not being passed prompt args by @dceluis in #22
- Improve curl error handling by @dceluis in #23
- Add default TEMPLATE_DIRECTORY by @dceluis in #24
- throw minijinja error by @chottolabs in #26
- Add Deepseek templates for prompt caching by @chottolabs in #25
Full Changelog: v0.2.2...v0.2.3
v0.2.2
- feat/fix: adding deepseek model by @makyinmars in #16
- Implement prefill for some models by @chottolabs in #17
New Contributors
- @makyinmars made their first contribution in #16
Full Changelog: v0.2.1...v0.2.2
Note:
At this point, only anthropic, deepseek, and vllm have working prefill implementations - however, I have 0 credits for them, so I will have fun with deepseek 🥲
The other thing is that all of their implementations are different (a sketch of the deepseek request shape follows this list):
- anthropic just automatically assumes that if your last message is from the assistant, then it's prefill
- groq expects `stop` but doesn't seem to support prefill at all (in my testing, it tries to start with `` ``` `` and immediately terminates)
- lambda/vllm expects `stop_token_ids` (not even sure if it works properly, i think it's just the `add_generation_prompt` condition in the jinja template)
- deepseek expects `stop` = `` ``` ``, but also requires you to add `prefix = true` to the final assistant message
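For reference, here's roughly what the deepseek request body ends up looking like. A minimal sketch - `system_prompt`, `user_prompt`, and `filetype` are hypothetical placeholders, not the plugin's actual variables:
```lua
-- minimal sketch of a deepseek prefill request body (illustrative only)
local system_prompt, user_prompt, filetype = '...', '...', 'lua'
local data = {
  model = 'deepseek-chat',
  stream = true,
  stop = '```', -- cut the stream off at the closing code fence
  messages = {
    { role = 'system', content = system_prompt },
    { role = 'user', content = user_prompt },
    -- prefill: the model continues from this partial assistant message
    { role = 'assistant', content = '```' .. filetype .. '\n', prefix = true },
  },
}
```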
v0.2.2rc1
wow i'm glad i re-architected it, otherwise i would've just littered it with conditional statements 💀
all the apis do it differently (the anthropic case is sketched after this list):
- anthropic just automatically assumes last message assistant == prefill
- groq expects `stop` but doesn't seem to support prefill at all (in my testing, it tries to start with `` ``` `` and immediately terminates)
- lambda/vllm expects `stop_token_ids` (not even sure if it works properly, i think it's just the `add_generation_prompt` condition from the template)
- deepseek expects `stop`, but also requires you to add `prefix = true` to the assistant message
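for comparison, the anthropic case needs no special field at all - a rough sketch (the model name and `user_prompt` are placeholders, not the plugin's actual code):
```lua
-- minimal sketch of anthropic prefill (illustrative only)
-- no special flag: a trailing assistant message *is* the prefill
local user_prompt = '...'
local data = {
  model = 'claude-3-5-sonnet-20240620',
  max_tokens = 4096,
  stream = true,
  messages = {
    { role = 'user', content = user_prompt },
    -- the completion continues directly from this partial reply
    { role = 'assistant', content = '```' },
  },
}
```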
What's Changed
- feat/fix: adding deepseek model and dressing as dependency by @makyinmars in #16
- implement prefix caching deepseek (5e67b9f)
New Contributors
- @makyinmars made their first contribution in #16
Full Changelog: v0.2.1...v0.2.2rc1
v0.2.1
I'm playing with local models now and it was getting out of hand to deal with the subtle differences in parameters + features that I wanted to support, so here's another big breaking change.
kznllm_demo_v2.mp4
What's Changed
breaking changes - copy the suggested config from the README
subtle bugs related to unexpected behaviors with context have been patched
Related: #14
- fixes long-standing bug where you can't trigger generate after an undo (this exists in dingllm too - see change e20d743)
- all pre-existing kznllm logic is now built into `kznllm.presets` because i'll be building other types of features/workflows based off of the core library (which just provides convenient functions that i can stitch together)
- instead of restricting the context builder to fixed messages, it accepts a generic `make_data_fn` that the user can implement to build up the data portion of the API call with their own settings/parameters (see CONTRIBUTING; a sketch follows this list)
- adds a suggested model switcher under presets
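A minimal sketch of what a user-supplied `make_data_fn` might look like - the signature and the `prompt_args` fields are assumptions for illustration; see CONTRIBUTING for the real contract:
```lua
-- hypothetical make_data_fn: turns rendered prompt args + user opts
-- into the data portion of the API call (shape is illustrative)
local function make_data_fn(prompt_args, opts)
  return {
    model = opts.model,
    temperature = opts.temperature or 0.7,
    stream = true,
    messages = {
      { role = 'system', content = prompt_args.system_prompt },
      { role = 'user', content = prompt_args.user_prompt },
    },
  }
end
```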
Full Changelog: v0.1.0...v0.2.1
v0.2.0rc1
I'm playing with local models now and it was getting out of hand to deal with the subtle differences in parameters + features that I wanted to support, so here's another big breaking change.
The key change was to make it as easy as possible to retrieve data, pipe it straight into an API, and send the outputs somewhere in your nvim buffer without shooting yourself in the foot with the quirks of nvim.
- want a new feature? write your own variation of `invoke_llm`
- want better api features? add a new `preset` with custom options
- want better prompts? write your own templates (or pipe some more data into a custom `make_data_fn`)
Now the functionality from v0.1.0 is actually just a user implementation under lua/kznllm/presets.lua
With the new architecture, it was much easier to do a clean(er) model/preset switcher which will come in handy while I'm doing experiments. Also groq queue times can get up to 15s long (???)
after rewriting this plugin literally 5 times, it's now very clear that model + prompts + api spec + feature set should be very tightly coupled together. what I actually want for myself is the flexibility to implement the whole stack of sub-components from scratch and package it together in a preset.
With this refactor, it's relatively easy to add entirely new features without touching the core of the plugin - they're all just treated as user implementation details (i.e. prompt caching, custom params, or fast + slow retrieval)
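Loosely, a preset bundles those pieces together. A sketch assuming a plausible table shape - every field name here is made up for illustration, not the actual contents of lua/kznllm/presets.lua:
```lua
-- hypothetical preset: model + prompts + api spec + feature set
-- packaged together (all field names are illustrative)
local preset = {
  id = 'deepseek-chat',
  provider = 'deepseek',
  template_directory = 'deepseek', -- prompt templates for this model
  opts = { temperature = 0.7, stop = '```' },
  -- builds the request body; swap this out per provider quirks
  make_data_fn = function(prompt_args, opts)
    return {
      model = 'deepseek-chat',
      stream = true,
      messages = prompt_args.messages,
      temperature = opts.temperature,
      stop = opts.stop,
    }
  end,
}
```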
TODO
- completion model prompt template is really bad and buggy rn, use it with caution
Full Changelog: v0.1.0...v0.2.0
v0.1.0
What's Changed
- REWRITE by @chottolabs in #10
If I have `nvim` open, I'm planning to write natty code. We're back to a single core feature w/ some QOL improvements.
From my previous notes (see https://github.com/chottolabs/kznllm.nvim/releases/tag/v0.0.1):
I played with some ideas in terms of buffer mode and project mode - but ultimately they're too slow + clunky. It should probably just be a single key like `<leader>k` that asks for a short prompt and either (a) replaces the selection w/ a working code fragment or (b) fully implements code when there is no selection.
interface-wise, I think this is pretty close to end game - everything else I would consider configuration (e.g. project-scoped context, evals, etc.).
If you are generating a whole project, then I think you need a new kind of interface like where cursor or zed are going (still not there yet). `nvim` being primarily text-based is probably fundamentally limited when you are trying to achieve high information density.
Full Changelog: v0.0.1...v0.1.0
REWRITE
rewriting everything, only starting with replace mode on this branch
I played with some ideas in terms of buffer mode and project mode - but ultimately they're too slow + clunky. It should probably just be a single key like `<leader>k` that asks for a short prompt and either (a) replaces the selection w/ a working code fragment or (b) fully implements code when there is no selection.
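In a user config, that flow might look something like this - a sketch where `invoke_llm` and its signature are placeholders, not the plugin's actual API:
```lua
-- hypothetical mapping for the single-key flow described above
vim.keymap.set({ 'n', 'v' }, '<leader>k', function()
  local prompt = vim.fn.input('prompt> ')
  if prompt ~= '' then
    -- replaces the visual selection, or implements from scratch
    -- when nothing is selected (placeholder call)
    require('kznllm').invoke_llm(prompt)
  end
end, { desc = 'kznllm: prompt + replace' })
```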
Any other feature doesn't belong inside of nvim IMO, it just makes things too messy for no reason. Everything else is configuration or a supplementary tool to help manage a search index, pull up prompt history, etc. but it's not core to the plugin itself.
I just want everything to work out of the box and not stray too far from the natty neovim experience
I tested out cursor for a while and I felt like the implementation of diff visualization is mostly a gimmick that adds a lot of extra complexity (i.e. it doesn't just diff the whole selection, it's selectively changing blocks of lines). A simple plugin would just replace the visual selection and let the user spam undo/redo... this way I get most of the benefit without clogging up the screen at all.
It might be useful if it's some massive multi-file refactor, but i don't think anyone has figured out a usable interface for that yet either.
some features i'm targeting:
- prompts, templates, and search heuristics should be transparent to me as a user
- prunable RAG for examples/docs/snippets (i.e. keep around enough data to tune ranking)
- tools for evals + managing context
EDIT: hang on wtf is cursor composer holy shit lmao