Releases: henk717/koboldcpp
v1.57 - Vulkan Only Pre-release
This is a Vulkan-only build of the upcoming v1.57; please check https://koboldai.org/cpp to see whether v1.57 has already been released.
If v1.57 has had a formal release, this build offers no advantages for you.
v1.59-Ofast
v1.59, but the Makefile is changed to use -Ofast, for comparative testing.
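For context, -Ofast is essentially -O3 plus -ffast-math, which allows the compiler to reorder floating-point operations in ways that can subtly change results; that is why this build is for comparative testing rather than a straight upgrade. A minimal sketch of the kind of difference involved (my own illustration, not koboldcpp code):

```c
/* fastmath_probe.c — hypothetical probe, not from the koboldcpp sources.
 * Build it twice and compare the output:
 *   gcc -O3    fastmath_probe.c -o probe_o3
 *   gcc -Ofast fastmath_probe.c -o probe_ofast
 */
#include <stdio.h>

int main(void) {
    float big = 1e8f, tiny = 1.0f, sum = 0.0f;
    for (int i = 0; i < 10000; i++) {
        /* Under strict IEEE rules the tiny increment is rounded away when
         * big is added; under -Ofast the compiler may reassociate so that
         * big cancels out, and the two binaries can disagree. */
        sum += tiny;
        sum += big;
        sum -= big;
    }
    printf("sum = %f\n", sum);
    return 0;
}
```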
1.52 - Linux
v1.52, with a placebo commit to maybe fix CI.
1.51.1 - Linux Binary Test
This is a special test release for Linux; for other builds, check https://koboldai.org/cpp
1.35
This repository is only used on special occasions for compiled builds; get the latest from https://koboldai.org/cpp
Koboldcpp 1.35 build with sched_yield enabled and CUDA 11.4 for better GPU compatibility
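For reference, the sched_yield setting controls what the worker threads do while they spin-wait for work: instead of burning a full core polling, each failed poll hands the CPU back to the scheduler. A minimal sketch of that pattern on a POSIX system (my own illustration, not the actual ggml threading code):

```c
/* spin_yield.c — illustrative spin-wait, not koboldcpp source.
 * Build: gcc -pthread -DUSE_SCHED_YIELD spin_yield.c
 */
#include <pthread.h>
#include <sched.h>      /* sched_yield (POSIX) */
#include <stdatomic.h>
#include <stdio.h>

static atomic_int ready = 0;

static void *worker(void *arg) {
    (void)arg;
    /* Poll a shared flag until the main thread releases us. */
    while (!atomic_load_explicit(&ready, memory_order_acquire)) {
#ifdef USE_SCHED_YIELD
        sched_yield();  /* cooperative: let other threads run meanwhile */
#endif                  /* otherwise: pure busy-wait, lowest latency    */
    }
    puts("worker released");
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    atomic_store_explicit(&ready, 1, memory_order_release);
    pthread_join(t, NULL);
    return 0;
}
```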
H2 update: Compiled in a VM for better dependency stability and CUDA 11.4 support. (Still shows H as the version, but it is newer than the henk_cuda build from concedo's repository.)
H3 update: Same source code as the previous versions apart from the version name change; recompiled with a different psutil (from conda instead of pip) to make high priority mode work again.
Win7 build: Compiled without PrefetchVirtualMemory; normally Windows 7 is only supported on the Fallback backend. This is a limited-edition build that supports Windows 7 on hopefully all backends (CUDA not tested), at the expense of model loading speed.
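For background, PrefetchVirtualMemory only exists on Windows 8 and later, so an exe that imports it directly will refuse to load on Windows 7. The usual compatibility pattern, sketched below (my own illustration; this build instead compiles the call out, which has the same fallback effect), is to resolve the symbol at runtime and skip the prefetch when it is missing:

```c
/* prefetch_compat.c — illustrative only, not the koboldcpp loader. */
#include <windows.h>
#include <stdio.h>

/* Local mirror of WIN32_MEMORY_RANGE_ENTRY so this compiles on pre-Win8 SDKs. */
typedef struct { PVOID VirtualAddress; SIZE_T NumberOfBytes; } MemRange;

typedef BOOL (WINAPI *PrefetchFn)(HANDLE, ULONG_PTR, MemRange *, ULONG);

static void prefetch_if_available(void *addr, SIZE_T len) {
    /* Look the symbol up at runtime: NULL on Windows 7, valid on 8+. */
    PrefetchFn pfn = (PrefetchFn)GetProcAddress(
        GetModuleHandleA("kernel32.dll"), "PrefetchVirtualMemory");
    if (pfn) {
        MemRange range = { addr, len };
        pfn(GetCurrentProcess(), 1, &range, 0); /* hint: page it in now */
    }
    /* else: pages fault in on first access, so model loading is slower */
}

int main(void) {
    SIZE_T len = 1 << 20;
    void *buf = VirtualAlloc(NULL, len, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    prefetch_if_available(buf, len);
    printf("prefetch attempted on %lu bytes\n", (unsigned long)len);
    return 0;
}
```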
Tools: Compilation of all the GGML conversion tools (make tools)
v1.0.3 - Windows
llamacpp-for-kobold-1.0.3
- Applied the massive refactor from the parent repo. It was a huge pain, but I managed to keep the old tokenizer untouched and retained full support for the original model formats.
- Greatly reduced the default batch sizes, as large batches were causing bad output and high memory usage.
- Supports dynamic context lengths sent from the client (see the sketch after this list).
- TavernAI is working, although I wouldn't recommend it: it spams the server with multiple huge-context requests, so you're going to have a very painful time getting responses.
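The dynamic context length arrives on the generate request itself, so the client decides per call how much context the server should use. A rough sketch of such a request, assuming the KoboldAI-style /api/v1/generate endpoint, the default port 5001, and the max_context_length field (all three are assumptions about this early build, not verified against it):

```c
/* client_ctx.c — hypothetical client, not part of this release.
 * Build: gcc client_ctx.c -lcurl
 */
#include <curl/curl.h>
#include <stdio.h>

int main(void) {
    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    /* The client, not the server, picks the context size per request. */
    const char *body =
        "{\"prompt\": \"Once upon a time\","
        " \"max_context_length\": 1024,"  /* assumed field name */
        " \"max_length\": 80}";

    struct curl_slist *hdrs =
        curl_slist_append(NULL, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_URL,
                     "http://localhost:5001/api/v1/generate");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);
    CURLcode rc = curl_easy_perform(curl); /* response prints to stdout */

    curl_slist_free_all(hdrs);
    curl_easy_cleanup(curl);
    return rc == CURLE_OK ? 0 : 1;
}
```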
To use, drag and drop a compatible quantized model for llamacpp onto the exe.
v1.0.2 - 2048 Context - Windows
The original release was limited to 512 tokens of context; this release raises that to 2048 and uses all cores available on the system.
1.0.2 - Windows
A standalone LLaMAcpp server for KoboldAI; includes KoboldAI Lite.
To use, drag and drop a compatible quantized model for llamacpp onto the exe.