
Releases: YellowRoseCx/koboldcpp-rocm

KoboldCPP-v1.77.yr1-ROCm

06 Nov 18:57
  • Bring Speed Back

Upstream llama.cpp introduced a change that calculates certain values in full 32-bit precision by default, which caused a major slowdown for some users with AMD GPUs. This release reverts that change until improvements are made.

KoboldCPP-v1.77.yr0-ROCm

03 Nov 09:42
8d3449d
Update dependencies in cmake-rocm-windows.yml

KoboldCPP-v1.76.yr1-ROCm

14 Oct 21:41

KoboldCPP-v1.76.yr0-ROCm

13 Oct 17:44
f27e9a9

Upstream changes and rocBLAS GPU file reconstruction to attempt to fix the issue some RX 7000 GPU users were experiencing.

Oct/14/2024 2:36PM CST - This build may be broken for some users.
Try https://github.com/YellowRoseCx/koboldcpp-rocm/releases/tag/v1.76.yr1-ROCm instead.
I'm sorry for the inconvenience; please work with me as I try to solve any errors 🙏

KoboldCPP-v1.75.2.yr1-ROCm

27 Sep 17:44
  • Recompiled with gfx906 support
  • Disabled MMQ by default
  • Updated make_pyinstaller.sh (used to create a single Linux executable)

KoboldCPP-v1.75.2.yr0-ROCm

23 Sep 18:56
42edf71
Update cmake-rocm-windows.yml remove openblas

KoboldCPP-v1.74.yr0-ROCm

01 Sep 23:26
Merge remote-tracking branch 'upstream/concedo'

v1.73.1.yr1-ROCm v6.2.0

24 Aug 04:57
d15e1fd

KoboldCPP-ROCm v1.73.yr1


KoboldCPP-ROCm v1.73.yr1 with rocBLAS from ROCm v6.2.0 (the latest release, newer than the official Windows version)

I built rocBLAS and the Tensile library files for the following GPU architectures: gfx803;gfx900;gfx1010;gfx1030;gfx1031;gfx1032;gfx1100;gfx1101;gfx1102, using the code from the ROCm 6.2.0 release.
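For anyone wanting to reproduce a build like this themselves, the general shape is roughly as follows. This is a hedged sketch, not the exact commands used for this release: the `rmake.py` invocation and the `-a` architecture flag are assumptions based on how the ROCm math libraries are commonly built on Windows, and they may differ between rocBLAS versions.

```shell
# Hypothetical sketch: clone the rocBLAS source at the ROCm 6.2.0 tag
# and build the library plus Tensile kernels for a list of GPU targets.
# Flag names and the repository layout may vary by rocBLAS version.
git clone -b rocm-6.2.0 https://github.com/ROCm/rocBLAS.git
cd rocBLAS
python rmake.py -a "gfx803;gfx900;gfx1010;gfx1030;gfx1031;gfx1032;gfx1100;gfx1101;gfx1102"
```

The resulting rocBLAS.dll and Tensile data files are what get bundled alongside the KoboldCpp executable.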

I was able to test gfx1010 (RX 5600 XT) and gfx1030 (RX 6800 XT); both worked separately and together (multi-GPU appears to require the Low VRAM setting).

  • NEW: Added dual-stack (IPv6) network support. KoboldCpp now properly runs on IPv6 networks, the same instance can serve both IPv4 and IPv6 addresses automatically on the same port. This should also fix problems with resolving localhost on some systems. Please report any issues you face.
  • NEW: Pure CLI Mode - Added --prompt, allowing KoboldCpp to be used entirely from command-line alone. When running with --prompt, all other console outputs are suppressed, except for that prompt's response which is piped directly to stdout. You can control the output length with --promptlimit. These 2 flags can also be combined with --benchmark, allowing benchmarking with a custom prompt and returning the response. Note that this mode is only intended for quick testing and simple usage, no sampler settings will be configurable.
  • Changed the default benchmark prompt to prevent stack overflow on old bpe tokenizer.
  • Pre-filter to the top 5000 token candidates before sampling; this greatly improves sampling speed on models with massive vocab sizes, with negligible response changes.
  • Moved chat completions adapter selection to Model Files tab.
  • Improve GPU layer estimation by accounting for in-use VRAM.
  • --multiuser now defaults to true. Set --multiuser 0 to disable it.
  • Updated Kobold Lite, multiple fixes and improvements
  • Merged fixes and improvements from upstream, including Minitron and MiniCPM features (note: there are some broken minitron models floating around - if stuck, try this one first!)
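As a concrete illustration of the new pure CLI mode described above, a launch might look like the following. The model filename here is a placeholder, not a file shipped with this release:

```shell
# Generate a single response and write it to a file; in --prompt mode all
# other console output is suppressed, so only the response reaches stdout.
# (mymodel.gguf is a placeholder for whatever model you run.)
./koboldcpp_rocm --model mymodel.gguf \
  --prompt "Summarize the plot of Hamlet in two sentences." \
  --promptlimit 120 > response.txt

# The same flags can be combined with --benchmark to benchmark
# with a custom prompt and still capture the response:
./koboldcpp_rocm --model mymodel.gguf --benchmark \
  --prompt "Summarize the plot of Hamlet in two sentences."
```

Remember that this mode is intended for quick testing only; sampler settings are not configurable from it.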

Hotfix 1.73.1 - Fixed the broken DRY sampler, fixed sporadic streaming issues, and added a letterboxing mode for images in Lite. The previous v1.73 release was buggy, so you are strongly advised to upgrade to this patch release.


To use, download and run koboldcpp_rocm.exe, which is a one-file pyinstaller.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).

For more information, be sure to run the program from command line with the --help flag.
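Besides the browser UI at that address, other programs can talk to the running server over HTTP. The sketch below is a minimal example, assuming the KoboldAI-compatible `/api/v1/generate` endpoint on the default port 5001; it requires a server already launched as described above.

```shell
# Hedged sketch: POST a prompt to a locally running KoboldCpp instance.
# Assumes the default port (5001) and the KoboldAI-compatible JSON API;
# the response arrives as JSON on stdout.
curl -s http://localhost:5001/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, world.", "max_length": 80}'
```

This is handy for scripting against a long-running instance, as opposed to the one-shot `--prompt` CLI mode.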

Discussion: KoboldCPP-ROCm v1.73.1.yr1-ROCm v6.2.0 (#64)

rocBLAS 4.2.0 for ROCm 6.2.0 for Windows

24 Aug 01:04

GPU Tensile library files for gfx803;gfx900;gfx1010;gfx1030;gfx1031;gfx1032;gfx1100;gfx1101;gfx1102 and rocBLAS.dll, built with ROCm 6.2.0 code.

KoboldCPP-v1.72.yr0-ROCm

05 Aug 06:56
Merge remote-tracking branch 'upstream/concedo'