
2.2.4 Backend: TabbyAPI


Handle: tabbyapi
URL: http://localhost:33931

TabbyAPI supports Python 3.10, 3.11, and 3.12 and is licensed under AGPL v3.

An OpenAI-compatible exllamav2 API that's both lightweight and fast

Models

  • Supports the same set of models as exllamav2

HuggingFaceDownloader

Harbor integrates with the HuggingFaceDownloader CLI, which can be used to download models for the TabbyAPI service.

# [Optional] lookup models on the HF Hub
harbor hf find exl2

# [Optional] If pulling from a closed or gated repo
# Pre-configure the HF access token
harbor hf token <your-token>

# 1. Download the desired model, use "user/repo" specifier
# Note the "./hf" directory set as the download location - this is
# where the HuggingFace cache is mounted for the downloader CLI
harbor hf dl -m Annuvin/gemma-2-2b-it-abliterated-4.0bpw-exl2 -s ./hf
harbor hf dl -m bartowski/Phi-3.1-mini-4k-instruct-exl2 -s ./hf -b 8_0

# 2. Set the model to run
# Use the same specifier as for the downloader
harbor tabbyapi model Annuvin/gemma-2-2b-it-abliterated-4.0bpw-exl2
harbor tabbyapi model bartowski/Phi-3.1-mini-4k-instruct-exl2

# 3. Start the service
harbor up tabbyapi
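Once a model is set, the configuration can be sanity-checked before starting the service. A minimal sketch, assuming Harbor's config get and url commands behave as in recent Harbor releases:

# Confirm which model TabbyAPI is configured to load
harbor config get tabbyapi.model.specifier

# Print the URL the service will be reachable at once started
harbor url tabbyapi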

Native HF CLI

# Download with a model specifier
harbor hf download ChenMnZ/Mistral-Large-Instruct-2407-EfficientQAT-w2g64-GPTQ
# With a specific revision
harbor hf download turboderp/Llama-3.1-8B-Instruct-exl2 --revision 6.0bpw

# Grab the actual folder name
harbor find ChenMnZ

# Set the model to run
harbor config set tabbyapi.model.specifier /hub/models--ChenMnZ--Mistral-Large-Instruct-2407-EfficientQAT-w2g64-GPTQ/snapshots/f46105941fa36d2663f77f11840c2f49a69d6681/
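The long snapshot path above follows the standard HuggingFace cache layout (hub/models--&lt;org&gt;--&lt;name&gt;/snapshots/&lt;revision-hash&gt;/). If harbor find does not surface it, the hash can also be read from the host-side cache directly; a sketch assuming the cache lives under ./hf as in the downloader steps above:

# Locate the snapshot hash on the host side
ls ./hf/hub/models--ChenMnZ--Mistral-Large-Instruct-2407-EfficientQAT-w2g64-GPTQ/snapshots/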

Starting

TabbyAPI exposes an OpenAI-compatible API and can be used with related services directly.

# [Optional] Pull the tabbyapi images
harbor pull tabbyapi

# Start the service
harbor up tabbyapi

# [Optional] Set additional arguments
harbor tabbyapi args --log-prompt true

# See TabbyAPI docs
harbor tabbyapi docs
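
Once the service is up, the OpenAI-compatible API can be exercised directly. A minimal chat completion sketch, assuming the default URL above; TabbyAPI expects an API key, which it writes to its api_tokens.yml on first start (the key below is a placeholder):

curl http://localhost:33931/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-tabbyapi-key>" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'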

Configuration

Harbor mounts a few volumes into the TabbyAPI container:

  • Host HuggingFace cache - /models/hf
  • llama.cpp cache - /models/llama.cpp
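
Models downloaded into the host cache therefore appear inside the container without any copying. A quick way to confirm the mount, assuming Harbor's default harbor.tabbyapi container name (check docker ps if yours differs):

# List the mounted HuggingFace cache from inside the container
docker exec harbor.tabbyapi ls /models/hf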