Getting Started

Set up the development environment

Configuration & Install

Configuration

The default source configuration for torch in pyproject.toml:

torch = "^2.4.1"

If you want to explicitly target CPU, CUDA (NVIDIA GPU), or ROCm (AMD GPU), remove the line torch = "^2.4.1" and add one of the snippets below:

CPU

#PYTORCH
torch = { version = "^2.4.1", source = "pytorch-src" }

[[tool.poetry.source]]
name = "pytorch-src"
url = "https://download.pytorch.org/whl/cpu"
priority = "explicit"

NVIDIA

  • CUDA 11.8

#PYTORCH
torch = { version = "^2.4.1", source = "pytorch-src" }

[[tool.poetry.source]]
name = "pytorch-src"
url = "https://download.pytorch.org/whl/cu118"
priority = "explicit"

  • CUDA 12.1: use the default configuration
  • CUDA 12.4

#PYTORCH
torch = { version = "^2.4.1", source = "pytorch-src" }

[[tool.poetry.source]]
name = "pytorch-src"
url = "https://download.pytorch.org/whl/cu124"
priority = "explicit"

AMD

PyTorch does not support AMD GPUs on Windows; this configuration applies to Linux only.

#PYTORCH
torch = { version = "^2.4.1", source = "pytorch-src" }

[[tool.poetry.source]]
name = "pytorch-src"
url = "https://download.pytorch.org/whl/rocm6.1"
priority = "explicit"
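
If you change the torch source after an initial install, Poetry has to re-resolve the lock file before the new wheel source takes effect. A minimal sketch (assuming Poetry 1.2+ is already installed):

# Re-resolve dependencies after editing pyproject.toml, then reinstall
poetry lock
poetry install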

Install libraries

poetry install
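
To check that the expected PyTorch build was installed (for example, that CUDA is visible after choosing an NVIDIA source), a quick sanity check:

# Prints the installed torch version and whether a CUDA device is available
poetry run python -c "import torch; print(torch.__version__, torch.cuda.is_available())"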

Run

poetry run python engine/server.py
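
Environment variables (documented in the next section) can also be set inline for a single run. For example, turning off the debugger and pointing the app at a custom data directory (/data/ask-that is only an illustrative path):

# AT_DEBUG_MODE and AT_APP_DIR are described in the Env configuration table below
AT_DEBUG_MODE=off AT_APP_DIR=/data/ask-that poetry run python engine/server.py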

Env configuration

| Name | Default (Optional) | Note |
| --- | --- | --- |
| AT_DEBUG_MODE | on | Turn on the app debugger |
| AT_APP_DIR | tmp | Directory for the database, vector store, and models |
| AT_AUDIO_CHUNK_RECOGNIZE_DURATION | 30 (seconds) | Chunk length for audio segments used to detect the language |
| AT_AUDIO_CHUNK_RECOGNIZE_THRESHOLD | 120 (seconds) | Audio shorter than this is not chunked for language detection |
| AT_AUDIO_CHUNK_CHAPTER_DURATION | 600 (seconds) | Chunk length used to automatically split a long audio file |
| AT_LANGUAGE_PREFER_USAGE | en | Default subtitle language |
| AT_QUERY_SIMILAR_THRESHOLD | 0.2 | Similarity threshold for querying related documents for each question |
| AT_TOKEN_CONTEXT_THRESHOLD | 2048 | Token threshold below which the whole transcript is used when no context is found |
| AT_AUDIO_ENHANCE_ENABLED | off | Enable the audio enhancement step (experimental) |
| AT_RAG_QUERY_IMPLEMENTATION | multiquery | RAG query-enhancement type ("multiquery", "fusion", "decomposition", "step_back", "hy_de") |
| AT_GEMINI_API_KEY | None | Required to use Google for embedding and QA |
| AT_OPENAI_API_KEY | None | Required to use OpenAI for embedding and QA |
| AT_CLAUDE_API_KEY | None | Required to use Claude for QA |
| AT_VOYAGEAI_API_KEY | None | Required to use VoyageAI for embedding |
| AT_MISTRAL_API_KEY | None | Required to use Mistral for embedding and QA |
| AT_GEMINI_EMBEDDING_MODEL | models/text-embedding-004 | Preferred Gemini model for embedding texts |
| AT_OPENAI_EMBEDDING_MODEL | text-embedding-ada-002 | Preferred OpenAI model for embedding texts |
| AT_VOYAGEAI_EMBEDDING_MODEL | voyage-large-2 | Preferred VoyageAI model for embedding texts |
| AT_MISTRAL_EMBEDDING_MODEL | mistral-embed | Preferred Mistral model for embedding texts |
| AT_LOCAL_EMBEDDING_MODEL | intfloat/multilingual-e5-base | Preferred local model for embedding texts |
| AT_LOCAL_EMBEDDING_DEVICE | auto | Device for local text embedding (prefer "mps", then "cuda", otherwise "cpu") |
| AT_SPEECH_TO_TEXT_PROVIDER | local | Speech-to-text provider (local, openai, gemini) |
| AT_LOCAL_WHISPER_MODEL | base | Local Whisper model for speech-to-text |
| AT_LOCAL_WHISPER_DEVICE | auto | Device for local speech-to-text (prefer "cuda", otherwise "cpu") |
| AT_LOCAL_OLLAMA_HOST | http://localhost:11434 | Ollama host to connect to |
| AT_LOCAL_OLLAMA_MODEL | qwen2 | Ollama model used for QA |
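
Assuming the server reads these variables from the process environment, they can be exported in the shell (or kept in a .env file if your shell tooling loads one) before starting the server; the values below are only examples:

# Example overrides; adjust to your setup
export AT_APP_DIR=/data/ask-that        # illustrative path, not a required location
export AT_LANGUAGE_PREFER_USAGE=en
export AT_SPEECH_TO_TEXT_PROVIDER=local
export AT_LOCAL_WHISPER_MODEL=base
poetry run python engine/server.py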

Preferred ENV for running locally

If your PC has an NVIDIA GPU, use the "Recommendation" settings.

| Name | Value | Recommendation | Note |
| --- | --- | --- | --- |
| AT_LOCAL_OLLAMA_HOST | http://localhost:11434 | - | - |
| AT_LOCAL_OLLAMA_MODEL | qwen2 | llama3.1 | - |
| AT_LOCAL_EMBEDDING_MODEL | intfloat/multilingual-e5-base | intfloat/multilingual-e5-large | - |
| AT_LOCAL_EMBEDDING_DEVICE | cpu | gpu | - |
| AT_LOCAL_WHISPER_MODEL | base | large-v3 | - |
| AT_LOCAL_WHISPER_DEVICE | cpu | gpu | - |
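
Putting the recommended values together for an NVIDIA machine, one possible shell setup is sketched below. The llama3.1 model must first be pulled into Ollama; the export values mirror the table above (the device strings are copied from the table, so switch them to "cuda" if your setup expects the values listed in the Env configuration table):

# Pull the recommended Ollama model first
ollama pull llama3.1

# Recommended settings from the table above
export AT_LOCAL_OLLAMA_HOST=http://localhost:11434
export AT_LOCAL_OLLAMA_MODEL=llama3.1
export AT_LOCAL_EMBEDDING_MODEL=intfloat/multilingual-e5-large
export AT_LOCAL_EMBEDDING_DEVICE=gpu
export AT_LOCAL_WHISPER_MODEL=large-v3
export AT_LOCAL_WHISPER_DEVICE=gpu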

Notes

  • If you want to use free services with better results, use:
    • VoyageAI for embedding - free for 50M tokens without limitation.
    • Gemini 1.5 Flash for QA - free, with a rate limit.

Preferred ENV for free services (with limitations)

If your PC has an NVIDIA GPU, use the "Recommendation" settings.

| Name | Value | Recommendation | Note |
| --- | --- | --- | --- |
| AT_LOCAL_WHISPER_MODEL | base | large-v3 | - |
| AT_LOCAL_WHISPER_DEVICE | cpu | gpu | - |
| AT_VOYAGEAI_API_KEY | [enter-your-voyageai-api-key] | [enter-your-voyageai-api-key] | - |
| AT_VOYAGEAI_EMBEDDING_MODEL | voyage-large-2 | voyage-large-2 | - |
| AT_GEMINI_API_KEY | [enter-your-gemini-api-key] | [enter-your-gemini-api-key] | - |
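
For this free setup, the keys are exported the same way as the other variables; replace the bracketed placeholders with your own API keys (the placeholders themselves are not valid values):

# Local Whisper for transcription, VoyageAI for embeddings, Gemini for QA
export AT_LOCAL_WHISPER_MODEL=large-v3
export AT_LOCAL_WHISPER_DEVICE=gpu
export AT_VOYAGEAI_API_KEY=[enter-your-voyageai-api-key]
export AT_VOYAGEAI_EMBEDDING_MODEL=voyage-large-2
export AT_GEMINI_API_KEY=[enter-your-gemini-api-key]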