... Continuous integration benchmarking for the LLM inference engine llama.cpp
Spinning up GPU virtual machines can get expensive 💰 please proceed carefully...
ToDo...
- Azure cloud by Microsoft - we are targeting MS Azure exclusively (at this point)
- Azure CLI - `az` is the command-line tool used to manage Azure infrastructure
- node.js - application logic, llama.cpp process control and workflow
- InfluxDB - time series database (TSDB) for logs and metrics
- CouchDB - simple, robust, noSQL database (distributable) - for config and test result summaries
- Telegraf - for gathering machine and process telemetry
- [Later] Grafana for telemetry visualisation
- [Later] Python & Jupyter for benchmark data analysis
Test results are collated and pushed to llama.cpp GitHub repo after each CI benchmarking run.
- always-on small cloud VM
- Ubuntu 22.04
- node.js, nvm & PM2 to manage cron & processes
- InfluxDB database for logs & telemetry
- CouchDB database for config & bench result summaries
- Telegraf to gather machine telemetry
- polls the GitHub API periodically for the latest llama.cpp release
- instructs Azure to create a GPU VM (⚙️bench-runner) and configures infrastructure etc using `az` CLI commands
- exports test result extracts and commits them to the llama.cpp GitHub repo
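The polling step above could look roughly like the following. This is a minimal sketch, not the project's code: the repository path `ggerganov/llama.cpp`, the function names, and the idea of comparing against a stored "last benched" tag are all assumptions for illustration. It relies on the global `fetch` available in Node.js 18+ and on GitHub's "latest release" REST endpoint, which returns a `tag_name` field.

```javascript
// Sketch only: the repository path below is an assumption, not taken from this README.
const RELEASES_URL =
  "https://api.github.com/repos/ggerganov/llama.cpp/releases/latest";

// Fetch the tag of the latest release. `fetchImpl` is injectable so the
// function can be exercised without network access.
async function fetchLatestTag(fetchImpl = fetch) {
  const res = await fetchImpl(RELEASES_URL, {
    headers: { Accept: "application/vnd.github+json" },
  });
  if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
  const release = await res.json();
  return release.tag_name;
}

// Decide whether a bench run is needed. `lastBenchedTag` is hypothetical
// state that the conductor would keep somewhere (for example in CouchDB).
function needsBench(latestTag, lastBenchedTag) {
  return Boolean(latestTag) && latestTag !== lastBenchedTag;
}

// One polling cycle: look up the latest release and report what to do.
async function pollOnce(lastBenchedTag, fetchImpl = fetch) {
  const latestTag = await fetchLatestTag(fetchImpl);
  return { latestTag, startBench: needsBench(latestTag, lastBenchedTag) };
}
```

A scheduler (cron via PM2, or a plain `setInterval`) could call `pollOnce` and create the bench-runner VM only when `startBench` is true. Unauthenticated GitHub API requests are rate limited, so a long polling interval is probably sensible.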
- bench test data
- Azure machine image data
- result summaries
- llama.cpp stdout stream with timestamps
- nvidia-smi metrics
- cpu, gpu, ram metrics & other machine metrics
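To make the "stdout stream with timestamps" item concrete, here is a small sketch of how one stdout line could be encoded in InfluxDB line protocol before being written. The measurement name `llama_stdout`, the tag keys, and the `message` field name are hypothetical; only the escaping rules and the nanosecond timestamp come from the line protocol format itself.

```javascript
// Escape helpers following the InfluxDB line protocol rules:
// measurements escape commas and spaces; tag keys and values also escape "=";
// string field values escape double quotes and backslashes.
const escapeMeasurement = (s) => String(s).replace(/[, ]/g, "\\$&");
const escapeTag = (s) => String(s).replace(/[,= ]/g, "\\$&");
const escapeString = (s) => String(s).replace(/["\\]/g, "\\$&");

// Encode one log line. `timestampMs` is an integer millisecond timestamp
// (for example Date.now()); line protocol defaults to nanosecond precision.
function toLineProtocol(measurement, tags, message, timestampMs) {
  const tagPart = Object.keys(tags)
    .sort() // sorted tag keys are recommended for write performance
    .map((key) => `,${escapeTag(key)}=${escapeTag(tags[key])}`)
    .join("");
  const timestampNs = BigInt(timestampMs) * 1000000n;
  return (
    `${escapeMeasurement(measurement)}${tagPart} ` +
    `message="${escapeString(message)}" ${timestampNs}`
  );
}

// Example with hypothetical measurement and tag names.
console.log(
  toLineProtocol(
    "llama_stdout",
    { host: "bench-runner", release: "master-d7d2e6a" },
    "example stdout line",
    Date.now()
  )
);
```

Tagging each point with the release under test would let queries group the stdout and telemetry of one bench session together, though whether to use tags or fields for that is a design choice this sketch does not settle.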
- ephemeral VM
- Ubuntu, Debian or Windows VM
- node.js, nvm & PM2
- nvidia drivers etc
- make, gcc etc
- Telegraf to gather machine telemetry & nvidia telemetry
node.js runs a managed node.js sub-process;
- node.js pulls git code - eg `git pull master-d7d2e6a`
- node.js builds the code - eg `make clean && make -j`
- node.js runs llama.cpp - eg `time ./perplexity -m ./models/3B/open-llama-3b-q4_0.bin -f build/wiki.test.raw.406 -t 8`
- node.js sends `stdout` to 📂InfluxDB
- node.js sends llama.cpp process results to 📂CouchDB
- [optionally] node.js switches to a different branch - GOTO #1
- node.js sends "bench session end" signal to ⚙️conductor / 📂CouchDB / 📂InfluxDB
- node.js shuts down the VM
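The bench-runner workflow above can be sketched as a small step runner built on Node's `child_process.spawn`. This is an illustration under assumptions, not the project's implementation: `runStep`, `runSession`, the step object shape, and the `onLine` callback are hypothetical names. Each stdout line is handed to `onLine` with a timestamp, which is where forwarding to InfluxDB would happen.

```javascript
const { spawn } = require("node:child_process");

// Run one command and call onLine(timestampMs, text) for each stdout line.
// Resolves with the exit code (null if the process was killed by a signal).
function runStep(command, args, onLine, options = {}) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, {
      ...options,
      stdio: ["ignore", "pipe", "inherit"], // capture stdout, pass stderr through
    });
    let buffered = "";
    child.stdout.setEncoding("utf8");
    child.stdout.on("data", (chunk) => {
      buffered += chunk;
      const lines = buffered.split("\n");
      buffered = lines.pop(); // keep the trailing partial line for the next chunk
      for (const line of lines) onLine(Date.now(), line);
    });
    child.on("error", reject);
    child.on("close", (code) => {
      if (buffered) onLine(Date.now(), buffered); // flush the last unterminated line
      resolve(code);
    });
  });
}

// Run the steps in order and stop at the first one that fails.
async function runSession(steps, onLine) {
  for (const step of steps) {
    const code = await runStep(step.command, step.args, onLine, { cwd: step.cwd });
    if (code !== 0) return { ok: false, failedStep: step.name, code };
  }
  return { ok: true };
}
```

A session would then be an array such as `{ name: "build", command: "make", args: ["-j"] }` followed by a step for `./perplexity` with the arguments shown above. Because `spawn` does not go through a shell here, `make clean && make -j` would be split into two steps, and wall-clock time would be measured in node.js rather than with the shell's `time`.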
- env.sample
- az scripts to build ⚙️conductor
- az scripts to get Azure (a) regions (b) sub-regions (c) image-types...
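As a rough idea of what the `az` lookup scripts might wrap, the sketch below builds argument lists for a few Azure CLI queries and runs them through `execFile`. The helper names `azArgs` and `runAz` and the query keys are hypothetical, and mapping "regions" and "image-types" onto `az account list-locations`, `az vm list-sizes` and `az vm image list` is an assumption about what the scripts are meant to fetch.

```javascript
const { execFile } = require("node:child_process");

// Build the argument list for a few read-only Azure CLI queries.
// The query keys ("regions", "vm-sizes", "images") are made up for this sketch.
function azArgs(kind, location) {
  switch (kind) {
    case "regions":
      return ["account", "list-locations", "--output", "json"];
    case "vm-sizes":
      return ["vm", "list-sizes", "--location", location, "--output", "json"];
    case "images":
      return ["vm", "image", "list", "--location", location, "--output", "json"];
    default:
      throw new Error(`unknown query: ${kind}`);
  }
}

// Run `az` with the given arguments and parse its JSON output.
// Assumes the Azure CLI is installed and `az login` has already been done.
function runAz(args) {
  return new Promise((resolve, reject) => {
    execFile("az", args, { maxBuffer: 64 * 1024 * 1024 }, (err, stdout) => {
      if (err) return reject(err);
      try {
        resolve(JSON.parse(stdout));
      } catch (parseError) {
        reject(parseError);
      }
    });
  });
}
```

For example, `runAz(azArgs("vm-sizes", "westeurope"))` would resolve to the VM sizes offered in that region, which could then be filtered for GPU sizes. Keeping argument construction separate from execution makes the commands easy to log and review before anything billable is run.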