Follow the steps of llama_cpp_canister/README/Getting Started
- clone ggerganov/llama.cpp
- check out the proper commit used as the root of the onicai branch in llama_cpp_onicai_fork
- run these commands:
```bash
make clean
make LLAMA_DEBUG=1 llama-cli
```
- Then debug using this `.vscode/launch.json`:

```json
{
  // Use IntelliSense to learn about possible attributes.
  // Hover to view descriptions of existing attributes.
  // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
  "version": "0.2.0",
  "configurations": [
    {
      "type": "lldb",
      "request": "launch",
      "name": "llama-cli",
      "program": "${workspaceFolder}/llama-cli",
      "cwd": "${workspaceFolder}",
      "args": [
        "-m", "<PATH_TO>/llama_cpp_canister_models/stories260Ktok512.gguf",
        "--samplers", "top_p",
        "--temp", "0.1",
        "--top-p", "0.9",
        "-n", "600",
        "-p", "Joe loves writing stories"
      ]
    }
  ]
}
```
In GitHub, use Sync fork for the master branch of https://github.com/onicai/llama_cpp_onicai_fork
Take the following steps locally:
- `git fetch`
- Copy `src/llama_cpp_onicai_fork` to `<temp>/llama_cpp_onicai_fork_<git-sha>`.
  This is just for reference. We will remove this folder once the upgrade is done.
- From master, create a new branch: `onicai-<git-sha>`.
  For `<git-sha>`, use the short commit sha from which we're branching.
Unless something changed drastically in llama.cpp, it is sufficient to re-upgrade only the files listed in icpp.toml, plus their header files.
As you do your upgrade, update the descriptions below to help with the next upgrade:
We use `meld` for comparing the files:

```bash
meld main_.cpp llama_cpp_onicai_fork/examples/main/main.cpp
```
- use `main_` instead of `main`
- a few items related to console & ctrl+C need to be outcommented
- add `#include "ic_api.h"`
- replace `throw std::runtime_error(format` with `IC_API::trap(std::string("RUNTIME ERROR: ") + format`
- replace `throw` with `IC_API::trap`
- outcomment `try - catch`. The program will abort in case of thrown exceptions. (See the sketch after this list.)
- outcomment threading related items:
  - `#include <future>`
  - `#include <mutex>`
  - `#include <thread>`
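The `replace throw with IC_API::trap` and `outcomment try - catch` bullets recur for most files below. A minimal sketch of that pattern, assuming icpp-pro's `IC_API::trap` and using a hypothetical `format_msg` helper in place of llama.cpp's `format`:

```cpp
#include <string>

#include "ic_api.h" // icpp-pro API, provides IC_API::trap

// Hypothetical stand-in for llama.cpp's format() helper
static std::string format_msg(const char * what, const char * fname) {
    return std::string(what) + ": " + fname;
}

static void load_model(const char * fname, bool ok) {
    // Upstream: throw std::runtime_error(format("failed to load model %s", fname));
    // ICPP-PATCH: replace throw with IC_API::trap
    if (!ok) {
        IC_API::trap(std::string("RUNTIME ERROR: ") + format_msg("failed to load model", fname));
    }
}

static int run(const char * fname) {
    // Upstream wrapped this call in try/catch; on the IC the try/catch is outcommented,
    // so a trap simply aborts the canister call instead of being caught.
    // try {
    load_model(fname, /*ok=*/false);
    // } catch (const std::exception & err) {
    //     fprintf(stderr, "error: %s\n", err.what());
    //     return 1;
    // }
    return 0;
}
```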
- outcomment these functions completely:
  - `llama_tensor_quantize_internal`
  - `llama_model_quantize_internal`
- add `#include "ic_api.h"`
- replace `throw std::runtime_error(format` with `IC_API::trap(std::string("RUNTIME ERROR: ") + format`
- outcomment `try - catch`. The program will abort in case of thrown exceptions.
- add a check on `llama_token_bos(model)`, else the llama2.c models never stop generating:

```cpp
bool llama_token_is_eog_impl(const struct llama_vocab & vocab, llama_token token) {
    return token != -1 && (
        token == llama_token_eos_impl(vocab) ||
        token == llama_token_eot_impl(vocab) ||
        token == llama_token_bos_impl(vocab) // ICPP-PATCH: the llama2.c model predicts bos without first predicting an eos
    );
}
```
No changes needed
No changes needed
- no modifications needed for the IC
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- outcomment `try - catch`. The program will abort in case of thrown exceptions.
- run this command to create it: `make build-info-cpp-wasm` (see the sketch below)
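For reference, the generated `build-info.cpp` normally just defines the build metadata globals that llama.cpp prints at startup. A sketch of what the generated file may contain (illustrative values only; the exact contents depend on the llama.cpp commit and toolchain):

```cpp
// Illustrative sketch; the real file is produced by `make build-info-cpp-wasm`.
int          LLAMA_BUILD_NUMBER = 0;           // placeholder build number
char const * LLAMA_COMMIT       = "<git-sha>"; // short sha of the checked-out commit
char const * LLAMA_COMPILER     = "clang++";   // assumed compiler string
char const * LLAMA_BUILD_TARGET = "wasm32";    // assumed target string
```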
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- outcomment `try - catch`. The program will abort in case of thrown exceptions.
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- add `#include "ic_api.h"`
- replace `throw` with `IC_API::trap`
- outcomment all code related to `<pthread.h>`
- outcomment `try - catch`. The program will abort in case of thrown exceptions.
- outcomment `std::getenv`
- outcomment all code related to signals (`#include <signal.h>`)
- many threading outcomments (see the sketch after this list)
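A minimal sketch of the outcomment pattern described in the bullets above, using hypothetical upstream snippets (the canister runs single-threaded and sandboxed, so threading, `std::getenv`, and signal code is commented out rather than ported):

```cpp
// ICPP-PATCH: not available / not meaningful inside a canister
// #include <pthread.h>
// #include <signal.h>
// #include <thread>

static int get_thread_count() {
    // Upstream (hypothetical snippet) consulted the environment and the hardware:
    //   if (const char * env = std::getenv("LLAMA_NUM_THREADS")) { return std::atoi(env); }
    //   return (int) std::thread::hardware_concurrency();
    // ICPP-PATCH: std::getenv and std::thread usage outcommented; run single-threaded
    return 1;
}

// Upstream installed a ctrl+C handler so generation could be interrupted:
//   static void sigint_handler(int signo) { /* flush and exit */ }
//   signal(SIGINT, sigint_handler);
// ICPP-PATCH: signals do not exist on the IC; this code is outcommented.
```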
No updates needed for icpp-pro
No updates needed for icpp-pro
No updates needed for icpp-pro
No updates needed for icpp-pro
- `#include <thread>`
- some other threading code
- `#include <thread>`
Do NOT merge the onicai-<git-sha> branch into the onicai branch, but replace it:

```bash
git branch -m onicai onicai-<old-git-sha>
git branch -m onicai-<git-sha> onicai
git push origin onicai:onicai
git push origin onicai-<old-git-sha>:onicai-<old-git-sha>
```