
server : accept extra_context for the infill endpoint #9874

Merged 3 commits into master from gg/infill-1 on Oct 13, 2024
Conversation

ggerganov (Owner) commented on Oct 13, 2024

Pass additional (extra) context to the /infill endpoint:

curl \
    --silent --no-buffer --request POST \
    --url http://127.0.0.1:8012/infill \
    --header "Content-Type: application/json" \
    --data '{"extra_context": [{"filename": "llama.h", "text": "LLAMA_API int32_t llama_n_threads(struct llama_context * ctx);\n"}], "input_suffix": "}\n", "input_prefix": "#include <cstdio>\n#include \"llama.h\"\n\nint main() {\n    int n_threads = ", "prompt": ""}' | jq

...

{
    ...
    "content": "llama_n_threads(nullptr);\n    printf(\"Number of threads: %d\\n\", n_threads);\n    return 0;\n",
    ...
}

The "extra_context" field is an array of {"filename": string, "text": string} objects.

If the model has FIM_REPO and FIM_FILE_SEP tokens (shown below as <FIM_REP> and <FIM_SEP>), the repo-level pattern is used:

<FIM_REP>myproject
<FIM_SEP>{chunk 0 filename}
{chunk 0 text}
<FIM_SEP>{chunk 1 filename}
{chunk 1 text}
...
<FIM_SEP>{filename}
<FIM_PRE>[input_prefix]<FIM_SUF>[input_suffix]<FIM_MID>[prompt]
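As an illustration, a small Python sketch of how this repo-level prompt could be assembled (the token spellings and the build_repo_prompt helper are hypothetical; each model defines its own FIM special tokens, and the server uses the model's actual tokens):

# Hypothetical token spellings for illustration only.
FIM_REP, FIM_SEP = "<FIM_REP>", "<FIM_SEP>"
FIM_PRE, FIM_SUF, FIM_MID = "<FIM_PRE>", "<FIM_SUF>", "<FIM_MID>"

def build_repo_prompt(repo, extra_context, filename, input_prefix, input_suffix, prompt):
    # Follows the repo-level pattern shown above.
    parts = [FIM_REP + repo + "\n"]
    for chunk in extra_context:
        parts.append(FIM_SEP + chunk["filename"] + "\n" + chunk["text"])
    parts.append(FIM_SEP + filename + "\n")
    parts.append(FIM_PRE + input_prefix + FIM_SUF + input_suffix + FIM_MID + prompt)
    return "".join(parts)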

If the tokens are missing, then the extra context is simply prefixed at the start:

[extra_context]<FIM_PRE>[input_prefix]<FIM_SUF>[input_suffix]<FIM_MID>[prompt]

In this case, the elements of the "extra_context" array are concatenated by separating them with the string:

--- snippet ---
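A matching sketch for this fallback, with the same hypothetical token spellings as above (the exact whitespace around the separator is an assumption):

SNIPPET_SEP = "\n--- snippet ---\n"  # separator from the description; surrounding newlines are a guess
FIM_PRE, FIM_SUF, FIM_MID = "<FIM_PRE>", "<FIM_SUF>", "<FIM_MID>"  # hypothetical spellings

def build_fallback_prompt(extra_context, input_prefix, input_suffix, prompt):
    # Join the chunk texts and prefix them before the regular FIM pattern.
    extra = SNIPPET_SEP.join(chunk["text"] for chunk in extra_context)
    return extra + FIM_PRE + input_prefix + FIM_SUF + input_suffix + FIM_MID + prompt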

The extra context can be used to implement a ring-buffered context for FIM completion that can be efficiently reused via #9866.
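For example, an editor client might keep a bounded ring of recently touched snippets and resend it with every request, so that the prompt prefix stays stable across requests and can be reused server-side (a sketch; the ring size and chunking policy are entirely up to the client):

from collections import deque

ring = deque(maxlen=16)  # arbitrary capacity; oldest chunks are evicted automatically

def remember(filename, text):
    # Record a recently viewed or edited snippet.
    ring.append({"filename": filename, "text": text})

def infill_payload(input_prefix, input_suffix):
    # Reuse the ring as extra context; keeping the chunk order stable between
    # requests is what makes the cached reuse effective.
    return {
        "extra_context": list(ring),
        "input_prefix": input_prefix,
        "input_suffix": input_suffix,
        "prompt": "",
    }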

ggerganov merged commit d4c19c0 into master on Oct 13, 2024 (57 checks passed)
ggerganov deleted the gg/infill-1 branch on Oct 13, 2024
ggerganov mentioned this pull request on Oct 15, 2024
drollings pushed a commit to drollings/llama.cpp that referenced this pull request Oct 18, 2024
* server : accept extra_context for the infill endpoint

ggml-ci

* server : update readme [no ci]

* server : use repo-level FIM pattern if possible

ggml-ci
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
(same three commits as above)