llama : adjust default context size + print warnings #10136

Merged
ggerganov merged 2 commits into master from gg/default-ctx on Nov 2, 2024


ggerganov (Owner) commented Nov 2, 2024

fix #8817, #9563 (comment)

By default, the examples will now use a context size of 4096 instead of the model's training context. In a lot of cases the training context is very big (32k to 128k tokens), which causes an enormous KV cache allocation and failures on regular hardware.

Also, add warning logs when the specified context size per sequence does not match the training context.

ngxson (Collaborator) left a comment

Thanks! This should prevent me from burning my swapfile whenever I forget to specify -c

Tested and it shows the log too:

> ./llama-cli -m ../models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf -cnv -p "You are a helpful assistant"
...
llama_new_context_with_model: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
...
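
The check behind this log line can be sketched roughly as follows. This is a minimal illustration in Python, not the actual llama.cpp code: `context_size_warning` is a hypothetical helper, the per-sequence context is assumed to be the total context divided by the number of parallel sequences, and the wording for the "larger than training context" case is invented here because the thread only shows the "smaller" message.

```python
from typing import Optional


def context_size_warning(n_ctx: int, n_seq_max: int, n_ctx_train: int) -> Optional[str]:
    """Return a warning string when the per-sequence context differs from
    the training context, or None when they match.

    Hypothetical helper for illustration only.
    """
    # Assumption: each sequence gets an equal share of the total context.
    n_ctx_per_seq = n_ctx // n_seq_max

    if n_ctx_per_seq < n_ctx_train:
        # Wording copied from the log line quoted in the thread.
        return (
            f"n_ctx_per_seq ({n_ctx_per_seq}) < n_ctx_train ({n_ctx_train}) "
            "-- the full capacity of the model will not be utilized"
        )
    if n_ctx_per_seq > n_ctx_train:
        # Wording assumed; the thread does not show this case.
        return (
            f"n_ctx_per_seq ({n_ctx_per_seq}) > n_ctx_train ({n_ctx_train}) "
            "-- context exceeds the training context (wording assumed)"
        )
    return None


print(context_size_warning(4096, 1, 131072))
```

With the new default of 4096 and a model trained on 131072 tokens, the sketch produces the same message as the log above.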

@github-actions github-actions bot added the devops improvements to build systems and github actions label Nov 2, 2024
ggerganov (Owner, Author) commented

Is 4096 a good value, or should we go lower?

ngxson (Collaborator) commented Nov 2, 2024

According to HF hub statistics, the most used model nowadays is Llama 3 (3.1, 3.2) 8B.

With a context size of 4096, the KV cache takes around 512MB, which I think is a very reasonable amount.
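
The 512MB figure can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes the Llama 3.1 8B shape (32 layers, 8 KV heads, head dimension 128) and an f16 KV cache. None of these values are stated in the thread, and `kv_cache_bytes` is a hypothetical helper, not a llama.cpp function.

```python
MIB = 1024 ** 2


def kv_cache_bytes(
    n_ctx: int,
    n_layer: int = 32,        # assumed: Llama 3.1 8B layer count
    n_head_kv: int = 8,       # assumed: grouped-query attention KV heads
    head_dim: int = 128,      # assumed: per-head dimension
    bytes_per_elem: int = 2,  # assumed: f16 cache
) -> int:
    """Estimate KV cache size in bytes for a given context length."""
    # The factor 2 covers the separate K and V tensors; each holds
    # n_head_kv * head_dim elements per layer per token.
    per_token = 2 * n_layer * n_head_kv * head_dim * bytes_per_elem
    return n_ctx * per_token


print(kv_cache_bytes(4096) // MIB)    # 512
print(kv_cache_bytes(131072) // MIB)  # 16384
```

Under the same assumptions, the full 131072-token training context would need 16 GiB for the KV cache alone, which is the kind of allocation the new default avoids.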

@ggerganov ggerganov merged commit 1926d6e into master Nov 2, 2024
60 checks passed
@ggerganov ggerganov deleted the gg/default-ctx branch November 2, 2024 13:18
3 participants