rpc error: code = Unknown desc = unimplemented instead of chat completion #1946
Comments
Ouch, most likely the model is too big for the available GPU RAM, and there isn't a smaller profile available (yet). For the moment you can try to enforce a different profile by setting the corresponding environment variable on the container. Also, set the `DEBUG` environment variable to `true` to get more detailed logs.
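A hedged sketch of what that could look like, assuming the AIO image's `PROFILE` and `DEBUG` environment variables (the exact variable names were lost from the comment above, so verify them against the AIO image docs):

```bash
# Force a smaller profile and enable debug logging; the variable names here are
# assumptions - check https://localai.io/docs/reference/aio-images/ for the
# supported values.
docker run -p 8080:8080 --gpus all --name local-ai -ti \
  -e PROFILE=cpu \
  -e DEBUG=true \
  localai/localai:v2.11.0-aio-gpu-nvidia-cuda-12
```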
I can pull the images in https://localai.io/docs/reference/aio-images/, e.g.:
It is on purpose, to automatically map with existing tools that expect models named like the ones available in OpenAI. The AIO image is an opinionated image which pre-configures OSS models to appear like the proprietary ones; however, you are free to configure your LocalAI instance with any model you like: https://localai.io/docs/getting-started/run-other-models/
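As a sketch of the linked approach (the image tag and model name here are illustrative, not confirmed by this thread), the standard images can be started with a model from the model gallery passed as an argument:

```bash
# Run a non-AIO image with an explicit model; "phi-2" is an illustrative
# gallery model name, see the run-other-models docs for current ones.
docker run -p 8080:8080 --gpus all -ti \
  localai/localai:v2.11.0-cublas-cuda12-ffmpeg-core phi-2
```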
I have the same issue. Below is the debug log:
Related to #1909?
Hi @chino-lu, about your issue, the error is here:
It is trying to allocate more than 5 GB on a board with only 4 GB. Reduce the GPU offload by playing with the `gpu_layers` setting.
Thank you for providing the error details. To resolve this issue, you can indeed reduce the `gpu_layers` setting. For example, you can try reducing the value in the model configuration:

```json
{
  "model": "stablediffusion::stablediffusion-v1-5c7cd056ecf9a4bb5b527410b97f48cb",
  "settings": {
    ...
    "gpu_layers": 256,
    ...
  }
}
```

Remember to adjust this value based on your specific requirements and available GPU memory. This should help avoid the out-of-memory error.
I got it working, but not the way I would like. By moving from the AIO container to the normal one, the settings are no longer overwritten and it works (though not with 256; with 11 layers I am now on working settings, and I am not sure what the upper limit would be. I have to admit that other things are running on that graphics card as well).
@chino-lu the AIO image contains pre-configured, opinionated models. In the case of GPU, it defaults to automatically offloading everything to the GPU. To customize the model settings you can follow https://localai.io/docs/getting-started/customize-model/
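For reference, a minimal sketch of such a customization, assuming the YAML model-config format described in the linked docs (the file name, model file, and alias below are illustrative, not taken from this thread):

```yaml
# models/gpt-4.yaml - illustrative; see the customize-model docs for the full schema
name: gpt-4                        # alias the AIO image exposes for its default chat model
parameters:
  model: some-model.Q4_K_M.gguf    # hypothetical model file name
gpu_layers: 11                     # reduce until the model fits in VRAM (11 worked on 4 GB here)
```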
LocalAI version:
```
docker run -p 8080:8080 --gpus all --name local-ai -ti localai/localai:v2.11.0-aio-gpu-nvidia-cuda-12
```
Environment, CPU architecture, OS, and Version:
Linux rumpel 6.8.2-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 28 Mar 2024 17:06:35 +0000 x86_64 GNU/Linux
Describe the bug
I am trying to run the example, but instead of an answer I get an error:
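(The request and error text did not survive here; the quickstart "example" is presumably a chat-completion call along these lines, with `gpt-4` being the alias the AIO image pre-configures:)

```bash
# Quickstart-style chat completion request against the local AIO instance
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "How are you?"}]
  }'
```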
Expected behavior
An answer instead of an error.
Logs
In the docker console I see this:
Additional context
It seems like it can't load the model, but I have no idea why.
Some side notes: