GRPC Service Not Ready #2220
Hello, thank you for reporting this issue in LocalAI. It seems you're experiencing an error when accessing the /v1/chat/completions endpoint with the "rpc error: code = Unknown desc = unimplemented" message. Based on the provided information, it looks like the gRPC server starts up successfully, but the error occurs afterward.

First, it might be helpful to confirm that this issue is consistent by checking whether it happens with other endpoints as well, or whether it's specific to the /v1/chat/completions endpoint. To investigate this issue further, please provide the following details:

Once you provide this information, we'll be better equipped to troubleshoot and find a solution to the issue you're encountering.

Note: This response is from @mudler's experiment bot, an AI experiment by the owner of this project. Current labels: [bug unconfirmed] Tags added: [investigate, network issues, general troubleshooting, server setup]
I ensured that I am using the latest version.

3:13AM INF Trying to load the model '30f19017f38ab930fb78ec796b84f457' with all the available backends: llama-cpp, llama-ggml, gpt4all, bert-embeddings, rwkv, whisper, stablediffusion, tinydream, piper, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/transformers/run.sh, /build/backend/python/autogptq/run.sh, /build/backend/python/bark/run.sh, /build/backend/python/vall-e-x/run.sh, /build/backend/python/coqui/run.sh, /build/backend/python/rerankers/run.sh, /build/backend/python/diffusers/run.sh, /build/backend/python/parler-tts/run.sh, /build/backend/python/mamba/run.sh, /build/backend/python/exllama2/run.sh, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/vllm/run.sh, /build/backend/python/transformers-musicgen/run.sh, /build/backend/python/petals/run.sh, /build/backend/python/exllama/run.sh
I get the same errors for both Docker and the binary, across multiple models.
After updating to the latest version and retesting, the issue still persists. I am running this in an Unraid environment.

1:03PM INF [llama-cpp] Attempting to load
I still get this error in the latest version. Is it still an issue?
Can you please share the full log with debug logging enabled?
I was apparently missing AVX in that VM. Sorry for that! The error message really threw me off!
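For anyone hitting the same symptom: a quick way to confirm whether the guest CPU actually exposes AVX is to inspect /proc/cpuinfo. This is a sketch for Linux guests; note that a VM configured with a generic hypervisor CPU model can mask flags the host CPU supports:

```shell
# List the SIMD-related flags the (possibly virtualized) CPU exposes;
# a generic hypervisor CPU model such as qemu64 can hide the host's AVX.
grep -m1 -o '\bavx[a-z0-9_]*\|\bf16c\b\|\bfma\b' /proc/cpuinfo | sort -u

# Baseline AVX check:
grep -qw avx /proc/cpuinfo && echo "AVX present" || echo "AVX missing"
```

If AVX is missing inside the VM but present on the host, switching the VM's CPU model to "host passthrough" (or equivalent) usually restores the flags.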
Ouch, good point actually, as it made me review this closely. It seems I missed disabling AVX in the llama-cpp fallback. Going to add it so we get this sorted out once and for all =)
edit: #2306 seems already having a fix for it! |
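For context, a no-AVX llama.cpp fallback is typically produced by turning the AVX code paths off at build time. The sketch below is hypothetical and not taken from the linked PR; the exact flag names have changed across llama.cpp releases (LLAMA_AVX vs. GGML_AVX), so check the CMakeLists of the version you build:

```shell
# Hypothetical sketch: build llama.cpp with AVX paths disabled so the
# resulting binary runs on CPUs (or VMs) that lack AVX. Flag names vary
# by llama.cpp version; verify against its CMakeLists before using.
cmake -B build -DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF
cmake --build build --config Release
```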
Thanks for the fix! |
Einstein-v6.1-Llama3-8B-Q4_K_M.gguf
Environment, CPU architecture, OS, and Version:
Running on Unraid
Describe the bug
Encountered a server error with the message "rpc error: code = Unknown desc = unimplemented" when attempting to access the /v1/chat/completions endpoint. This happened despite the server and services appearing to be ready and operational as indicated by previous log entries.
To Reproduce
Start the LocalAI server with the following configuration settings: {list any specific configurations or settings used}
Send a POST request to /v1/chat/completions
Observe the error in the logs
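The reproduction request can be sketched as follows, assuming LocalAI is listening on its default port 8080 and using the model name from this report; adjust host, port, and model to your setup:

```shell
# Minimal reproduction request against the chat completions endpoint
# (assumes LocalAI on localhost:8080; the model name is the one from
# this issue and must match a model installed on your instance).
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Einstein-v6.1-Llama3-8B-Q4_K_M.gguf",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```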
Expected behavior
Expected the server to handle the POST request to /v1/chat/completions without errors, returning a successful response.
Logs
2:59AM DBG GRPC(Einstein-v6.1-Llama3-8B-Q4_K_M.gguf-127.0.0.1:44779): stderr 2024/05/02 02:59:49 gRPC Server listening at 127.0.0.1:44779
2:59AM DBG GRPC Service Ready
2:59AM DBG GRPC: Loading model with options: {lengthy configuration details here...}
2:59AM INF [stablediffusion] Loads OK
2:59AM ERR Server error error="rpc error: code = Unknown desc = unimplemented" ip=192.168.0.60 latency=1m2.252460241s method=POST status=500 url=/v1/chat/completions
2:59AM INF Success ip=127.0.0.1 latency="41.812µs" method=GET status=200 url=/readyz
Additional context
Running on an Unraid system, which might be relevant in terms of the operating system environment or specific configurations.