
GRPC Service Not Ready #2220

Closed
jonty-esterhuizen opened this issue May 2, 2024 · 10 comments · Fixed by #2232
Labels
bug (Something isn't working), unconfirmed

Comments

@jonty-esterhuizen

Einstein-v6.1-Llama3-8B-Q4_K_M.gguf

Environment, CPU architecture, OS, and Version:

Running on Unraid

Model: Custom
M/B: Intel Corporation S2600CP Version E99552-510 s/n QSCP34600258
CPU: Intel® Xeon® CPU E5-2630 0 @ 2.30GHz
HVM: Enabled
IOMMU: Enabled
Cache: L1-Cache: 384 KiB, L2-Cache: 1536 KiB, L3-Cache: 15 MiB
Memory: 44 GiB DDR3 Multi-bit ECC (max. installable capacity 512 GiB)
Network: bond0: fault-tolerance (active-backup), mtu 1500

Describe the bug

Encountered a server error with the message "rpc error: code = Unknown desc = unimplemented" when attempting to access the /v1/chat/completions endpoint. This happened even though the server and services appeared ready and operational according to the preceding log entries.

To Reproduce

Start the LocalAI server with the following configuration settings: {list any specific configurations or settings used}
Send a POST request to /v1/chat/completions
Observe the error in the logs
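
To make step 2 concrete, a request along these lines reproduces the error (the host, port, and model name here are assumptions based on the logs above; LocalAI listens on port 8080 by default):

```shell
# Hypothetical reproduction request; adjust host/port and model name
# to match your deployment.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Einstein-v6.1-Llama3-8B-Q4_K_M.gguf",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```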
Expected behavior

Expected the server to handle the POST request to /v1/chat/completions without errors, returning a successful response.

Logs

2:59AM DBG GRPC(Einstein-v6.1-Llama3-8B-Q4_K_M.gguf-127.0.0.1:44779): stderr 2024/05/02 02:59:49 gRPC Server listening at 127.0.0.1:44779
2:59AM DBG GRPC Service Ready
2:59AM DBG GRPC: Loading model with options: {lengthy configuration details here...}
2:59AM INF [stablediffusion] Loads OK
2:59AM ERR Server error error="rpc error: code = Unknown desc = unimplemented" ip=192.168.0.60 latency=1m2.252460241s method=POST status=500 url=/v1/chat/completions
2:59AM INF Success ip=127.0.0.1 latency="41.812µs" method=GET status=200 url=/readyz
Additional context

Running on an Unraid system, which may be relevant in terms of the operating system environment or specific configuration.

@jonty-esterhuizen added the bug (Something isn't working) and unconfirmed labels on May 2, 2024
@localai-bot
Contributor

Hello, thank you for reporting this issue in LocalAI. It seems you're experiencing an error when accessing the /v1/chat/completions endpoint with the "rpc error: code = Unknown desc = unimplemented" message. Based on the provided information, it looks like the gRPC server starts up successfully, but the error occurs afterward.

First, it might be helpful to confirm that this issue is consistent by checking if it happens with other endpoints as well, or if it's specific to the /v1/chat/completions endpoint.

In order to investigate this issue further, please provide the following details:

  1. Ensure that you are using the latest version of LocalAI by checking the version in the Docker image you're using or the git commit hash if you're running from source.
  2. Verify that the issue persists after restarting the LocalAI server and Unraid system.
  3. Check the configurations and settings you're using for the LocalAI server and ensure they align with the recommended setup.
  4. Ensure that there aren't any network issues between the client and server that could be causing the problem.

Once you provide this information, we'll be better equipped to troubleshoot and provide a solution to the issue you're encountering.

Note: This response is from @mudler's experiment bot, an AI experiment by the owner of this project.


Current labels: [bug unconfirmed]

Tags added: [investigate, network issues, general troubleshooting, server setup]

@jonty-esterhuizen
Author

I ensured that I am using the latest version
I restarted the whole server
I checked the configuration and network

3:13AM INF Trying to load the model '30f19017f38ab930fb78ec796b84f457' with all the available backends: llama-cpp, llama-ggml, gpt4all, bert-embeddings, rwkv, whisper, stablediffusion, tinydream, piper, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/transformers/run.sh, /build/backend/python/autogptq/run.sh, /build/backend/python/bark/run.sh, /build/backend/python/vall-e-x/run.sh, /build/backend/python/coqui/run.sh, /build/backend/python/rerankers/run.sh, /build/backend/python/diffusers/run.sh, /build/backend/python/parler-tts/run.sh, /build/backend/python/mamba/run.sh, /build/backend/python/exllama2/run.sh, /build/backend/python/sentencetransformers/run.sh, /build/backend/python/vllm/run.sh, /build/backend/python/transformers-musicgen/run.sh, /build/backend/python/petals/run.sh, /build/backend/python/exllama/run.sh
3:13AM INF [llama-cpp] Attempting to load
3:13AM INF Loading model '30f19017f38ab930fb78ec796b84f457' with backend llama-cpp
3:13AM DBG Loading model in memory from file: /build/models/30f19017f38ab930fb78ec796b84f457
3:13AM DBG Loading Model 30f19017f38ab930fb78ec796b84f457 with gRPC (file: /build/models/30f19017f38ab930fb78ec796b84f457) (backend: llama-cpp): {backendString:llama-cpp model:30f19017f38ab930fb78ec796b84f457 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000395200 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
3:13AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp
3:13AM DBG GRPC Service for 30f19017f38ab930fb78ec796b84f457 will be running at: '127.0.0.1:37041'
3:13AM DBG GRPC Service state dir: /tmp/go-processmanager2553461172
3:13AM DBG GRPC Service Started
3:13AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:37041: connect: connection refused""
3:13AM DBG GRPC Service NOT ready

@paulczar

paulczar commented May 3, 2024

I get the same errors for both Docker and the binary for multiple models.

@jonty-esterhuizen
Author

After updating to the latest version and retesting, the issue still persists.

I am running this in an Unraid environment

1:03PM INF [llama-cpp] Attempting to load
1:03PM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend llama-cpp
1:03PM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:40711: connect: connection refused""
1:04PM INF [llama-cpp] Fails: grpc service not ready
1:04PM INF [llama-ggml] Attempting to load
1:04PM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend llama-ggml
1:04PM INF [llama-ggml] Fails: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
1:04PM INF [gpt4all] Attempting to load
1:04PM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend gpt4all
1:04PM INF [gpt4all] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
1:04PM INF [bert-embeddings] Attempting to load
1:04PM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend bert-embeddings
1:04PM INF [bert-embeddings] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
1:04PM INF [rwkv] Attempting to load
1:04PM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend rwkv
1:04PM INF [rwkv] Fails: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
1:04PM INF [whisper] Attempting to load

@maxi1134

I still get this error in the latest version.

Is it still an issue?

@mudler
Owner

mudler commented May 10, 2024

Can you please share the full log with DEBUG=true in the environment variables?
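
For context, enabling debug logging is a matter of setting the environment variable when starting the container (a hypothetical invocation; the image tag, port, and volume path are assumptions and should match your setup):

```shell
# Run LocalAI with debug logging enabled; adjust paths/tags as needed.
docker run -e DEBUG=true -p 8080:8080 \
  -v "$PWD/models:/build/models" \
  localai/localai:latest
```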

@maxi1134

> can you please share the full log with DEBUG=true in the environment variables?

I was apparently missing AVX in that VM; Sorry for that! The error message really threw me off!

@mudler
Owner

mudler commented May 13, 2024

> > can you please share the full log with DEBUG=true in the environment variables?
>
> I was apparently missing AVX in that VM; Sorry for that! The error message really threw me off!

ouch - good point actually, as it made me review this closely. Seems I missed disabling AVX in the llama-cpp fallback. Going to add it so we should get this sorted out once and for all =)
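
For anyone else hitting the same silent crash: a quick way to check whether a host or VM exposes AVX is to inspect the CPU flags on Linux. This is a minimal sketch (the `check_avx` helper name is made up for illustration):

```shell
# check_avx: print the unique AVX-family flags found in a cpuinfo-style
# file. Reads /proc/cpuinfo by default.
check_avx() {
  grep -o 'avx[a-z0-9_]*' "${1:-/proc/cpuinfo}" | sort -u
}

# If this prints nothing, llama-cpp builds compiled with AVX enabled will
# crash at model load, surfacing as the "connection refused" gRPC error.
check_avx || echo "no AVX flags found"
```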

@mudler
Owner

mudler commented May 13, 2024

edit: #2306 seems to already have a fix for it!

@maxi1134

Thanks for the fix!
