Unable to use BAAI/bge-reranker-base model for reranking #2577

Open
shizidushu opened this issue Jun 16, 2024 · 0 comments
Labels: bug (Something isn't working), unconfirmed

Comments

@shizidushu

LocalAI version:
localai/localai:v2.16.0-cublas-cuda12

Environment, CPU architecture, OS, and Version:
Linux LAPTOP-LENOVO 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Describe the bug
With the configuration below, LocalAI tries to load mixedbread-ai/mxbai-rerank-base-v1 instead of BAAI/bge-reranker-base.

To Reproduce
Use the following model YAML:

name: bge-reranker-base
backend: rerankers

parameters:
  model: cross-encoder
  pipeline_type: zh
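
The exact request I sent is not shown above, but something along these lines exercises the /v1/rerank endpoint that appears in the error log below (the payload shape follows the Jina-style rerank API that LocalAI exposes, and the field names here should be treated as an assumption, not as copied from this issue):

```python
# Hypothetical request against LocalAI's /v1/rerank endpoint (endpoint and
# port 8080 are taken from the logs below; the payload fields are assumed
# from the Jina-style rerank API).
import requests

resp = requests.post(
    "http://localhost:8080/v1/rerank",
    json={
        "model": "bge-reranker-base",  # the model name from the YAML above
        "query": "什么是大熊猫?",  # "What is a giant panda?"
        "documents": [
            "大熊猫是生活在中国的一种熊科动物。",  # about pandas
            "巴黎是法国的首都。",  # about Paris (irrelevant)
        ],
        "top_n": 1,
    },
    timeout=120,
)
print(resp.status_code)
print(resp.json())
```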

After reading https://github.com/AnswerDotAI/rerankers/blob/e53b2714a935937561c9045326ed19bfa5082129/rerankers/reranker.py#L12-L16 and the LocalAI backend line

kwargs['lang'] = request.PipelineType

I expected that setting pipeline_type: zh would cause BAAI/bge-reranker-base to be used. But mixedbread-ai/mxbai-rerank-base-v1 is loaded instead. (Note that in the gRPC load options in the logs below, PipelineType: is empty.)
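
For reference, this is the behaviour I expected from the rerankers library itself (a minimal sketch; the exact lang-to-default-model mapping should be verified against the pinned reranker.py commit linked above):

```python
# Minimal sketch of how the rerankers library resolves a default model from
# a model-type name plus a language hint. Per the linked reranker.py lines,
# "cross-encoder" with lang="zh" should resolve to BAAI/bge-reranker-base,
# while the default lang="en" resolves to mixedbread-ai/mxbai-rerank-base-v1
# (the model LocalAI actually tried to load).
from rerankers import Reranker

ranker = Reranker("cross-encoder", lang="zh")
results = ranker.rank(
    query="什么是大熊猫?",
    docs=["大熊猫是生活在中国的一种熊科动物。", "巴黎是法国的首都。"],
)
print(results.top_k(1))
```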

Logs
Here is the output from docker compose up:

localai-api-1  | 8:48AM DBG Extracting backend assets files to /tmp/localai/backend_data
localai-api-1  | 8:48AM DBG processing api keys runtime update
localai-api-1  | 8:48AM DBG processing external_backends.json
localai-api-1  | 8:48AM DBG external backends loaded from external_backends.json
localai-api-1  | 8:48AM INF core/startup process completed!
localai-api-1  | 8:48AM DBG No configuration file found at /tmp/localai/upload/uploadedFiles.json
localai-api-1  | 8:48AM DBG No configuration file found at /tmp/localai/config/assistants.json
localai-api-1  | 8:48AM DBG No configuration file found at /tmp/localai/config/assistantsFile.json
localai-api-1  | 8:48AM INF LocalAI API is listening! Please connect to the endpoint for API documentation. endpoint=http://0.0.0.0:8080
localai-api-1  | 8:48AM DBG Request for model: cross-encoder
localai-api-1  | 8:48AM INF Loading model 'cross-encoder' with backend rerankers
localai-api-1  | 8:48AM DBG Loading model in memory from file: /models/cross-encoder
localai-api-1  | 8:48AM DBG Loading Model cross-encoder with gRPC (file: /models/cross-encoder) (backend: rerankers): {backendString:rerankers model:cross-encoder threads:0 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002006c8 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
localai-api-1  | 8:48AM DBG Loading external backend: /build/backend/python/rerankers/run.sh
localai-api-1  | 8:48AM DBG Loading GRPC Process: /build/backend/python/rerankers/run.sh
localai-api-1  | 8:48AM DBG GRPC Service for cross-encoder will be running at: '127.0.0.1:35699'
localai-api-1  | 8:48AM DBG GRPC Service state dir: /tmp/go-processmanager1405612238
localai-api-1  | 8:48AM DBG GRPC Service Started
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stdout Initializing libbackend for build
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stdout virtualenv activated
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stdout activated virtualenv has been ensured
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stderr /build/backend/python/rerankers/venv/lib/python3.10/site-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stderr   warnings.warn(
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stderr Server started. Listening on: 127.0.0.1:35699
localai-api-1  | 8:48AM DBG GRPC Service Ready
localai-api-1  | 8:48AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:cross-encoder ContextSize:512 Seed:2117973773 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/cross-encoder Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stderr /build/backend/python/rerankers/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
localai-api-1  | 8:48AM DBG GRPC(cross-encoder-127.0.0.1:35699): stderr   warnings.warn(
localai-api-1  | 8:48AM ERR Server error error="could not load model (no success): Unexpected err=OSError(\"We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like mixedbread-ai/mxbai-rerank-base-v1 is not the path to a directory containing a file named config.json.\\nCheckout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.\"), type(err)=<class 'OSError'>" ip=172.23.0.1 latency=22.039144583s method=POST status=500 url=/v1/rerank
localai-api-1  | 8:49AM INF Success ip=127.0.0.1 latency="31.413µs" method=GET status=200 url=/readyz

Additional context

@shizidushu added the bug (Something isn't working) and unconfirmed labels on Jun 16, 2024