add VDMS retriever microservice for v0.9 Milestone #539

Merged

28 commits
- 3d242b8 add VDMS retriever microservice (s-gobriel, Aug 21, 2024)
- 5840706 add retrieval gateway and logger back to init (s-gobriel, Aug 21, 2024)
- b1b7f4c [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Aug 21, 2024)
- f01ad04 Merge branch 'main' into sameh-vdms-retriever (chensuyue, Aug 21, 2024)
- b690ea0 use 5009 in CI (BaoHuiling, Aug 22, 2024)
- 3f8b23f Merge branch 'main' into sameh-vdms-retriever (BaoHuiling, Aug 22, 2024)
- 9991706 change index_name to collection_name (s-gobriel, Aug 26, 2024)
- 287d75a fix var name (BaoHuiling, Aug 27, 2024)
- 023a074 Merge branch 'main' into sameh-vdms-retriever (BaoHuiling, Aug 27, 2024)
- dbbb415 use index name all (BaoHuiling, Aug 27, 2024)
- 77b7b10 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Aug 27, 2024)
- e83541b add deps (BaoHuiling, Aug 27, 2024)
- 2118078 Merge branch 'main' into sameh-vdms-retriever (chensuyue, Aug 29, 2024)
- 0501be5 changes to address code reviews (s-gobriel, Aug 30, 2024)
- 0f2262f Merge branch 'main' into sameh-vdms-retriever (s-gobriel, Aug 30, 2024)
- 838d1bd [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Aug 30, 2024)
- 7586d75 Merge branch 'main' into sameh-vdms-retriever (s-gobriel, Aug 30, 2024)
- 72ce08c resolve docarray (s-gobriel, Aug 30, 2024)
- 7b727f3 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Aug 30, 2024)
- d45e5a1 add optional docarray embeddoc constraints (s-gobriel, Aug 31, 2024)
- f834685 Merge remote-tracking branch 'origin/sameh-vdms-retriever' into sameh… (s-gobriel, Aug 31, 2024)
- 34dd041 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Aug 31, 2024)
- c7fb322 fix bug in comment (BaoHuiling, Sep 3, 2024)
- 2c89e26 Merge branch 'main' into sameh-vdms-retriever (BaoHuiling, Sep 3, 2024)
- 45079a2 import DEBUG (BaoHuiling, Sep 3, 2024)
- 5dfc745 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Sep 3, 2024)
- f845e6c Merge branch 'main' into sameh-vdms-retriever (XuhuiRen, Sep 3, 2024)
- 4ad93dc Merge branch 'main' into sameh-vdms-retriever (s-gobriel, Sep 4, 2024)
1 change: 1 addition & 0 deletions comps/__init__.py
@@ -12,6 +12,7 @@
GeneratedDoc,
LLMParamsDoc,
SearchedDoc,
SearchedMultimodalDoc,
RerankedDoc,
TextDoc,
RAGASParams,
21 changes: 20 additions & 1 deletion comps/cores/proto/docarray.py
@@ -1,7 +1,7 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from typing import Dict, List, Optional, Union
from typing import Dict, List, Optional, Tuple, Union

import numpy as np
from docarray import BaseDoc, DocList
@@ -20,6 +20,14 @@ class TextDoc(BaseDoc, TopologyInfo):
text: str


class ImageDoc(BaseDoc):
image_path: str


class TextImageDoc(BaseDoc):
doc: Tuple[Union[TextDoc, ImageDoc]]


class Base64ByteStrDoc(BaseDoc):
byte_str: str

@@ -41,6 +49,7 @@ class EmbedDoc(BaseDoc):
fetch_k: int = 20
lambda_mult: float = 0.5
score_threshold: float = 0.2
constraints: dict = None


class Audio2TextDoc(AudioDoc):
@@ -67,6 +76,16 @@ class Config:
json_encoders = {np.ndarray: lambda x: x.tolist()}


class SearchedMultimodalDoc(BaseDoc):
retrieved_docs: List[TextImageDoc]
initial_query: str
top_n: int = 1
metadata: Optional[List[Dict]] = None

class Config:
json_encoders = {np.ndarray: lambda x: x.tolist()}


class GeneratedDoc(BaseDoc):
text: str
prompt: str
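For orientation, the new document types added above describe payloads of roughly the following shape. This is a stdlib sketch using dataclasses; the real classes derive from docarray's `BaseDoc`, which adds validation and richer serialization:

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Optional
import json

@dataclass
class TextImage:
    # Loosely mirrors TextImageDoc: a text chunk and/or image path retrieved together.
    text: Optional[str] = None
    image_path: Optional[str] = None

@dataclass
class SearchedMultimodal:
    # Loosely mirrors SearchedMultimodalDoc: retrieval results plus the original query.
    retrieved_docs: List[TextImage]
    initial_query: str
    top_n: int = 1
    metadata: Optional[List[Dict]] = None

result = SearchedMultimodal(
    retrieved_docs=[TextImage(text="Deep learning is...", image_path="img/dl.png")],
    initial_query="What is Deep Learning?",
)
print(json.dumps(asdict(result)))
```

The `json_encoders` entry in the real `Config` serves a similar purpose for numpy arrays, converting them to plain lists before JSON serialization.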
10 changes: 7 additions & 3 deletions comps/retrievers/langchain/README.md
@@ -6,14 +6,18 @@ The service primarily utilizes similarity measures in vector space to rapidly retrieve contextually similar documents.

Overall, this microservice provides robust backend support for applications requiring efficient similarity searches, playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where precise measurement of document similarity is crucial.

## Retriever Microservice with Redis
# Retriever Microservice with Redis

For details, please refer to this [readme](redis/README.md)

## Retriever Microservice with Milvus
# Retriever Microservice with Milvus

For details, please refer to this [readme](milvus/README.md)

## Retriever Microservice with PGVector
# Retriever Microservice with PGVector

For details, please refer to this [readme](pgvector/README.md)

# Retriever Microservice with VDMS

For details, please refer to this [readme](vdms/README.md)
152 changes: 152 additions & 0 deletions comps/retrievers/langchain/vdms/README.md
@@ -0,0 +1,152 @@
# Retriever Microservice

This retriever microservice is a highly efficient search service designed for handling and retrieving embedding vectors. It operates by receiving an embedding vector as input and conducting a similarity search against vectors stored in a VectorDB database. Users must specify the VectorDB's host, port, and the index/collection name, and the service searches within that index to find documents with the highest similarity to the input vector.

The service primarily utilizes similarity measures in vector space to rapidly retrieve contextually similar documents. The vector-based retrieval approach is particularly suited for handling large datasets, offering fast and accurate search results that significantly enhance the efficiency and quality of information retrieval.

Overall, this microservice provides robust backend support for applications requiring efficient similarity searches, playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where precise measurement of document similarity is crucial.
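The similarity search at the core of the service can be sketched, at toy scale, as scoring stored vectors against the input vector and returning the top matches. VDMS and LangChain handle indexing and scale in the real service; the data below is purely illustrative:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, store, k=4):
    # store: list of (doc_text, embedding) pairs; return the k most similar docs.
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]

store = [
    ("Deep learning is a subset of machine learning.", [0.9, 0.1, 0.0]),
    ("Nike's 2023 revenue grew.", [0.0, 0.2, 0.9]),
    ("Neural networks learn representations.", [0.8, 0.2, 0.1]),
]
print(retrieve([1.0, 0.0, 0.0], store, k=2))
```

In production the embeddings come from the TEI service described below, and the vectors live in the VDMS database rather than in memory.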

# 🚀1. Start Microservice with Python (Option 1)

To start the retriever microservice, you must first install the required Python packages.

## 1.1 Install Requirements

```bash
pip install -r requirements.txt
```

## 1.2 Start TEI Service

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=${your_langchain_api_key}
export LANGCHAIN_PROJECT="opea/retriever"
model=BAAI/bge-base-en-v1.5
revision=refs/pr/4
volume=$PWD/data
docker run -d -p 6060:80 -v $volume:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.2 --model-id $model --revision $revision
```

## 1.3 Verify the TEI Service

The `/embed` endpoint returns the embedding vector for the input text:

```bash
curl 127.0.0.1:6060/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```

## 1.4 Setup VectorDB Service

You need to set up your own VectorDB service (VDMS in this example) and ingest your knowledge documents into the vector database.

For VDMS, you can start a Docker container with the following command. Remember to ingest data into it manually.

```bash
docker run -d --name="vdms-vector-db" -p 55555:55555 intellabs/vdms:latest
```

## 1.5 Start Retriever Service

```bash
export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
python langchain/retriever_vdms.py
```

# 🚀2. Start Microservice with Docker (Option 2)

## 2.1 Setup Environment Variables

```bash
export RETRIEVE_MODEL_ID="BAAI/bge-base-en-v1.5"
export COLLECTION_NAME=${your_index_name}
export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=${your_langchain_api_key}
export LANGCHAIN_PROJECT="opea/retrievers"
```

## 2.2 Build Docker Image

```bash
cd ../../
docker build -t opea/retriever-vdms:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/langchain/vdms/docker/Dockerfile .
```

To start a docker container, you have two options:

- A. Run Docker with CLI
- B. Run Docker with Docker Compose

You can choose one as needed.

## 2.3 Run Docker with CLI (Option A)

```bash
docker run -d --name="retriever-vdms-server" -p 7000:7000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e INDEX_NAME=$COLLECTION_NAME -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT opea/retriever-vdms:latest
```

## 2.4 Run Docker with Docker Compose (Option B)

```bash
cd langchain/vdms/docker
docker compose -f docker_compose_retriever.yaml up -d
```

# 🚀3. Consume Retriever Service

## 3.1 Check Service Status

```bash
curl http://localhost:7000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

## 3.2 Consume Retriever Service

To consume the Retriever Microservice, you can generate a mock embedding vector of length 768 with Python.

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${your_ip}:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
```

You can also set search parameters for the retriever:

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity\", \"k\":4}" \
-H 'Content-Type: application/json'
```

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_distance_threshold\", \"k\":4, \"distance_threshold\":1.0}" \
-H 'Content-Type: application/json'
```

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_score_threshold\", \"k\":4, \"score_threshold\":0.2}" \
-H 'Content-Type: application/json'
```

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"mmr\", \"k\":4, \"fetch_k\":20, \"lambda_mult\":0.5}" \
-H 'Content-Type: application/json'
```
2 changes: 2 additions & 0 deletions comps/retrievers/langchain/vdms/__init__.py
@@ -0,0 +1,2 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
49 changes: 49 additions & 0 deletions comps/retrievers/langchain/vdms/docker/Dockerfile
@@ -0,0 +1,49 @@

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM langchain/langchain:latest

ARG ARCH="cpu"

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
libgl1-mesa-glx \
libjemalloc-dev \
iputils-ping \
vim

RUN useradd -m -s /bin/bash user && \
mkdir -p /home/user && \
chown -R user /home/user/

COPY comps /home/user/comps

# RUN chmod +x /home/user/comps/retrievers/langchain/vdms/run.sh

USER user
RUN pip install --no-cache-dir --upgrade pip && \
if [ ${ARCH} = "cpu" ]; then pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu; fi && \
pip install --no-cache-dir -r /home/user/comps/retrievers/langchain/vdms/requirements.txt

RUN pip install -U langchain
RUN pip install -U langchain-community

RUN pip install --upgrade huggingface-hub

ENV PYTHONPATH=$PYTHONPATH:/home/user

ENV HUGGINGFACEHUB_API_TOKEN=dummy

ENV USECLIP 0

ENV no_proxy=localhost,127.0.0.1

ENV http_proxy=""
ENV https_proxy=""

WORKDIR /home/user/comps/retrievers/langchain/vdms

#ENTRYPOINT ["/home/user/comps/retrievers/langchain/vdms/run.sh"]
#ENTRYPOINT ["/bin/bash"]

ENTRYPOINT ["python", "retriever_vdms.py"]
32 changes: 32 additions & 0 deletions comps/retrievers/langchain/vdms/docker/docker_compose_retriever.yaml
@@ -0,0 +1,32 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

version: "3.8"

services:
tei_xeon_service:
image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.2
container_name: tei-xeon-server
ports:
- "6060:80"
volumes:
- "./data:/data"
shm_size: 1g
command: --model-id ${RETRIEVE_MODEL_ID}
retriever:
image: opea/retriever-vdms:latest
container_name: retriever-vdms-server
ports:
- "7000:7000"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
COLLECTION_NAME: ${COLLECTION_NAME}
LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
restart: unless-stopped

networks:
default:
driver: bridge