From a8a46bc1024cff1d48978903c367eb76a325b259 Mon Sep 17 00:00:00 2001
From: David Kinder
Date: Thu, 5 Sep 2024 23:09:27 -0400
Subject: [PATCH] doc: fix heading levels in markdown content (#627)

* only one H1 (#) heading for the title is allowed, so fix the extra H1 headings (and the subheadings under those) to appropriate levels
* fix some inline code blocks containing leading/trailing spaces
* fix some indenting issues under an ordered list item

Signed-off-by: David B. Kinder
---
 comps/agent/langchain/README.md | 20 ++++----
 comps/cores/telemetry/README.md | 2 +-
 comps/dataprep/redis/README.md | 2 +-
 .../redis/multimodal_langchain/README.md | 48 +++++++++----------
 comps/embeddings/neural-speed/README.md | 12 +++--
 comps/feedback_management/mongo/README.md | 12 ++---
 comps/finetuning/README.md | 30 ++++++------
 comps/guardrails/toxicity_detection/README.md | 20 ++++----
 comps/intent_detection/README.md | 26 +++++-----
 comps/llms/text-generation/ollama/README.md | 4 +-
 .../text-generation/vllm-openvino/README.md | 14 ++++--
 comps/lvms/video-llama/README.md | 10 ++--
 comps/prompt_registry/mongo/README.md | 12 ++---
 comps/reranks/video-rag-qna/README.md | 10 ++--
 comps/retrievers/langchain/README.md | 8 ++--
 comps/retrievers/langchain/vdms/README.md | 42 ++++++++--------
 comps/vectorstores/README.md | 12 ++---
 17 files changed, 145 insertions(+), 139 deletions(-)

diff --git a/comps/agent/langchain/README.md b/comps/agent/langchain/README.md
index bc694cdf1..a411ddc45 100644
--- a/comps/agent/langchain/README.md
+++ b/comps/agent/langchain/README.md
@@ -33,34 +33,34 @@ The tools are registered with a yaml file. We support the following types of too
 Currently we have implemented OpenAI chat completion compatible API for agents. We are working to support OpenAI assistants APIs.

-# 🚀2. Start Agent Microservice
+## 🚀2. Start Agent Microservice

-## 2.1 Option 1: with Python
+### 2.1 Option 1: with Python

-### 2.1.1 Install Requirements
+#### 2.1.1 Install Requirements

 ```bash
 cd comps/agent/langchain/
 pip install -r requirements.txt
 ```

-### 2.1.2 Start Microservice with Python Script
+#### 2.1.2 Start Microservice with Python Script

 ```bash
 cd comps/agent/langchain/
 python agent.py
 ```

-## 2.2 Option 2. Start Microservice with Docker
+### 2.2 Option 2. Start Microservice with Docker

-### 2.2.1 Build Microservices
+#### 2.2.1 Build Microservices

 ```bash
 cd GenAIComps/ # back to GenAIComps/ folder
 docker build -t opea/comps-agent-langchain:latest -f comps/agent/langchain/docker/Dockerfile .
 ```

-### 2.2.2 Start microservices
+#### 2.2.2 Start microservices

 ```bash
 export ip_address=$(hostname -I | awk '{print $1}')
@@ -87,7 +87,7 @@ docker logs comps-langchain-agent-endpoint
 > docker run --rm --runtime=runc --name="comps-langchain-agent-endpoint" -v ./comps/agent/langchain/:/home/user/comps/agent/langchain/ -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e model=${model} -e ip_address=${ip_address} -e strategy=react -e llm_endpoint_url=http://${ip_address}:8080 -e llm_engine=tgi -e recursion_limit=5 -e require_human_feedback=false -e tools=/home/user/comps/agent/langchain/tools/custom_tools.yaml opea/comps-agent-langchain:latest
 > ```

-# 🚀 3. Validate Microservice
+## 🚀 3. Validate Microservice

 Once microservice starts, user can use below script to invoke.
@@ -104,7 +104,7 @@ data: [DONE]
 ```

-# 🚀 4. Provide your own tools
+## 🚀 4. Provide your own tools

 - Define tools
@@ -180,7 +180,7 @@ data: 'The weather information in Austin is not available from the Open Platform
 data: [DONE]
 ```

-# 5. Customize agent strategy
+## 5. Customize agent strategy

 For advanced developers who want to implement their own agent strategies, you can add a separate folder in `src\strategy`, implement your agent by inherit the `BaseAgent` class, and add your strategy into the `src\agent.py`. The architecture of this agent microservice is shown in the diagram below as a reference.
 ![Architecture Overview](agent_arch.jpg)
diff --git a/comps/cores/telemetry/README.md b/comps/cores/telemetry/README.md
index dda946647..d8b71b17a 100644
--- a/comps/cores/telemetry/README.md
+++ b/comps/cores/telemetry/README.md
@@ -8,7 +8,7 @@ OPEA Comps currently provides telemetry functionalities for metrics and tracing
 OPEA microservice metrics are exported in Prometheus format and are divided into two categories: general metrics and specific metrics.

-General metrics, such as `http_requests_total `, `http_request_size_bytes`, are exposed by every microservice endpoint using the [prometheus-fastapi-instrumentator](https://github.com/trallnag/prometheus-fastapi-instrumentator).
+General metrics, such as `http_requests_total`, `http_request_size_bytes`, are exposed by every microservice endpoint using the [prometheus-fastapi-instrumentator](https://github.com/trallnag/prometheus-fastapi-instrumentator).

 Specific metrics are the built-in metrics exposed under `/metrics` by each specific microservices such as TGI, vLLM, TEI and others. Both types of the metrics adhere to the Prometheus format.
diff --git a/comps/dataprep/redis/README.md b/comps/dataprep/redis/README.md
index 76361a236..440eb0d45 100644
--- a/comps/dataprep/redis/README.md
+++ b/comps/dataprep/redis/README.md
@@ -105,7 +105,7 @@ export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
 - Build docker image with langchain

-* option 1: Start single-process version (for 1-10 files processing)
+- option 1: Start single-process version (for 1-10 files processing)

 ```bash
 cd ../../../../
diff --git a/comps/dataprep/redis/multimodal_langchain/README.md b/comps/dataprep/redis/multimodal_langchain/README.md
index 19042c6ae..6744ec8dc 100644
--- a/comps/dataprep/redis/multimodal_langchain/README.md
+++ b/comps/dataprep/redis/multimodal_langchain/README.md
@@ -2,9 +2,9 @@

 This `dataprep` microservice accepts videos (mp4 files) and their transcripts (optional) from the user and ingests them into Redis vectorstore.

-# 🚀1. Start Microservice with Python（Option 1）
+## 🚀1. Start Microservice with Python（Option 1）

-## 1.1 Install Requirements
+### 1.1 Install Requirements

 ```bash
 # Install ffmpeg static build
@@ -17,11 +17,11 @@ cp $(pwd)/ffmpeg-git-amd64-static/ffmpeg /usr/local/bin/
 pip install -r requirements.txt
 ```

-## 1.2 Start Redis Stack Server
+### 1.2 Start Redis Stack Server

 Please refer to this [readme](../../../vectorstores/langchain/redis/README.md).

-## 1.3 Setup Environment Variables
+### 1.3 Setup Environment Variables

 ```bash
 export your_ip=$(hostname -I | awk '{print $1}')
@@ -30,7 +30,7 @@ export INDEX_NAME=${your_redis_index_name}
 export PYTHONPATH=${path_to_comps}
 ```

-## 1.4 Start LVM Microservice (Optional)
+### 1.4 Start LVM Microservice (Optional)

 This is required only if you are going to consume the _generate_captions_ API of this microservice as in [Section 4.3](#43-consume-generate_captions-api).
@@ -42,7 +42,7 @@ export your_ip=$(hostname -I | awk '{print $1}')
 export LVM_ENDPOINT="http://${your_ip}:9399/v1/lvm"
 ```

-## 1.5 Start Data Preparation Microservice for Redis with Python Script
+### 1.5 Start Data Preparation Microservice for Redis with Python Script

 Start document preparation microservice for Redis with below command.
@@ -50,13 +50,13 @@ Start document preparation microservice for Redis with below command.

 ```bash
 python prepare_videodoc_redis.py
 ```

-# 🚀2. Start Microservice with Docker (Option 2)
+## 🚀2. Start Microservice with Docker (Option 2)

-## 2.1 Start Redis Stack Server
+### 2.1 Start Redis Stack Server

 Please refer to this [readme](../../../vectorstores/langchain/redis/README.md).

-## 2.2 Start LVM Microservice (Optional)
+### 2.2 Start LVM Microservice (Optional)

 This is required only if you are going to consume the _generate_captions_ API of this microservice as described [here](#43-consume-generate_captions-api).
@@ -68,7 +68,7 @@ export your_ip=$(hostname -I | awk '{print $1}')
 export LVM_ENDPOINT="http://${your_ip}:9399/v1/lvm"
 ```

-## 2.3 Setup Environment Variables
+### 2.3 Setup Environment Variables

 ```bash
 export your_ip=$(hostname -I | awk '{print $1}')
@@ -79,39 +79,39 @@ export INDEX_NAME=${your_redis_index_name}
 export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
 ```

-## 2.4 Build Docker Image
+### 2.4 Build Docker Image

 ```bash
 cd ../../../../
 docker build -t opea/dataprep-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/redis/multimodal_langchain/docker/Dockerfile .
 ```

-## 2.5 Run Docker with CLI (Option A)
+### 2.5 Run Docker with CLI (Option A)

 ```bash
 docker run -d --name="dataprep-redis-server" -p 6007:6007 --runtime=runc --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e REDIS_URL=$REDIS_URL -e INDEX_NAME=$INDEX_NAME -e LVM_ENDPOINT=$LVM_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN opea/dataprep-redis:latest
 ```

-## 2.6 Run with Docker Compose (Option B - deprecated, will move to genAIExample in future)
+### 2.6 Run with Docker Compose (Option B - deprecated, will move to genAIExample in future)

 ```bash
 cd comps/dataprep/redis/multimodal_langchain/docker
 docker compose -f docker-compose-dataprep-redis.yaml up -d
 ```

-# 🚀3. Status Microservice
+## 🚀3. Status Microservice

 ```bash
 docker container logs -f dataprep-redis-server
 ```

-# 🚀4. Consume Microservice
+## 🚀4. Consume Microservice

 Once this dataprep microservice is started, user can use the below commands to invoke the microservice to convert videos and their transcripts (optional) to embeddings and save to the Redis vector store.

 This mircroservice has provided 3 different ways for users to ingest videos into Redis vector store corresponding to the 3 use cases.

-## 4.1 Consume _videos_with_transcripts_ API
+### 4.1 Consume _videos_with_transcripts_ API

 **Use case:** This API is used when a transcript file (under `.vtt` format) is available for each video.
@@ -120,7 +120,7 @@ This mircroservice has provided 3 different ways for users to ingest videos into
 - Make sure the file paths after `files=@` are correct.
 - Every transcript file's name must be identical with its corresponding video file's name (except their extension .vtt and .mp4). For example, `video1.mp4` and `video1.vtt`. Otherwise, if `video1.vtt` is not included correctly in this API call, this microservice will return error `No captions file video1.vtt found for video1.mp4`.

-### Single video-transcript pair upload
+#### Single video-transcript pair upload

 ```bash
 curl -X POST \
@@ -130,7 +130,7 @@ curl -X POST \
 http://localhost:6007/v1/videos_with_transcripts
 ```

-### Multiple video-transcript pair upload
+#### Multiple video-transcript pair upload

 ```bash
 curl -X POST \
@@ -142,13 +142,13 @@ curl -X POST \
 http://localhost:6007/v1/videos_with_transcripts
 ```

-## 4.2 Consume _generate_transcripts_ API
+### 4.2 Consume _generate_transcripts_ API

 **Use case:** This API should be used when a video has meaningful audio or recognizable speech but its transcript file is not available.

 In this use case, this microservice will use [`whisper`](https://openai.com/index/whisper/) model to generate the `.vtt` transcript for the video.

-### Single video upload
+#### Single video upload

 ```bash
 curl -X POST \
@@ -157,7 +157,7 @@ curl -X POST \
 http://localhost:6007/v1/generate_transcripts
 ```

-### Multiple video upload
+#### Multiple video upload

 ```bash
 curl -X POST \
@@ -167,7 +167,7 @@ curl -X POST \
 http://localhost:6007/v1/generate_transcripts
 ```

-## 4.3 Consume _generate_captions_ API
+### 4.3 Consume _generate_captions_ API

 **Use case:** This API should be used when a video does not have meaningful audio or does not have audio.
@@ -192,7 +192,7 @@ curl -X POST \
 http://localhost:6007/v1/generate_captions
 ```

-## 4.4 Consume get_videos API
+### 4.4 Consume get_videos API

 To get names of uploaded videos, use the following command.
@@ -202,7 +202,7 @@ curl -X POST \
 http://localhost:6007/v1/dataprep/get_videos
 ```

-## 4.5 Consume delete_videos API
+### 4.5 Consume delete_videos API

 To delete uploaded videos and clear the database, use the following command.
diff --git a/comps/embeddings/neural-speed/README.md b/comps/embeddings/neural-speed/README.md
index d2d1fff72..e34545085 100644
--- a/comps/embeddings/neural-speed/README.md
+++ b/comps/embeddings/neural-speed/README.md
@@ -1,10 +1,12 @@
-# build Mosec endpoint docker image
+# Embedding Neural Speed
+
+## build Mosec endpoint docker image

 ```
 docker build --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy -t langchain-mosec:neuralspeed -f comps/embeddings/neural-speed/neuralspeed-docker/Dockerfile .
 ```

-# build embedding microservice docker image
+## build embedding microservice docker image

 ```
 docker build --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy -t opea/embedding-langchain-mosec:neuralspeed -f comps/embeddings/neural-speed/docker/Dockerfile .
@@ -12,20 +14,20 @@ docker build --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_p
 Note: Please contact us to request model files before building images.

-# launch Mosec endpoint docker container
+## launch Mosec endpoint docker container

 ```
 docker run -d --name="embedding-langchain-mosec-endpoint" -p 6001:8000 langchain-mosec:neuralspeed
 ```

-# launch embedding microservice docker container
+## launch embedding microservice docker container

 ```
 export MOSEC_EMBEDDING_ENDPOINT=http://{mosec_embedding_host_ip}:6001
 docker run -d --name="embedding-langchain-mosec-server" -e http_proxy=$http_proxy -e https_proxy=$https_proxy -p 6000:6000 --ipc=host -e MOSEC_EMBEDDING_ENDPOINT=$MOSEC_EMBEDDING_ENDPOINT opea/embedding-langchain-mosec:neuralspeed
 ```

-# run client test
+## run client test

 ```
 curl localhost:6000/v1/embeddings \
diff --git a/comps/feedback_management/mongo/README.md b/comps/feedback_management/mongo/README.md
index 2e43f81fd..1cc375bdb 100644
--- a/comps/feedback_management/mongo/README.md
+++ b/comps/feedback_management/mongo/README.md
@@ -34,15 +34,15 @@ docker build -t opea/feedbackmanagement-mongo-server:latest --build-arg https_pr
 1. Run mongoDB image

-```bash
-docker run -d -p 27017:27017 --name=mongo mongo:latest
-```
+   ```bash
+   docker run -d -p 27017:27017 --name=mongo mongo:latest
+   ```

 2. Run Feedback Management service

-```bash
-docker run -d --name="feedbackmanagement-mongo-server" -p 6016:6016 -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e MONGO_HOST=${MONGO_HOST} -e MONGO_PORT=${MONGO_PORT} -e DB_NAME=${DB_NAME} -e COLLECTION_NAME=${COLLECTION_NAME} opea/feedbackmanagement-mongo-server:latest
-```
+   ```bash
+   docker run -d --name="feedbackmanagement-mongo-server" -p 6016:6016 -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e MONGO_HOST=${MONGO_HOST} -e MONGO_PORT=${MONGO_PORT} -e DB_NAME=${DB_NAME} -e COLLECTION_NAME=${COLLECTION_NAME} opea/feedbackmanagement-mongo-server:latest
+   ```

 ### Invoke Microservice
diff --git a/comps/finetuning/README.md b/comps/finetuning/README.md
index a88339e05..11dd0d82e 100644
--- a/comps/finetuning/README.md
+++ b/comps/finetuning/README.md
@@ -2,9 +2,9 @@

 LLM Fine-tuning microservice involves adapting a base model to a specific task or dataset to improve its performance on that task.

-# 🚀1. Start Microservice with Python (Optional 1)
+## 🚀1. Start Microservice with Python (Optional 1)

-## 1.1 Install Requirements
+### 1.1 Install Requirements

 ```bash
 python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
@@ -13,9 +13,9 @@ python -m pip install oneccl_bind_pt --extra-index-url https://pytorch-extension
 pip install -r requirements.txt
 ```

-## 1.2 Start Finetuning Service with Python Script
+### 1.2 Start Finetuning Service with Python Script

-### 1.2.1 Start Ray Cluster
+#### 1.2.1 Start Ray Cluster

 OneCCL and Intel MPI libraries should be dynamically linked in every node before Ray starts:
@@ -35,18 +35,18 @@ For a multi-node cluster, start additional Ray worker nodes with below command.
 ray start --address='${head_node_ip}:6379'
 ```

-### 1.2.2 Start Finetuning Service
+#### 1.2.2 Start Finetuning Service

 ```bash
 export HF_TOKEN=${your_huggingface_token}
 python finetuning_service.py
 ```

-# 🚀2. Start Microservice with Docker (Optional 2)
+## 🚀2. Start Microservice with Docker (Optional 2)

-## 2.1 Setup on CPU
+### 2.1 Setup on CPU

-### 2.1.1 Build Docker Image
+#### 2.1.1 Build Docker Image

 Build docker image with below command:
@@ -56,7 +56,7 @@ cd ../../
 docker build -t opea/finetuning:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy --build-arg HF_TOKEN=$HF_TOKEN -f comps/finetuning/docker/Dockerfile_cpu .
 ```

-### 2.1.2 Run Docker with CLI
+#### 2.1.2 Run Docker with CLI

 Start docker container with below command:
@@ -64,9 +64,9 @@ Start docker container with below command:
 docker run -d --name="finetuning-server" -p 8015:8015 --runtime=runc --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/finetuning:latest
 ```

-## 2.2 Setup on Gaudi2
+### 2.2 Setup on Gaudi2

-### 2.2.1 Build Docker Image
+#### 2.2.1 Build Docker Image

 Build docker image with below command:
@@ -75,7 +75,7 @@ cd ../../
 docker build -t opea/finetuning-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/finetuning/docker/Dockerfile_hpu .
 ```

-### 2.2.2 Run Docker with CLI
+#### 2.2.2 Run Docker with CLI

 Start docker container with below command:
@@ -84,9 +84,9 @@ export HF_TOKEN=${your_huggingface_token}
 docker run --runtime=habana -e HABANA_VISIBLE_DEVICES=all -p 8015:8015 -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host -e https_proxy=$https_proxy -e http_proxy=$http_proxy -e no_proxy=$no_proxy -e HF_TOKEN=$HF_TOKEN opea/finetuning-gaudi:latest
 ```

-# 🚀3. Consume Finetuning Service
+## 🚀3. Consume Finetuning Service

-## 3.1 Create fine-tuning job
+### 3.1 Create fine-tuning job

 Assuming a training file `alpaca_data.json` is uploaded, it can be downloaded in [here](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json), the following script launches a finetuning job using `meta-llama/Llama-2-7b-chat-hf` as base model:
@@ -120,6 +120,6 @@ curl http://${your_ip}:8015/v1/finetune/list_checkpoints -X POST -H "Content-Typ
 ```

-# 🚀4. Descriptions for Finetuning parameters
+## 🚀4. Descriptions for Finetuning parameters

 We utilize [OpenAI finetuning parameters](https://platform.openai.com/docs/api-reference/fine-tuning) and extend it with more customizable parameters.
diff --git a/comps/guardrails/toxicity_detection/README.md b/comps/guardrails/toxicity_detection/README.md
index 8ef19a373..1683b0ff2 100644
--- a/comps/guardrails/toxicity_detection/README.md
+++ b/comps/guardrails/toxicity_detection/README.md
@@ -1,4 +1,4 @@
-# ☣️💥🛡️Toxicity Detection Microservice
+# Toxicity Detection Microservice

 ## Introduction
@@ -10,46 +10,46 @@ Toxicity is defined as rude, disrespectful, or unreasonable language likely to m

 - Add a RoBERTa (125M params) toxicity model fine-tuned on Gaudi2 with ToxicChat and Jigsaw dataset in an optimized serving framework.

-# 🚀1. Start Microservice with Python（Option 1）
+## 🚀1. Start Microservice with Python（Option 1）

-## 1.1 Install Requirements
+### 1.1 Install Requirements

 ```bash
 pip install -r requirements.txt
 ```

-## 1.2 Start Toxicity Detection Microservice with Python Script
+### 1.2 Start Toxicity Detection Microservice with Python Script

 ```bash
 python toxicity_detection.py
 ```

-# 🚀2. Start Microservice with Docker (Option 2)
+## 🚀2. Start Microservice with Docker (Option 2)

-## 2.1 Prepare toxicity detection model
+### 2.1 Prepare toxicity detection model

 export HUGGINGFACEHUB_API_TOKEN=${HP_TOKEN}

-## 2.2 Build Docker Image
+### 2.2 Build Docker Image

 ```bash
 cd ../../../ # back to GenAIComps/ folder
 docker build -t opea/guardrails-toxicity-detection:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/toxicity_detection/docker/Dockerfile .
 ```

-## 2.3 Run Docker Container with Microservice
+### 2.3 Run Docker Container with Microservice

 ```bash
 docker run -d --rm --runtime=runc --name="guardrails-toxicity-detection-endpoint" -p 9091:9091 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} opea/guardrails-toxicity-detection:latest
 ```

-# 🚀3. Get Status of Microservice
+## 🚀3. Get Status of Microservice

 ```bash
 docker container logs -f guardrails-toxicity-detection-endpoint
 ```

-# 🚀4. Consume Microservice Pre-LLM/Post-LLM
+## 🚀4. Consume Microservice Pre-LLM/Post-LLM

 Once microservice starts, users can use examples (bash or python) below to apply toxicity detection for both user's query (Pre-LLM) or LLM's response (Post-LLM)
diff --git a/comps/intent_detection/README.md b/comps/intent_detection/README.md
index fa9062bb6..da85fdc54 100644
--- a/comps/intent_detection/README.md
+++ b/comps/intent_detection/README.md
@@ -1,14 +1,14 @@
 # Intent Detection Microservice by TGI

-# 🚀1. Start Microservice with Python（Option 1）
+## 🚀1. Start Microservice with Python（Option 1）

-## 1.1 Install Requirements
+### 1.1 Install Requirements

 ```bash
 pip install -r requirements.txt
 ```

-## 1.2 Start TGI Service
+### 1.2 Start TGI Service

 ```bash
 export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
@@ -18,7 +18,7 @@ export LANGCHAIN_PROJECT="opea/gen-ai-comps:llms"
 docker run -p 8008:80 -v ./data:/data --name tgi_service --shm-size 1g ghcr.io/huggingface/text-generation-inference:1.4 --model-id ${your_hf_llm_model}
 ```

-## 1.3 Verify the TGI Service
+### 1.3 Verify the TGI Service

 ```bash
 curl http://${your_ip}:8008/generate \
@@ -27,7 +27,7 @@ curl http://${your_ip}:8008/generate \
 -H 'Content-Type: application/json'
 ```

-## 1.4 Setup Environment Variables
+### 1.4 Setup Environment Variables

 ```bash
 export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
@@ -36,7 +36,7 @@ export LANGCHAIN_API_KEY=${your_langchain_api_key}
 export LANGCHAIN_PROJECT="opea/intent"
 ```

-## 1.5 Start Intent Detection Microservice with Python Script
+### 1.5 Start Intent Detection Microservice with Python Script

 Start intent detection microservice with below command.
@@ -46,13 +46,13 @@ cp comps/intent_detection/langchain/intent_detection.py .
 python intent_detection.py
 ```

-# 🚀2. Start Microservice with Docker (Option 2)
+## 🚀2. Start Microservice with Docker (Option 2)

-## 2.1 Start TGI Service
+### 2.1 Start TGI Service

 Please refer to 1.2.

-## 2.2 Setup Environment Variables
+### 2.2 Setup Environment Variables

 ```bash
 export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
@@ -61,20 +61,20 @@ export LANGCHAIN_API_KEY=${your_langchain_api_key}
 export LANGCHAIN_PROJECT="opea/intent"
 ```

-## 2.3 Build Docker Image
+### 2.3 Build Docker Image

 ```bash
 cd /your_project_path/GenAIComps
 docker build --no-cache -t opea/llm-tgi:latest -f comps/intent_detection/langchain/Dockerfile .
 ```

-## 2.4 Run Docker with CLI (Option A)
+### 2.4 Run Docker with CLI (Option A)

 ```bash
 docker run -it --name="intent-tgi-server" --net=host --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TGI_LLM_ENDPOINT=$TGI_LLM_ENDPOINT -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN opea/llm-tgi:latest
 ```

-## 2.5 Run with Docker Compose (Option B)
+### 2.5 Run with Docker Compose (Option B)

 ```bash
 cd /your_project_path/GenAIComps/comps/intent_detection/langchain
@@ -87,7 +87,7 @@ export LANGCHAIN_API_KEY=${your_langchain_api_key}
 docker compose -f docker_compose_intent.yaml up -d
 ```

-# 🚀3. Consume Microservice
+## 🚀3. Consume Microservice

 Once intent detection microservice is started, user can use below command to invoke the microservice.
diff --git a/comps/llms/text-generation/ollama/README.md b/comps/llms/text-generation/ollama/README.md
index c2f00ee49..1333d7b41 100644
--- a/comps/llms/text-generation/ollama/README.md
+++ b/comps/llms/text-generation/ollama/README.md
@@ -9,7 +9,7 @@
 Follow [these instructions](https://github.com/ollama/ollama) to set up and run a local Ollama instance.

 - Download and install Ollama onto the available supported platforms (including Windows)
-- Fetch available LLM model via ollama pull . View a list of available models via the model library and pull to use locally with the command `ollama pull llama3`
+- Fetch available LLM model via `ollama pull `. View a list of available models via the model library and pull to use locally with the command `ollama pull llama3`
 - This will download the default tagged version of the model. Typically, the default points to the latest, smallest sized-parameter model.

 Note:
@@ -41,7 +41,7 @@ Here are a few ways to interact with pulled local models:

 #### In the terminal

-All of your local models are automatically served on localhost:11434. Run ollama run to start interacting via the command line directly.
+All of your local models are automatically served on localhost:11434. Run `ollama run ` to start interacting via the command line directly.

 #### API access
diff --git a/comps/llms/text-generation/vllm-openvino/README.md b/comps/llms/text-generation/vllm-openvino/README.md
index d26a7f569..76edc367a 100644
--- a/comps/llms/text-generation/vllm-openvino/README.md
+++ b/comps/llms/text-generation/vllm-openvino/README.md
@@ -17,7 +17,7 @@ Once it successfully builds, you will have the `vllm:openvino` image. It can be

 ## Use vLLM serving with OpenAI API

-### Start The Server:
+### Start The Server

 For gated models, such as `LLAMA-2`, you will have to pass -e HUGGING_FACE_HUB_TOKEN=\ to the docker run command above with a valid Hugging Face Hub read token.
@@ -33,7 +33,7 @@ To start the model server:
 bash launch_model_server.sh
 ```

-### Request Completion With Curl:
+### Request Completion With Curl

 ```bash
 curl http://localhost:8000/v1/completions \
@@ -55,7 +55,7 @@ The `launch_model_server.sh` script accepts two parameters:

 You have the flexibility to customize the two parameters according to your specific needs.
 Below is a sample reference, if you wish to specify a different model and port number

-` bash launch_model_server.sh -m meta-llama/Llama-2-7b-chat-hf -p 8123`
+`bash launch_model_server.sh -m meta-llama/Llama-2-7b-chat-hf -p 8123`

 Additionally, you can set the vLLM CPU endpoint by exporting the environment variable `vLLM_LLM_ENDPOINT`:
@@ -78,5 +78,9 @@ To enable better TPOT / TTFT latency, you can use vLLM's chunked prefill feature

 OpenVINO best known configuration is:

- $ VLLM_OPENVINO_KVCACHE_SPACE=100 VLLM_OPENVINO_CPU_KV_CACHE_PRECISION=u8 VLLM_OPENVINO_ENABLE_QUANTIZED_WEIGHTS=ON \
- python3 vllm/benchmarks/benchmark_throughput.py --model meta-llama/Llama-2-7b-chat-hf --dataset vllm/benchmarks/ShareGPT_V3_unfiltered_cleaned_split.json --enable-chunked-prefill --max-num-batched-tokens 256
+```bash
+$ VLLM_OPENVINO_KVCACHE_SPACE=100 VLLM_OPENVINO_CPU_KV_CACHE_PRECISION=u8 VLLM_OPENVINO_ENABLE_QUANTIZED_WEIGHTS=ON \
+ python3 vllm/benchmarks/benchmark_throughput.py --model meta-llama/Llama-2-7b-chat-hf \
+ --dataset vllm/benchmarks/ShareGPT_V3_unfiltered_cleaned_split.json --enable-chunked-prefill \
+ --max-num-batched-tokens 256
+```
diff --git a/comps/lvms/video-llama/README.md b/comps/lvms/video-llama/README.md
index 43ec0bd18..fbdaf8d3c 100644
--- a/comps/lvms/video-llama/README.md
+++ b/comps/lvms/video-llama/README.md
@@ -2,9 +2,9 @@

 This is a Docker-based microservice that runs Video-Llama as a Large Vision Model (LVM). It utilizes Llama-2-7b-chat-hf for conversations based on video dialogues. It support Intel Xeon CPU.

-# 🚀1. Start Microservice with Docker
+## 🚀1. Start Microservice with Docker

-## 1.1 Build Images
+### 1.1 Build Images

 ```bash
 cd GenAIComps
@@ -14,7 +14,7 @@ docker build --no-cache -t opea/video-llama-lvm-server:latest --build-arg https_
 docker build --no-cache -t opea/lvm-video-llama:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/lvms/video-llama/Dockerfile .
 ```

-## 1.2 Start Video-Llama and LVM Services
+### 1.2 Start Video-Llama and LVM Services

 For the very first run, please follow below steps:
@@ -42,7 +42,7 @@ services:
 llm_download: "False" # avoid download
 ```

-# ✅ 2. Test
+## ✅ 2. Test

 ```bash
 # use curl
@@ -58,7 +58,7 @@ export ip_address=$(hostname -I | awk '{print $1}')
 python comps/lvms/video-llama/check_lvm.py
 ```

-# ♻️ 3. Clean
+## ♻️ 3. Clean

 ```bash
 # remove the container
diff --git a/comps/prompt_registry/mongo/README.md b/comps/prompt_registry/mongo/README.md
index 86baaaf27..7ed8b0295 100644
--- a/comps/prompt_registry/mongo/README.md
+++ b/comps/prompt_registry/mongo/README.md
@@ -34,15 +34,15 @@ docker build -t opea/promptregistry-mongo-server:latest --build-arg https_proxy=
 1. Run mongoDB image

-```bash
-docker run -d -p 27017:27017 --name=mongo mongo:latest
-```
+   ```bash
+   docker run -d -p 27017:27017 --name=mongo mongo:latest
+   ```

 2. Run prompt_registry service

-```bash
-docker run -d --name="promptregistry-mongo-server" -p 6012:6012 -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e MONGO_HOST=${MONGO_HOST} -e MONGO_PORT=${MONGO_PORT} -e DB_NAME=${DB_NAME} -e COLLECTION_NAME=${COLLECTION_NAME} opea/promptregistry-mongo-server:latest
-```
+   ```bash
+   docker run -d --name="promptregistry-mongo-server" -p 6012:6012 -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e MONGO_HOST=${MONGO_HOST} -e MONGO_PORT=${MONGO_PORT} -e DB_NAME=${DB_NAME} -e COLLECTION_NAME=${COLLECTION_NAME} opea/promptregistry-mongo-server:latest
+   ```

 ### Invoke Microservice
diff --git a/comps/reranks/video-rag-qna/README.md b/comps/reranks/video-rag-qna/README.md
index 9edfe4118..5b01f3e11 100644
--- a/comps/reranks/video-rag-qna/README.md
+++ b/comps/reranks/video-rag-qna/README.md
@@ -4,16 +4,16 @@ This is a Docker-based microservice that do result rerank for VideoRAGQnA use ca

 For the `VideoRAGQnA` usecase, during the data preparation phase, frames are extracted from videos and stored in a vector database. To identify the most relevant video, we count the occurrences of each video source among the retrieved data with rerank function `get_top_doc`. This sorts the video as a descending list of names, ranked by their degree of match with the query. Then we could send the `top_n` videos to the downstream LVM.

-# 🚀1. Start Microservice with Docker
+## 🚀1. Start Microservice with Docker

-## 1.1 Build Images
+### 1.1 Build Images

 ```bash
 cd GenAIComps
 docker build --no-cache -t opea/reranking-videoragqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/video-rag-qna/docker/Dockerfile .
 ```

-## 1.2 Start Rerank Service
+### 1.2 Start Rerank Service

 ```bash
 docker compose -f comps/reranks/video-rag-qna/docker/docker_compose_reranking.yaml up -d
@@ -27,7 +27,7 @@ Available configuration by environment variable:

 - CHUNK_DURATION: target chunk duration, should be aligned with VideoRAGQnA dataprep. Default 10s.

-# ✅ 2. Test
+## ✅ 2. Test

 ```bash
 export ip_address=$(hostname -I | awk '{print $1}')
@@ -53,7 +53,7 @@ The result should be:
 {"id":"random number","video_url":"http://0.0.0.0:6005/top_video_name","chunk_start":20.0,"chunk_duration":10.0,"prompt":"this is the query","max_new_tokens":512}
 ```

-# ♻️ 3. Clean
+## ♻️ 3. Clean

 ```bash
 # remove the container
diff --git a/comps/retrievers/langchain/README.md b/comps/retrievers/langchain/README.md
index 9d96ba14a..ebca2dbe3 100644
--- a/comps/retrievers/langchain/README.md
+++ b/comps/retrievers/langchain/README.md
@@ -6,18 +6,18 @@ The service primarily utilizes similarity measures in vector space to rapidly re

 Overall, this microservice provides robust backend support for applications requiring efficient similarity searches, playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where precise measurement of document similarity is crucial.

-# Retriever Microservice with Redis
+## Retriever Microservice with Redis

 For details, please refer to this [readme](redis/README.md)

-# Retriever Microservice with Milvus
+## Retriever Microservice with Milvus

 For details, please refer to this [readme](milvus/README.md)

-# Retriever Microservice with PGVector
+## Retriever Microservice with PGVector

 For details, please refer to this [readme](pgvector/README.md)

-# Retriever Microservice with VDMS
+## Retriever Microservice with VDMS

 For details, please refer to this [readme](vdms/README.md)
diff --git a/comps/retrievers/langchain/vdms/README.md b/comps/retrievers/langchain/vdms/README.md
index 108ea5203..c9ae3e5aa 100644
--- a/comps/retrievers/langchain/vdms/README.md
+++ b/comps/retrievers/langchain/vdms/README.md
@@ -6,7 +6,7 @@ The service primarily utilizes similarity measures in vector space to rapidly re

 Overall, this microservice provides robust backend support for applications requiring efficient similarity searches, playing a vital role in scenarios such as recommendation systems, information retrieval, or any other context where precise measurement of document similarity is crucial.

-# Visual Data Management System (VDMS)
+## Visual Data Management System (VDMS)

 VDMS is a storage solution for efficient access of big-“visual”-data that aims to achieve cloud scale by searching for relevant visual data via visual metadata stored as a graph and enabling machine friendly enhancements to visual data for faster access.
@@ -16,24 +16,24 @@ VDMS also supports a graph database to store different metadata(s) associated wi

 In Summary, VDMS supports:

-K nearest neighbor search
-Euclidean distance (L2) and inner product (IP)
-Libraries for indexing and computing distances: TileDBDense, TileDBSparse, FaissFlat (Default), FaissIVFFlat, Flinng
-Embeddings for text, images, and video
-Vector and metadata searches
-Scalabity to allow for definition of different relationships across the metadata
+- K nearest neighbor search
+- Euclidean distance (L2) and inner product (IP)
+- Libraries for indexing and computing distances: TileDBDense, TileDBSparse, FaissFlat (Default), FaissIVFFlat, Flinng
+- Embeddings for text, images, and video
+- Vector and metadata searches
+- Scalabity to allow for definition of different relationships across the metadata

-# 🚀1. Start Microservice with Python (Option 1)
+## 🚀1. Start Microservice with Python (Option 1)

 To start the retriever microservice, you must first install the required python packages.

-## 1.1 Install Requirements
+### 1.1 Install Requirements

 ```bash
 pip install -r requirements.txt
 ```

-## 1.2 Start TEI Service
+### 1.2 Start TEI Service

 ```bash
 export LANGCHAIN_TRACING_V2=true
@@ -45,7 +45,7 @@ volume=$PWD/data
 docker run -d -p 6060:80 -v $volume:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id $model --revision $revision
 ```

-## 1.3 Verify the TEI Service
+### 1.3 Verify the TEI Service

 ```bash
 curl 127.0.0.1:6060/rerank \
@@ -54,7 +54,7 @@ curl 127.0.0.1:6060/rerank \
 -H 'Content-Type: application/json'
 ```

-## 1.4 Setup VectorDB Service
+### 1.4 Setup VectorDB Service

 You need to setup your own VectorDB service (VDMS in this example), and ingest your knowledge documents into the vector database.
@@ -65,16 +65,16 @@ Remember to ingest data into it manually.
 docker run -d --name="vdms-vector-db" -p 55555:55555 intellabs/vdms:latest
 ```

-## 1.5 Start Retriever Service
+### 1.5 Start Retriever Service

 ```bash
 export TEI_EMBEDDING_ENDPOINT="http://${your_ip}:6060"
 python langchain/retriever_vdms.py
 ```

-# 🚀2. Start Microservice with Docker (Option 2)
+## 🚀2. Start Microservice with Docker (Option 2)

-## 2.1 Setup Environment Variables
+### 2.1 Setup Environment Variables

 ```bash
 export RETRIEVE_MODEL_ID="BAAI/bge-base-en-v1.5"
@@ -85,7 +85,7 @@ export LANGCHAIN_API_KEY=${your_langchain_api_key}
 export LANGCHAIN_PROJECT="opea/retrievers"
 ```

-## 2.2 Build Docker Image
+### 2.2 Build Docker Image

 ```bash
 cd ../../
@@ -99,22 +99,22 @@ To start a docker container, you have two options:

 You can choose one as needed.

-## 2.3 Run Docker with CLI (Option A)
+### 2.3 Run Docker with CLI (Option A)

 ```bash
 docker run -d --name="retriever-vdms-server" -p 7000:7000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e INDEX_NAME=$INDEX_NAME -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT opea/retriever-vdms:latest
 ```

-## 2.4 Run Docker with Docker Compose (Option B)
+### 2.4 Run Docker with Docker Compose (Option B)

 ```bash
 cd langchain/vdms/docker
 docker compose -f docker_compose_retriever.yaml up -d
 ```

-# 🚀3. Consume Retriever Service
+## 🚀3. Consume Retriever Service

-## 3.1 Check Service Status
+### 3.1 Check Service Status

 ```bash
 curl http://localhost:7000/v1/health_check \
@@ -122,7 +122,7 @@ curl http://localhost:7000/v1/health_check \
 -H 'Content-Type: application/json'
 ```

-## 3.2 Consume Embedding Service
+### 3.2 Consume Embedding Service

 To consume the Retriever Microservice, you can generate a mock embedding vector of length 768 with Python.
diff --git a/comps/vectorstores/README.md b/comps/vectorstores/README.md
index 643029812..61ca0a63c 100644
--- a/comps/vectorstores/README.md
+++ b/comps/vectorstores/README.md
@@ -2,26 +2,26 @@

 The Vectorstores Microservice provides convenient way to start various vector database servers.

-# Vectorstores Microservice with Redis
+## Vectorstores Microservice with Redis

 For details, please refer to this [readme](langchain/redis/README.md)

-# Vectorstores Microservice with Qdrant
+## Vectorstores Microservice with Qdrant

 For details, please refer to this [readme](langchain/qdrant/README.md)

-# Vectorstores Microservice with PGVector
+## Vectorstores Microservice with PGVector

 For details, please refer to this [readme](langchain/pgvector/README.md)

-# Vectorstores Microservice with Pinecone
+## Vectorstores Microservice with Pinecone

 For details, please refer to this [readme](langchain/pinecone/README.md)

-# Vectorstores Microservice with Pathway
+## Vectorstores Microservice with Pathway

 For details, please refer to this [readme](langchain/pathway/README.md)

-# Vectorstores Microservice with VDMS
+## Vectorstores Microservice with VDMS

 For details, please refer to this [readme](langchain/vdms/README.md)