Commit: Update Kubernetes manifest files for deploying ChatQnA (opea-project#445)

Update Kubernetes manifest files for deploying ChatQnA without
GMC.

Signed-off-by: Lianhao Lu <[email protected]>
lianhao authored Jul 24, 2024
1 parent 6e797fa commit 665c46f
Showing 25 changed files with 2,283 additions and 901 deletions.
8 changes: 6 additions & 2 deletions ChatQnA/README.md
@@ -105,9 +105,13 @@ docker compose -f docker_compose.yaml up -d

Refer to the [NVIDIA GPU Guide](./docker/gpu/README.md) for more instructions on building docker images from source.

-## Deploy ChatQnA into Kubernetes on Xeon & Gaudi
+## Deploy ChatQnA into Kubernetes on Xeon & Gaudi with GMC

-Refer to the [Kubernetes Guide](./kubernetes/manifests/README.md) for instructions on deploying ChatQnA into Kubernetes on Xeon & Gaudi.
+Refer to the [Kubernetes Guide](./kubernetes/README.md) for instructions on deploying ChatQnA into Kubernetes on Xeon & Gaudi with GMC.
+
+## Deploy ChatQnA into Kubernetes on Xeon & Gaudi without GMC
+
+Refer to the [Kubernetes Guide](./kubernetes/manifests/README.md) for instructions on deploying ChatQnA into Kubernetes on Xeon & Gaudi without GMC.

## Deploy ChatQnA into Kubernetes using Helm Chart

41 changes: 41 additions & 0 deletions ChatQnA/kubernetes/manifests/README.md
@@ -0,0 +1,41 @@
<h1 align="center" id="title">Deploy ChatQnA in Kubernetes Cluster</h1>

> [!NOTE]
> The following values must be set before you can deploy:
>
> - `HUGGINGFACEHUB_API_TOKEN`
>
> You can also customize the `MODEL_ID` if needed.
>
> Make sure the directory `/mnt/opea-models` exists on the node where the ChatQnA workload will run, so the cached model can be stored there. Otherwise, modify the `model-volume` entry in the `chatqna.yaml` file to point at a directory that does exist on the node.

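If the model cache must live somewhere other than `/mnt/opea-models`, the `model-volume` entry in `chatqna.yaml` can be pointed at a different host directory. A hypothetical excerpt (the exact surrounding fields depend on the manifest version you are editing):

```yaml
# Illustrative fragment only: redirect model-volume to an existing
# host directory instead of the default /mnt/opea-models.
volumes:
  - name: model-volume
    hostPath:
      path: /data/opea-models   # must already exist on the node
      type: Directory
```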
## Deploy On Xeon

```sh
cd GenAIExamples/ChatQnA/kubernetes/manifests/xeon
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" chatqna.yaml
kubectl apply -f chatqna.yaml
```

## Deploy On Gaudi

```sh
cd GenAIExamples/ChatQnA/kubernetes/manifests/gaudi
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" chatqna.yaml
kubectl apply -f chatqna.yaml
```
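Note that the `sed` substitution above succeeds silently even when the placeholder string is absent (for example, if the file was already edited). A small sketch of a guard you could add before `kubectl apply`, demonstrated here on a scratch copy so it runs without a cluster (`/tmp/chatqna-demo.yaml` is a stand-in for the real `chatqna.yaml`):

```shell
# Stand-in manifest containing the token placeholder.
printf 'token: insert-your-huggingface-token-here\n' > /tmp/chatqna-demo.yaml

export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" /tmp/chatqna-demo.yaml

# Fail fast if any placeholder survived the substitution.
if grep -q "insert-your-huggingface-token-here" /tmp/chatqna-demo.yaml; then
  echo "placeholder not replaced" >&2
else
  echo "placeholder replaced"
fi
```

In the real flow you would run the `grep` check on `chatqna.yaml` itself and only call `kubectl apply -f chatqna.yaml` when it passes.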

## Verify Services

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.
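Rather than polling `kubectl get pod` by hand, `kubectl wait` can block until the pods report Ready. A sketch, assuming the ChatQnA pods run in the current namespace (this requires a live cluster; adjust `--timeout` to your environment):

```shell
# Block until every pod in the current namespace is Ready,
# giving up after five minutes.
kubectl wait --for=condition=Ready pod --all --timeout=300s
```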

Then run the command `kubectl port-forward svc/chatqna 8888:8888` to expose the ChatQnA service for access.

Open another terminal and run the following command to verify that the service is working:

```console
curl http://localhost:8888/v1/chatqna \
-H 'Content-Type: application/json' \
-d '{"messages": "What is the revenue of Nike in 2023?"}'
```
45 changes: 0 additions & 45 deletions ChatQnA/kubernetes/manifests/chaqna-xeon-backend-server.yaml

This file was deleted.

74 changes: 0 additions & 74 deletions ChatQnA/kubernetes/manifests/docsum_gaudi_llm.yaml

This file was deleted.

74 changes: 0 additions & 74 deletions ChatQnA/kubernetes/manifests/docsum_llm.yaml

This file was deleted.

45 changes: 0 additions & 45 deletions ChatQnA/kubernetes/manifests/embedding.yaml

This file was deleted.

