Add some NVIDIA platform support docs and scripts

* Add a README for ChatQnA deployment on NVIDIA GPUs
* Add scripts for easier deployment, including GMC installation and uninstallation, and ChatQnA pipeline deployment on the NVIDIA GPU platform

Signed-off-by: PeterYang12 <[email protected]>
1 parent b1182c4, commit 8e5bd05

Showing 5 changed files with 142 additions and 0 deletions.
# QuickStart Guide

Ver: 1.0
Last Update: 2024-Aug-21
Author: [PeterYang12](https://github.com/PeterYang12)
E-mail: [email protected]

This document is a quickstart guide for GenAIInfra deployment and test on the NVIDIA GPU platform.

## Prerequisites

GenAIInfra uses Kubernetes as the cloud native infrastructure. Please follow the steps below to prepare the Kubernetes environment.

#### Set up a Kubernetes cluster

Please follow the [Kubernetes official setup guide](https://github.com/opea-project/GenAIInfra?tab=readme-ov-file#setup-kubernetes-cluster) to set up Kubernetes. We recommend using Kubernetes version 1.27 or later.

#### Run GenAIInfra on NVIDIA GPUs

To run the workloads on NVIDIA GPUs, please follow these steps:

1. Check the [support matrix](https://docs.nvidia.com/ai-enterprise/latest/product-support-matrix/index.html) to make sure your environment meets the requirements.

2. [Install the NVIDIA GPU CUDA driver and software stack](https://developer.nvidia.com/cuda-downloads).

3. [Install the NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).

4. [Install the NVIDIA GPU device plugin for Kubernetes](https://github.com/NVIDIA/k8s-device-plugin).

5. [Install Helm](https://helm.sh/docs/intro/install/).

NOTE: Make sure the NVIDIA Container Toolkit is configured for the container runtime you chose during Kubernetes setup.
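Once the device plugin is running, GPU scheduling can be verified with a minimal test pod. The manifest below is an illustrative sketch, not part of this repository; the image tag and GPU count are assumptions to adjust for your environment.

```
# gpu-smoke-test.yaml: hypothetical pod that requests one GPU and runs nvidia-smi
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04  # example tag only
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```

If the driver, toolkit, and device plugin are all healthy, `kubectl apply -f gpu-smoke-test.yaml` followed by `kubectl logs gpu-smoke-test` should print the familiar `nvidia-smi` GPU table.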
## Usage

#### Use GenAI Microservices Connector (GMC) to deploy and adjust GenAIExamples on NVIDIA GPUs

#### 1. Install the GMC Helm chart

***NOTE***: Before installing GMC, please export your own Hugging Face token, Google API key and Google CSE ID. If you have a pre-defined directory for saving the models on your cluster hosts, please also set the path.

```
export YOUR_HF_TOKEN=<your Hugging Face token>
export YOUR_GOOGLE_API_KEY=<your Google API key>
export YOUR_GOOGLE_CSE_ID=<your Google CSE ID>
export MOUNT_DIR=<your model path>
```

This directory also provides a simple way to install GMC from its Helm chart: `./install-gmc.sh`

> WARNING: install-gmc.sh may fail on some OS distributions.

For more details, please refer to [GMC installation](https://github.com/opea-project/GenAIInfra/blob/main/microservices-connector/README.md).

#### 2. Use GMC to compose a ChatQnA pipeline

Please refer to the [usage guide for GMC](https://github.com/opea-project/GenAIInfra/blob/main/microservices-connector/usage_guide.md) for more details.
A simple script is provided to compose the ChatQnA pipeline with GMC.

#### 3. Test the ChatQnA service

Please refer to the [GMC ChatQnA test](https://github.com/opea-project/GenAIInfra/blob/main/microservices-connector/usage_guide.md#use-gmc-to-compose-a-chatqna-pipeline).
A simple way to test the service is also provided: `./gmc-chatqna-test.sh`

#### 4. Delete ChatQnA and GMC

```
kubectl delete ns chatqa
./delete-gmc.sh
```

## FAQ and Troubleshooting

The scripts have only been tested on bare-metal **Ubuntu 22.04** with an **NVIDIA H100**. Please file an issue if you run into any problems.
---
```
#!/usr/bin/env bash
set -e

SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
cd "$SCRIPT_DIR/../../"
GenAIInfra_DIR=$(pwd)
cd "$GenAIInfra_DIR/microservices-connector"

# kubectl delete -k config/samples/
helm delete -n system gmc
kubectl delete crd gmconnectors.gmc.opea.io
```
---
```
#!/usr/bin/env bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -e

SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
cd "$SCRIPT_DIR/../../"
GenAIInfra_DIR=$(pwd)
cd "$GenAIInfra_DIR/microservices-connector/"

# TODO: support more examples
kubectl create ns chatqa
kubectl apply -f "$(pwd)/config/samples/chatQnA_nv.yaml"

sleep 2
kubectl get service -n chatqa
kubectl create deployment client-test -n chatqa --image=python:3.8.13 -- sleep infinity
```
---
```
#!/usr/bin/env bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -e

CLIENT_POD=$(kubectl get pod -n chatqa -l app=client-test -o jsonpath='{.items..metadata.name}')
accessUrl=$(kubectl get gmc -n chatqa -o jsonpath="{.items[?(@.metadata.name=='chatqa')].status.accessUrl}")

kubectl exec "$CLIENT_POD" -n chatqa -- curl "$accessUrl" -X POST -d '{"text":"What is the revenue of Nike in 2023?","parameters":{"max_new_tokens":17, "do_sample": true}}' -H 'Content-Type: application/json'
```
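The request body used by the test script can be sanity-checked locally before anything is sent to the cluster. The sketch below only validates the JSON payload; it needs no Kubernetes access.

```
# Validate the ChatQnA request payload locally (no cluster required).
payload='{"text":"What is the revenue of Nike in 2023?","parameters":{"max_new_tokens":17, "do_sample": true}}'

# python3 -m json.tool exits non-zero on malformed JSON, so `set -e`
# scripts fail fast if the payload is ever edited incorrectly.
echo "$payload" | python3 -m json.tool
```

Once the payload parses cleanly, the same string can be passed to curl exactly as in the test script above.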
---
```
#!/usr/bin/env bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -e

SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
cd "$SCRIPT_DIR/../../"
GenAIInfra_DIR=$(pwd)
cd "$GenAIInfra_DIR/microservices-connector/helm"

if [ -n "$YOUR_HF_TOKEN" ]; then
    find manifests_common/ -name '*.yaml' -type f -exec sed -i "s#insert-your-huggingface-token-here#$YOUR_HF_TOKEN#g" {} \;
fi

if [ -n "$YOUR_GOOGLE_API_KEY" ]; then
    find manifests_common/ -name '*.yaml' -type f -exec sed -i "s#GOOGLE_API_KEY:.*#GOOGLE_API_KEY: $YOUR_GOOGLE_API_KEY#g" {} \;
fi

if [ -n "$YOUR_GOOGLE_CSE_ID" ]; then
    find manifests_common/ -name '*.yaml' -type f -exec sed -i "s#GOOGLE_CSE_ID:.*#GOOGLE_CSE_ID: $YOUR_GOOGLE_CSE_ID#g" {} \;
fi

if [ -n "$MOUNT_DIR" ]; then
    find manifests_common/ -name '*.yaml' -type f -exec sed -i "s#path: /mnt/opea-models#path: $MOUNT_DIR#g" {} \;
fi

# install the GMC helm chart
helm install -n system --create-namespace gmc .
sleep 2
kubectl get pod -n system
```
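The placeholder substitution that install-gmc.sh performs can be exercised in isolation. This sketch runs the same find/sed pattern against a throwaway manifest in a temp directory with a fake token, so nothing in the repository or cluster is touched; the file name and token value are illustrative assumptions.

```
# Demonstrate the placeholder substitution on a scratch manifest.
workdir=$(mktemp -d)
cat > "$workdir/sample.yaml" <<'EOF'
env:
  HF_TOKEN: insert-your-huggingface-token-here
EOF

YOUR_HF_TOKEN="hf_dummy_token_for_demo"   # fake value, illustration only

# Same find/sed pattern as install-gmc.sh, pointed at the scratch dir.
# Note: sed -i with no suffix argument is GNU sed syntax (Linux).
find "$workdir" -name '*.yaml' -type f \
    -exec sed -i "s#insert-your-huggingface-token-here#$YOUR_HF_TOKEN#g" {} \;

cat "$workdir/sample.yaml"
```

This is a convenient way to confirm the substitution pattern before pointing the real script at `manifests_common/`.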