
Commit

[Translation] Support manifests and nginx (#812)
Signed-off-by: letonghan <[email protected]>
Signed-off-by: root <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
3 people authored Sep 18, 2024
1 parent b205dc7 commit 1e13031
Showing 15 changed files with 1,422 additions and 37 deletions.
4 changes: 2 additions & 2 deletions .github/CODEOWNERS
100644 → 100755
@@ -3,10 +3,10 @@
/ChatQnA/ [email protected]
/CodeGen/ [email protected]
/CodeTrans/ [email protected]
/DocSum/ sihan.chen@intel.com
/DocSum/ letong.han@intel.com
/DocIndexRetriever/ [email protected] [email protected]
/FaqGen/ [email protected]
/SearchQnA/ letong.han@intel.com
/SearchQnA/ sihan.chen@intel.com
/Translation/ [email protected]
/VisualQnA/ [email protected]
/ProductivitySuite/ [email protected]
2 changes: 1 addition & 1 deletion README.md
@@ -45,7 +45,7 @@ Deployment are based on released docker images by default, check [docker image l
| DocSum | [Xeon Instructions](DocSum/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](DocSum/docker_compose/intel/hpu/gaudi/README.md) | [DocSum with Manifests](DocSum/kubernetes/intel/README.md) | [DocSum with Helm Charts](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts/docsum/README.md) | [DocSum with GMC](DocSum/kubernetes/intel/README_gmc.md) |
| SearchQnA | [Xeon Instructions](SearchQnA/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](SearchQnA/docker_compose/intel/hpu/gaudi/README.md) | Not Supported | Not Supported | [SearchQnA with GMC](SearchQnA/kubernetes/intel/README_gmc.md) |
| FaqGen | [Xeon Instructions](FaqGen/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](FaqGen/docker_compose/intel/hpu/gaudi/README.md) | [FaqGen with Manifests](FaqGen/kubernetes/intel/README.md) | Not Supported | [FaqGen with GMC](FaqGen/kubernetes/intel/README_gmc.md) |
| Translation | [Xeon Instructions](Translation/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](Translation/docker_compose/intel/hpu/gaudi/README.md) | Not Supported | Not Supported | [Translation with GMC](Translation/kubernetes/intel/README_gmc.md) |
| Translation | [Xeon Instructions](Translation/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](Translation/docker_compose/intel/hpu/gaudi/README.md) | [Translation with Manifests](Translation/kubernetes/intel/README.md) | Not Supported | [Translation with GMC](Translation/kubernetes/intel/README_gmc.md) |
| AudioQnA | [Xeon Instructions](AudioQnA/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](AudioQnA/docker_compose/intel/hpu/gaudi/README.md) | [AudioQnA with Manifests](AudioQnA/kubernetes/intel/README.md) | Not Supported | [AudioQnA with GMC](AudioQnA/kubernetes/intel/README_gmc.md) |
| VisualQnA | [Xeon Instructions](VisualQnA/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](VisualQnA/docker_compose/intel/hpu/gaudi/README.md) | [VisualQnA with Manifests](VisualQnA/kubernetes/intel/README.md) | Not Supported | [VisualQnA with GMC](VisualQnA/kubernetes/intel/README_gmc.md) |
| ProductivitySuite | [Xeon Instructions](ProductivitySuite/docker_compose/intel/cpu/xeon/README.md) | Not Supported | [ProductivitySuite with Manifests](ProductivitySuite/kubernetes/intel/README.md) | Not Supported | Not Supported |
61 changes: 49 additions & 12 deletions Translation/docker_compose/intel/cpu/xeon/README.md
@@ -41,30 +41,59 @@ cd GenAIExamples/Translation/ui
docker build -t opea/translation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile .
```

### 4. Build Nginx Docker Image

```bash
cd GenAIComps
docker build -t opea/translation-nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/nginx/Dockerfile .
```

Then run the command `docker images`; you should see the following Docker images:

1. `opea/llm-tgi:latest`
2. `opea/translation:latest`
3. `opea/translation-ui:latest`
4. `opea/translation-nginx:latest`
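
As a quick sanity check, you can filter the local image list (a minimal sketch; the image names assume the build commands above completed successfully):

```bash
# Each of the four freshly built images should appear in the output
docker images | grep -E "opea/(llm-tgi|translation|translation-ui|translation-nginx)"
```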

## 🚀 Start Microservices

### Required Models

By default, the LLM model is set as listed below:

| Service | Model |
| ------- | ----------------- |
| LLM | haoranxu/ALMA-13B |

Change `LLM_MODEL_ID` below to suit your needs.
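
For example, to use a smaller variant, you could override the model before sourcing the environment script (the 7B model shown here is an assumption; any model supported by TGI should work):

```bash
# Hypothetical override: a smaller ALMA variant instead of the 13B default
export LLM_MODEL_ID="haoranxu/ALMA-7B"
```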

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to set up them in advance as below.
1. Set the required environment variables:

```bash
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export LLM_MODEL_ID="haoranxu/ALMA-13B"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
```
```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
# Example: NGINX_PORT=80
export NGINX_PORT=${your_nginx_port}
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up other environment variables:

Note: Please replace `host_ip` with your external IP address; do not use localhost.

```bash
cd ../../../
source set_env.sh
```
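
After sourcing the script, a quick check that the key variables are populated can save debugging later (a minimal sketch; the variable names are those exported by `set_env.sh`):

```bash
# Each variable should print with a non-empty value
env | grep -E "LLM_MODEL_ID|TGI_LLM_ENDPOINT|BACKEND_SERVICE_ENDPOINT|NGINX_PORT"
```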

### Start Microservice Docker Containers

@@ -99,6 +128,14 @@ docker compose up -d
"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```

4. Nginx Service

```bash
curl http://${host_ip}:${NGINX_PORT}/v1/translation \
-H "Content-Type: application/json" \
-d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```

Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service.

## 🚀 Launch the UI
30 changes: 28 additions & 2 deletions Translation/docker_compose/intel/cpu/xeon/compose.yaml
@@ -8,10 +8,12 @@ services:
    ports:
      - "8008:80"
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
      HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
    volumes:
      - "./data:/data"
    shm_size: 1g
@@ -25,10 +27,13 @@ services:
- "9000:9000"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
restart: unless-stopped
translation-xeon-backend-server:
image: ${REGISTRY:-opea}/translation:${TAG:-latest}
@@ -39,6 +44,7 @@ services:
    ports:
      - "8888:8888"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
@@ -53,11 +59,31 @@ services:
    ports:
      - "5173:5173"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - BASE_URL=${BACKEND_SERVICE_ENDPOINT}
    ipc: host
    restart: always
  translation-xeon-nginx-server:
    image: ${REGISTRY:-opea}/translation-nginx:${TAG:-latest}
    container_name: translation-xeon-nginx-server
    depends_on:
      - translation-xeon-backend-server
      - translation-xeon-ui-server
    ports:
      - "${NGINX_PORT:-80}:80"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - FRONTEND_SERVICE_IP=${FRONTEND_SERVICE_IP}
      - FRONTEND_SERVICE_PORT=${FRONTEND_SERVICE_PORT}
      - BACKEND_SERVICE_NAME=${BACKEND_SERVICE_NAME}
      - BACKEND_SERVICE_IP=${BACKEND_SERVICE_IP}
      - BACKEND_SERVICE_PORT=${BACKEND_SERVICE_PORT}
    ipc: host
    restart: always
networks:
  default:
    driver: bridge
63 changes: 50 additions & 13 deletions Translation/docker_compose/intel/hpu/gaudi/README.md
@@ -29,34 +29,63 @@ docker build -t opea/translation:latest --build-arg https_proxy=$https_proxy --b
Construct the frontend Docker image using the command below:

```bash
cd GenAIExamples/Translation
cd GenAIExamples/Translation/ui/
docker build -t opea/translation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```

### 4. Build Nginx Docker Image

```bash
cd GenAIComps
docker build -t opea/translation-nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/nginx/Dockerfile .
```

Then run the command `docker images`; you should see the following four Docker images:

1. `opea/llm-tgi:latest`
2. `opea/translation:latest`
3. `opea/translation-ui:latest`
4. `opea/translation-nginx:latest`

## 🚀 Start Microservices

### Required Models

By default, the LLM model is set as listed below:

| Service | Model |
| ------- | ----------------- |
| LLM | haoranxu/ALMA-13B |

Change `LLM_MODEL_ID` below to suit your needs.

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
1. Set the required environment variables:

```bash
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export LLM_MODEL_ID="haoranxu/ALMA-13B"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
```
```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
# Example: NGINX_PORT=80
export NGINX_PORT=${your_nginx_port}
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
```

3. Set up other environment variables:

Note: Please replace `host_ip` with your external IP address; do not use localhost.

```bash
cd ../../../
source set_env.sh
```

### Start Microservice Docker Containers

@@ -91,6 +120,14 @@ docker compose up -d
"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```

4. Nginx Service

```bash
curl http://${host_ip}:${NGINX_PORT}/v1/translation \
-H "Content-Type: application/json" \
-d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```

Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service.

## 🚀 Launch the UI
22 changes: 21 additions & 1 deletion Translation/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -10,7 +10,6 @@ services:
    environment:
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
      HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
@@ -36,6 +35,8 @@ services:
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
    restart: unless-stopped
  translation-gaudi-backend-server:
    image: ${REGISTRY:-opea}/translation:${TAG:-latest}
@@ -65,6 +66,25 @@ services:
      - BASE_URL=${BACKEND_SERVICE_ENDPOINT}
    ipc: host
    restart: always
  translation-gaudi-nginx-server:
    image: ${REGISTRY:-opea}/translation-nginx:${TAG:-latest}
    container_name: translation-gaudi-nginx-server
    depends_on:
      - translation-gaudi-backend-server
      - translation-gaudi-ui-server
    ports:
      - "${NGINX_PORT:-80}:80"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - FRONTEND_SERVICE_IP=${FRONTEND_SERVICE_IP}
      - FRONTEND_SERVICE_PORT=${FRONTEND_SERVICE_PORT}
      - BACKEND_SERVICE_NAME=${BACKEND_SERVICE_NAME}
      - BACKEND_SERVICE_IP=${BACKEND_SERVICE_IP}
      - BACKEND_SERVICE_PORT=${BACKEND_SERVICE_PORT}
    ipc: host
    restart: always

networks:
  default:
18 changes: 18 additions & 0 deletions Translation/docker_compose/set_env.sh
@@ -0,0 +1,18 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


export LLM_MODEL_ID="haoranxu/ALMA-13B"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
export NGINX_PORT=80
export FRONTEND_SERVICE_IP=${host_ip}
export FRONTEND_SERVICE_PORT=5173
export BACKEND_SERVICE_NAME=translation
export BACKEND_SERVICE_IP=${host_ip}
export BACKEND_SERVICE_PORT=8888
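
Note that the script references `host_ip` and `your_hf_api_token` without defining them, so they must already be set in the shell that sources it. A minimal sketch (the values below are placeholders):

```bash
# Placeholders -- replace with your node's external IP and your own Hugging Face token
export host_ip="192.168.1.1"
export your_hf_api_token="hf_xxx"
source Translation/docker_compose/set_env.sh  # path relative to the GenAIExamples checkout
```
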
6 changes: 6 additions & 0 deletions Translation/docker_image_build/build.yaml
@@ -23,3 +23,9 @@ services:
      dockerfile: comps/llms/text-generation/tgi/Dockerfile
    extends: translation
    image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
  nginx:
    build:
      context: GenAIComps
      dockerfile: comps/nginx/Dockerfile
    extends: translation
    image: ${REGISTRY:-opea}/translation-nginx:${TAG:-latest}
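
With this entry, the nginx image can be built through the same build file as the other services. A sketch (it assumes a `GenAIComps` checkout sits inside `docker_image_build/` so that `context: GenAIComps` resolves):

```bash
cd GenAIExamples/Translation/docker_image_build
git clone https://github.com/opea-project/GenAIComps.git  # provides the build context; skip if already present
docker compose -f build.yaml build nginx
```
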
41 changes: 41 additions & 0 deletions Translation/kubernetes/intel/README.md
@@ -0,0 +1,41 @@
# Deploy Translation in Kubernetes Cluster

> [NOTE]
> The following values must be set before you can deploy:
> HUGGINGFACEHUB_API_TOKEN
>
> You can also customize the "MODEL_ID" if needed.
>
> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the Translation workload is running. Otherwise, you need to modify the `translation.yaml` file to change the `model-volume` to a directory that exists on the node.
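
For example, the model cache directory can be created on each node ahead of time (a minimal sketch; adjust ownership and permissions to match the user the pods run as):

```bash
# Run on the Kubernetes node(s) that will host the Translation workload
sudo mkdir -p /mnt/opea-models
sudo chmod 777 /mnt/opea-models  # permissive for illustration only; tighten as appropriate
```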
## Deploy On Xeon

```
cd GenAIExamples/Translation/kubernetes/intel/cpu/xeon/manifests
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" translation.yaml
kubectl apply -f translation.yaml
```

## Deploy On Gaudi

```
cd GenAIExamples/Translation/kubernetes/intel/hpu/gaudi/manifests
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" translation.yaml
kubectl apply -f translation.yaml
```

## Verify Services

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.
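
If you prefer to block until everything is ready rather than polling, `kubectl wait` can be used (a sketch, assuming the Translation pods run in the current namespace):

```bash
# Wait up to 5 minutes for every pod in the namespace to report Ready
kubectl wait --for=condition=Ready pod --all --timeout=300s
```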

Then run the command `kubectl port-forward svc/translation 8888:8888` to expose the Translation service for access.

Open another terminal and run the following command to verify the service is working:

```console
curl http://localhost:8888/v1/translation \
-H 'Content-Type: application/json' \
-d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
```
