Upgrade TGI Gaudi version to v2.0.6 (opea-project#1088)
Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
lvliang-intel and chensuyue authored Nov 12, 2024
1 parent f7a7f8a commit 1ff85f6
Showing 74 changed files with 94 additions and 85 deletions.
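
A bump like this is mechanical: every reference to the image tag `ghcr.io/huggingface/tgi-gaudi:2.0.5` becomes `2.0.6` across compose files, Kubernetes manifests, test scripts, and docs. A minimal sketch of how such a sweep can be applied and verified from the repository root (assuming GNU grep/sed; this illustrates the pattern and is not the tooling used for this commit):

```bash
# Find every file that still references the old tag.
grep -rl 'ghcr.io/huggingface/tgi-gaudi:2.0.5' .

# Rewrite the tag in place across all matching files.
grep -rl 'ghcr.io/huggingface/tgi-gaudi:2.0.5' . \
  | xargs sed -i 's|tgi-gaudi:2.0.5|tgi-gaudi:2.0.6|g'

# Confirm nothing was missed before committing.
grep -rn 'tgi-gaudi:2.0.5' . || echo "all references updated"
```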
2 changes: 1 addition & 1 deletion AgentQnA/docker_compose/intel/hpu/gaudi/tgi_gaudi.yaml
@@ -3,7 +3,7 @@

services:
tgi-server:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-server
ports:
- "8085:80"
2 changes: 1 addition & 1 deletion AudioQnA/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -51,7 +51,7 @@ services:
environment:
TTS_ENDPOINT: ${TTS_ENDPOINT}
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
ports:
- "3006:80"
2 changes: 1 addition & 1 deletion AudioQnA/kubernetes/intel/README_gmc.md
@@ -25,7 +25,7 @@ The AudioQnA uses the below prebuilt images if you choose a Xeon deployment
Should you desire to use the Gaudi accelerator, two alternate images are used for the embedding and llm services.
For Gaudi:

- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.5
- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.6
- whisper-gaudi: opea/whisper-gaudi:latest
- speecht5-gaudi: opea/speecht5-gaudi:latest

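After the updated manifests are applied, the images actually running in the cluster can be listed to confirm the bump took effect. This is a sketch, assuming `kubectl` access to the namespace where AudioQnA is deployed:

```bash
# Print each pod with the image(s) its containers run, then filter for TGI.
kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' \
  | grep tgi-gaudi
```
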
2 changes: 1 addition & 1 deletion AudioQnA/kubernetes/intel/hpu/gaudi/manifest/audioqna.yaml
@@ -271,7 +271,7 @@ spec:
- envFrom:
- configMapRef:
name: audio-qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
name: llm-dependency-deploy-demo
securityContext:
capabilities:
2 changes: 1 addition & 1 deletion AudioQnA/tests/test_compose_on_gaudi.sh
@@ -22,7 +22,7 @@ function build_docker_images() {
service_list="audioqna whisper-gaudi asr llm-tgi speecht5-gaudi tts"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
docker images && sleep 1s
}

2 changes: 1 addition & 1 deletion AudioQnA/tests/test_compose_on_xeon.sh
@@ -22,7 +22,7 @@ function build_docker_images() {
service_list="audioqna whisper asr llm-tgi speecht5 tts"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
docker images && sleep 1s
}

2 changes: 1 addition & 1 deletion AvatarChatbot/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -54,7 +54,7 @@ services:
environment:
TTS_ENDPOINT: ${TTS_ENDPOINT}
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
ports:
- "3006:80"
2 changes: 1 addition & 1 deletion AvatarChatbot/tests/test_compose_on_gaudi.sh
@@ -29,7 +29,7 @@ function build_docker_images() {
service_list="avatarchatbot whisper-gaudi asr llm-tgi speecht5-gaudi tts wav2lip-gaudi animation"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6

docker images && sleep 1s
}
2 changes: 1 addition & 1 deletion AvatarChatbot/tests/test_compose_on_xeon.sh
@@ -29,7 +29,7 @@ function build_docker_images() {
service_list="avatarchatbot whisper asr llm-tgi speecht5 tts wav2lip animation"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6

docker images && sleep 1s
}
2 changes: 1 addition & 1 deletion ChatQnA/benchmark/accuracy/README.md
@@ -48,7 +48,7 @@ To setup a LLM model, we can use [tgi-gaudi](https://github.com/huggingface/tgi-
docker run -p {your_llm_port}:80 --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e HF_TOKEN={your_hf_token} --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id mistralai/Mixtral-8x7B-Instruct-v0.1 --max-input-tokens 2048 --max-total-tokens 4096 --sharded true --num-shard 2
# for better performance, set `PREFILL_BATCH_BUCKET_SIZE`, `BATCH_BUCKET_SIZE`, `max-batch-total-tokens`, `max-batch-prefill-tokens`
docker run -p {your_llm_port}:80 --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e HF_TOKEN={your_hf_token} -e PREFILL_BATCH_BUCKET_SIZE=1 -e BATCH_BUCKET_SIZE=8 --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.5 --model-id mistralai/Mixtral-8x7B-Instruct-v0.1 --max-input-tokens 2048 --max-total-tokens 4096 --sharded true --num-shard 2 --max-batch-total-tokens 65536 --max-batch-prefill-tokens 2048
docker run -p {your_llm_port}:80 --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e HF_TOKEN={your_hf_token} -e PREFILL_BATCH_BUCKET_SIZE=1 -e BATCH_BUCKET_SIZE=8 --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.6 --model-id mistralai/Mixtral-8x7B-Instruct-v0.1 --max-input-tokens 2048 --max-total-tokens 4096 --sharded true --num-shard 2 --max-batch-total-tokens 65536 --max-batch-prefill-tokens 2048
```
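
After pulling the new tag, it is worth confirming the upgraded server answers before running the benchmark. A minimal smoke test against standard TGI endpoints (replace `{your_llm_port}` as above):

```bash
# /info reports the loaded model and the TGI version.
curl http://localhost:{your_llm_port}/info

# A tiny generation request verifies the model actually serves tokens.
curl -X POST http://localhost:{your_llm_port}/generate \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is Habana Gaudi?", "parameters": {"max_new_tokens": 32}}'
```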

### Prepare Dataset
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -237,7 +237,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
@@ -255,7 +255,7 @@ spec:
envFrom:
- configMapRef:
name: qna-config
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
imagePullPolicy: IfNotPresent
name: llm-dependency-deploy
ports:
2 changes: 1 addition & 1 deletion ChatQnA/chatqna.yaml
@@ -38,7 +38,7 @@ opea_micro_services:
tgi-service:
host: ${TGI_SERVICE_IP}
ports: ${TGI_SERVICE_PORT}
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
volumes:
- "./data:/data"
runtime: habana
4 changes: 2 additions & 2 deletions ChatQnA/docker_compose/intel/hpu/gaudi/README.md
@@ -192,7 +192,7 @@ For users in China who are unable to download models directly from Huggingface,
export HF_TOKEN=${your_hf_token}
export HF_ENDPOINT="https://hf-mirror.com"
model_name="Intel/neural-chat-7b-v3-3"
docker run -p 8008:80 -v ./data:/data --name tgi-service -e HF_ENDPOINT=$HF_ENDPOINT -e http_proxy=$http_proxy -e https_proxy=$https_proxy --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e ENABLE_HPU_GRAPH=true -e LIMIT_HPU_GRAPH=true -e USE_FLASH_ATTENTION=true -e FLASH_ATTENTION_RECOMPUTE=true --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.5 --model-id $model_name --max-input-tokens 1024 --max-total-tokens 2048
docker run -p 8008:80 -v ./data:/data --name tgi-service -e HF_ENDPOINT=$HF_ENDPOINT -e http_proxy=$http_proxy -e https_proxy=$https_proxy --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e ENABLE_HPU_GRAPH=true -e LIMIT_HPU_GRAPH=true -e USE_FLASH_ATTENTION=true -e FLASH_ATTENTION_RECOMPUTE=true --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.6 --model-id $model_name --max-input-tokens 1024 --max-total-tokens 2048
```

2. Offline
@@ -206,7 +206,7 @@ For users in China who are unable to download models directly from Huggingface,
```bash
export HF_TOKEN=${your_hf_token}
export model_path="/path/to/model"
docker run -p 8008:80 -v $model_path:/data --name tgi_service --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e ENABLE_HPU_GRAPH=true -e LIMIT_HPU_GRAPH=true -e USE_FLASH_ATTENTION=true -e FLASH_ATTENTION_RECOMPUTE=true --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.5 --model-id /data --max-input-tokens 1024 --max-total-tokens 2048
docker run -p 8008:80 -v $model_path:/data --name tgi_service --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e ENABLE_HPU_GRAPH=true -e LIMIT_HPU_GRAPH=true -e USE_FLASH_ATTENTION=true -e FLASH_ATTENTION_RECOMPUTE=true --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.6 --model-id /data --max-input-tokens 1024 --max-total-tokens 2048
```
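
If an existing Compose deployment is upgraded in place rather than recreated from scratch, the running container must be replaced so it picks up the new tag. A sketch, assuming the `tgi-service` name used by the compose file in this directory:

```bash
docker compose pull tgi-service   # fetch ghcr.io/huggingface/tgi-gaudi:2.0.6
docker compose up -d tgi-service  # recreate only the TGI container
```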

### Setup Environment Variables
2 changes: 1 addition & 1 deletion ChatQnA/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -78,7 +78,7 @@ services:
MAX_WARMUP_SEQUENCE_LENGTH: 512
command: --model-id ${RERANK_MODEL_ID} --auto-truncate
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
ports:
- "8005:80"
@@ -26,7 +26,7 @@ services:
TEI_ENDPOINT: http://tei-embedding-service:80
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
tgi-guardrails-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-guardrails-server
ports:
- "8088:80"
@@ -117,7 +117,7 @@ services:
MAX_WARMUP_SEQUENCE_LENGTH: 512
command: --model-id ${RERANK_MODEL_ID} --auto-truncate
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
ports:
- "8008:80"
@@ -57,7 +57,7 @@ services:
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
restart: unless-stopped
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
ports:
- "8005:80"
@@ -48,16 +48,16 @@ f810f3b4d329 opea/embedding-tei:latest "python e
2fa17d84605f opea/dataprep-redis:latest "python prepare_doc_…" 2 minutes ago Up 2 minutes 0.0.0.0:6007->6007/tcp, :::6007->6007/tcp dataprep-redis-server
69e1fb59e92c opea/retriever-redis:latest "/home/user/comps/re…" 2 minutes ago Up 2 minutes 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-redis-server
313b9d14928a opea/reranking-tei:latest "python reranking_te…" 2 minutes ago Up 2 minutes 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp reranking-tei-gaudi-server
05c40b636239 ghcr.io/huggingface/tgi-gaudi:2.0.5 "text-generation-lau…" 2 minutes ago Exited (1) About a minute ago tgi-gaudi-server
05c40b636239 ghcr.io/huggingface/tgi-gaudi:2.0.6 "text-generation-lau…" 2 minutes ago Exited (1) About a minute ago tgi-gaudi-server
174bd43fa6b5 ghcr.io/huggingface/tei-gaudi:latest "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8090->80/tcp, :::8090->80/tcp tei-embedding-gaudi-server
74084469aa33 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db
88399dbc9e43 ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8808->80/tcp, :::8808->80/tcp tei-reranking-gaudi-server
```

In this case, `ghcr.io/huggingface/tgi-gaudi:2.0.5` exited.
In this case, `ghcr.io/huggingface/tgi-gaudi:2.0.6` exited.

```
05c40b636239 ghcr.io/huggingface/tgi-gaudi:2.0.5 "text-generation-lau…" 2 minutes ago Exited (1) About a minute ago tgi-gaudi-server
05c40b636239 ghcr.io/huggingface/tgi-gaudi:2.0.6 "text-generation-lau…" 2 minutes ago Exited (1) About a minute ago tgi-gaudi-server
```

Next, we can check the container logs to see what happened during startup.
Expand All @@ -68,7 +68,7 @@ Check the log of container by:

`docker logs <CONTAINER ID> -t`

View the logs of `ghcr.io/huggingface/tgi-gaudi:2.0.5`
View the logs of `ghcr.io/huggingface/tgi-gaudi:2.0.6`

`docker logs 05c40b636239 -t`

@@ -97,7 +97,7 @@ So just make sure the devices are available.
Here is another failure example:

```
f7a08f9867f9 ghcr.io/huggingface/tgi-gaudi:2.0.5 "text-generation-lau…" 16 seconds ago Exited (2) 14 seconds ago tgi-gaudi-server
f7a08f9867f9 ghcr.io/huggingface/tgi-gaudi:2.0.6 "text-generation-lau…" 16 seconds ago Exited (2) 14 seconds ago tgi-gaudi-server
```

Check the log by `docker logs f7a08f9867f9 -t`.
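
Beyond reading the full log, the exit code plus the last few log lines usually pinpoint the failure. A sketch using the container ID from the example above (exit code 2 from the launcher is typically a command-line usage error, which is why the input parameters are checked next):

```bash
# Exit code of the stopped container.
docker inspect --format '{{.State.ExitCode}}' f7a08f9867f9

# Tail of the log, filtered for the fatal message.
docker logs --tail 20 f7a08f9867f9 2>&1 | grep -iE 'error|usage'
```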
@@ -114,7 +114,7 @@ View the docker input parameters in `./ChatQnA/docker_compose/intel/hpu/gaudi/co

```
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.5
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
ports:
- "8008:80"
2 changes: 1 addition & 1 deletion ChatQnA/kubernetes/intel/README_gmc.md
@@ -25,7 +25,7 @@ Should you desire to use the Gaudi accelerator, two alternate images are used fo
For Gaudi:

- tei-embedding-service: ghcr.io/huggingface/tei-gaudi:latest
- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.5
- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.6

> [NOTE]
> Please refer to [Xeon README](https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker_compose/intel/cpu/xeon/README.md) or [Gaudi README](https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker_compose/intel/hpu/gaudi/README.md) to build the OPEA images. These too will be available on Docker Hub soon to simplify use.
@@ -1103,7 +1103,7 @@ spec:
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "ghcr.io/huggingface/tgi-gaudi:2.0.5"
image: "ghcr.io/huggingface/tgi-gaudi:2.0.6"
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
@@ -1184,8 +1184,13 @@ spec:
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "ghcr.io/huggingface/tgi-gaudi:2.0.5"
imagePullPolicy: Always
image: "ghcr.io/huggingface/tgi-gaudi:2.0.6"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: model-volume
2 changes: 1 addition & 1 deletion ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna.yaml
@@ -924,7 +924,7 @@ spec:
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "ghcr.io/huggingface/tgi-gaudi:2.0.5"
image: "ghcr.io/huggingface/tgi-gaudi:2.0.6"
imagePullPolicy: Always
volumeMounts:
- mountPath: /data
2 changes: 1 addition & 1 deletion ChatQnA/tests/test_compose_guardrails_on_gaudi.sh
@@ -22,7 +22,7 @@ function build_docker_images() {
service_list="chatqna-guardrails chatqna-ui dataprep-redis retriever-redis guardrails-tgi nginx"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log

docker pull ghcr.io/huggingface/tgi-gaudi:2.0.5
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.6
docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
docker pull ghcr.io/huggingface/tei-gaudi:latest
