update tgi with text-generation-inference:2.1.0 #273

Merged · 24 commits · Jul 9, 2024
Commits (24)
9109577 update text-generation-inference:2.1.0 (chensuyue, Jul 4, 2024)
6d73a9f [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Jul 4, 2024)
dff0a9a add overall system prune (chensuyue, Jul 4, 2024)
634b583 Merge branch 'suyue/tgi' of https://github.com/opea-project/GenAIComp… (chensuyue, Jul 4, 2024)
5a61adb Merge branch 'main' into suyue/tgi (chensuyue, Jul 4, 2024)
a6cd5b2 update model name (chensuyue, Jul 5, 2024)
6a42265 fix test scripts name (chensuyue, Jul 5, 2024)
3166db9 Merge branch 'suyue/tgi' of https://github.com/opea-project/GenAIComp… (chensuyue, Jul 5, 2024)
5684988 Merge branch 'main' into suyue/tgi (chensuyue, Jul 5, 2024)
f7bc29e skip one model (chensuyue, Jul 5, 2024)
8d11947 Merge branch 'suyue/tgi' of https://github.com/opea-project/GenAIComp… (chensuyue, Jul 5, 2024)
f81284d retest after update hf token (chensuyue, Jul 5, 2024)
4b6be62 Merge branch 'main' into suyue/tgi (chensuyue, Jul 5, 2024)
6edb929 for test (chensuyue, Jul 5, 2024)
5a41ab4 Merge branch 'suyue/tgi' of https://github.com/opea-project/GenAIComp… (chensuyue, Jul 5, 2024)
55ed322 bug fix (chensuyue, Jul 9, 2024)
68c29e2 add more model for test (chensuyue, Jul 9, 2024)
d8e66da Merge branch 'main' into suyue/tgi (chensuyue, Jul 9, 2024)
1924f7f update readme (chensuyue, Jul 9, 2024)
73e201d Merge branch 'suyue/tgi' of https://github.com/opea-project/GenAIComp… (chensuyue, Jul 9, 2024)
310335e Merge branch 'main' into suyue/tgi (chensuyue, Jul 9, 2024)
aa56a5e test after fix token (chensuyue, Jul 9, 2024)
1ac20f0 Merge branch 'suyue/tgi' of https://github.com/opea-project/GenAIComp… (chensuyue, Jul 9, 2024)
fdc1612 add model for test (chensuyue, Jul 9, 2024)
18 changes: 9 additions & 9 deletions comps/llms/text-generation/tgi/README.md
@@ -19,7 +19,7 @@
 export HF_TOKEN=${your_hf_api_token}
 export LANGCHAIN_TRACING_V2=true
 export LANGCHAIN_API_KEY=${your_langchain_api_key}
 export LANGCHAIN_PROJECT="opea/gen-ai-comps:llms"
-docker run -p 8008:80 -v ./data:/data --name tgi_service --shm-size 1g ghcr.io/huggingface/text-generation-inference:1.4 --model-id ${your_hf_llm_model}
+docker run -p 8008:80 -v ./data:/data --name tgi_service --shm-size 1g ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id ${your_hf_llm_model}
 ```

## 1.3 Verify the TGI Service
@@ -114,11 +114,11 @@ curl http://${your_ip}:9000/v1/chat/completions \

 ## 4. Validated Model

-| Model                     | TGI-Gaudi |
-| ------------------------- | --------- |
-| Intel/neural-chat-7b-v3-3 | ✓         |
-| Llama-2-7b-chat-hf        | ✓         |
-| Llama-2-70b-chat-hf       | ✓         |
-| Meta-Llama-3-8B-Instruct  | ✓         |
-| Meta-Llama-3-70B-Instruct | ✓         |
-| Phi-3                     | x         |
+| Model                     | TGI |
+| ------------------------- | --- |
+| Intel/neural-chat-7b-v3-3 | ✓   |
+| Llama-2-7b-chat-hf        | ✓   |
+| Llama-2-70b-chat-hf       | ✓   |
+| Meta-Llama-3-8B-Instruct  | ✓   |
+| Meta-Llama-3-70B-Instruct | ✓   |
+| Phi-3                     | x   |
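Once a container from the updated image is up, the README's docker run command above can be sanity-checked against TGI's /generate endpoint. A minimal sketch — the `tgi_generate_payload` helper and the example prompt are illustrative, not part of this PR:

```shell
# Build the JSON body expected by TGI's /generate endpoint.
tgi_generate_payload() {
    local prompt=$1 max_new_tokens=${2:-64}
    printf '{"inputs":"%s","parameters":{"max_new_tokens":%d}}' \
        "$prompt" "$max_new_tokens"
}

# Print the body for a short test prompt.
tgi_generate_payload "What is Deep Learning?" 17
```

The resulting body can then be POSTed to the service started above, e.g. `curl http://localhost:8008/generate -X POST -H 'Content-Type: application/json' -d "$(tgi_generate_payload 'What is Deep Learning?' 17)"`.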
9 changes: 0 additions & 9 deletions comps/llms/text-generation/tgi/build_docker.sh

This file was deleted.

2 changes: 1 addition & 1 deletion comps/llms/text-generation/tgi/docker_compose_llm.yaml
@@ -5,7 +5,7 @@ version: "3.8"

 services:
   tgi_service:
-    image: ghcr.io/huggingface/text-generation-inference:1.4
+    image: ghcr.io/huggingface/text-generation-inference:2.1.0
     container_name: tgi-service
     ports:
       - "8008:80"
19 changes: 13 additions & 6 deletions tests/test_llms_text-generation_tgi.sh
@@ -14,10 +14,10 @@ function build_docker_images() {

 function start_service() {
     tgi_endpoint_port=5004
-    export your_hf_llm_model="Intel/neural-chat-7b-v3-3"
+    export your_hf_llm_model=$1
     # Remember to set HF_TOKEN before invoking this test!
     export HF_TOKEN=${HF_TOKEN}
-    docker run -d --name="test-comps-llm-tgi-endpoint" -p $tgi_endpoint_port:80 -v ./data:/data --shm-size 1g ghcr.io/huggingface/text-generation-inference:1.4 --model-id ${your_hf_llm_model}
+    docker run -d --name="test-comps-llm-tgi-endpoint" -p $tgi_endpoint_port:80 -v ./data:/data --shm-size 1g ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id ${your_hf_llm_model}
     export TGI_LLM_ENDPOINT="http://${ip_address}:${tgi_endpoint_port}"

     tei_service_port=5005

@@ -55,13 +55,20 @@ function stop_docker() {
 function main() {

     stop_docker

     build_docker_images
-    start_service
-
-    validate_microservice
+    llm_models=(
+        Intel/neural-chat-7b-v3-3
+        Llama-2-7b-chat-hf
+        Meta-Llama-3-8B-Instruct
+        Phi-3
+    )
+    for model in "${llm_models[@]}"; do
+        start_service "${model}"
+        validate_microservice
+        stop_docker
+    done

-    stop_docker
     echo y | docker system prune

 }
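Since the test now starts a fresh TGI container for each model in the loop, the first validate_microservice call can race model loading. One way to harden the loop is to poll the endpoint until it answers before validating — a hedged sketch, where `wait_until_ready` is a hypothetical helper not in this PR and the `/health` route is assumed from TGI's HTTP API:

```shell
# Poll a URL until it responds successfully or the attempts run out.
# Returns 0 once the endpoint answers, 1 if it never does.
wait_until_ready() {
    local url=$1 attempts=${2:-30} delay=${3:-2}
    local i
    for ((i = 1; i <= attempts; i++)); do
        if curl -sf "$url" > /dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# In main(), between start_service and validate_microservice:
#   wait_until_ready "http://${ip_address}:${tgi_endpoint_port}/health" 60 5 || exit 1
```

Larger models need more attempts, so the caller tunes the attempt count and delay rather than the helper hard-coding a timeout.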