Update v0.9 RAG release data #747

Merged
merged 4 commits into from
Sep 6, 2024
2 changes: 1 addition & 1 deletion .github/workflows/_get-test-matrix.yml
@@ -67,7 +67,7 @@ jobs:
run_hardware=""
if [ $(printf '%s\n' "${changed_files[@]}" | grep ${example} | grep -c gaudi) != 0 ]; then run_hardware="gaudi"; fi
if [ $(printf '%s\n' "${changed_files[@]}" | grep ${example} | grep -c xeon) != 0 ]; then run_hardware="xeon ${run_hardware}"; fi
- if [ "$run_hardware" == "" ]; then run_hardware="gaudi"; fi
+ if [ "$run_hardware" == "" ]; then run_hardware="xeon gaudi"; fi
for hw in ${run_hardware}; do
if [ "$hw" == "gaudi" ] && [ "${{ inputs.gaudi_server_label }}" != "" ]; then
run_matrix="${run_matrix}{\"example\":\"${example}\",\"hardware\":\"${{ inputs.gaudi_server_label }}\"},"
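The hardware-selection step in this hunk can be sketched as a standalone function for local experimentation. This is a minimal sketch, not the workflow's actual code: the function name `select_hardware` and its argument layout are hypothetical, and only the matching rules and the new `xeon gaudi` default come from the hunk above.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the workflow's hardware-selection logic.
# Usage: select_hardware <example> <changed_file>...
select_hardware() {
  local example="$1"; shift
  local run_hardware=""
  local matches
  # Keep only the changed files whose path mentions the example.
  matches=$(printf '%s\n' "$@" | grep -- "$example" || true)
  if grep -q gaudi <<< "$matches"; then run_hardware="gaudi"; fi
  if grep -q xeon <<< "$matches"; then run_hardware="xeon ${run_hardware}"; fi
  # Default after this change: test both platforms when no path names one.
  if [ -z "$run_hardware" ]; then run_hardware="xeon gaudi"; fi
  # Drop the trailing space left when only xeon matched.
  echo "${run_hardware% }"
}

# A change that touches neither platform directory falls back to both.
select_hardware ChatQnA ChatQnA/benchmark/README.md
```

Before this change, the same fallback case would have scheduled tests on Gaudi only.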
20 changes: 12 additions & 8 deletions ChatQnA/benchmark/README.md
@@ -133,11 +133,11 @@ kubectl label nodes k8s-worker1 node-type=chatqna-opea

#### 2. Install ChatQnA

- Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/single_gaudi) and apply to K8s.
+ Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/tuned/with_rerank/single_gaudi) and apply to K8s.

```bash
# on k8s-master node
- cd GenAIExamples/ChatQnA/benchmark/single_gaudi
+ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/single_gaudi
kubectl apply -f .
```

@@ -199,7 +199,7 @@ All the test results will come to this folder `/home/sdp/benchmark_output/node_1

```bash
# on k8s-master node
- cd GenAIExamples/ChatQnA/benchmark/single_gaudi
+ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/single_gaudi
kubectl delete -f .
kubectl label nodes k8s-worker1 node-type-
```
@@ -216,11 +216,11 @@ kubectl label nodes k8s-worker1 k8s-worker2 node-type=chatqna-opea

#### 2. Install ChatQnA

- Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/two_gaudi) and apply to K8s.
+ Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/tuned/with_rerank/two_gaudi) and apply to K8s.

```bash
# on k8s-master node
- cd GenAIExamples/ChatQnA/benchmark/two_gaudi
+ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/two_gaudi
kubectl apply -f .
```

@@ -265,11 +265,11 @@ kubectl label nodes k8s-master k8s-worker1 k8s-worker2 k8s-worker3 node-type=cha

#### 2. Install ChatQnA

- Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/four_gaudi) and apply to K8s.
+ Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/tuned/with_rerank/four_gaudi) and apply to K8s.

```bash
# on k8s-master node
- cd GenAIExamples/ChatQnA/benchmark/four_gaudi
+ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/four_gaudi
kubectl apply -f .
```

@@ -298,7 +298,11 @@ All the test results will come to this folder `/home/sdp/benchmark_output/node_4

```bash
# on k8s-master node
- cd GenAIExamples/ChatQnA/benchmark/single_gaudi
+ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/single_gaudi
kubectl delete -f .
kubectl label nodes k8s-master k8s-worker1 k8s-worker2 k8s-worker3 node-type-
```
+
+ #### 6. Results
+
+ Check OOB performance data [here](/opea_release_data.md#chatqna); tuned performance data will be released soon.
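The README hunks above move every manifest directory under `tuned/with_rerank/`. As a rough illustration of the resulting layout, the sketch below maps a Gaudi node count to its manifest directory. The helper name `manifest_dir` is hypothetical; the directory names are the ones that appear in the changed lines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a Gaudi node count to the relocated manifest directory.
manifest_dir() {
  local base="GenAIExamples/ChatQnA/benchmark/tuned/with_rerank"
  case "$1" in
    1) echo "${base}/single_gaudi" ;;
    2) echo "${base}/two_gaudi" ;;
    4) echo "${base}/four_gaudi" ;;
    *) echo "unsupported node count: $1" >&2; return 1 ;;
  esac
}

# Print the directory used for the two-node deployment.
manifest_dir 2
```

On the k8s-master node this could be combined with the commands from the README, for example `cd "$(manifest_dir 2)" && kubectl apply -f .`. Note that the four-node cleanup hunk changes into `single_gaudi` rather than `four_gaudi`; the change does not say whether that is intentional.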
49 changes: 49 additions & 0 deletions opea_release_data.md
@@ -0,0 +1,49 @@
# OPEA Release Data

This page shows benchmark data for GenAIExamples. Data for more examples will be added in future releases.

## ChatQnA

| **Docker Images for Test** |
| ----------------------------------------------------- |
| opea/embedding-tei:v0.9 |
| ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 |
| opea/llm-tgi:v0.9 |
| ghcr.io/huggingface/tgi-gaudi:2.0.1 |
| opea/dataprep-redis:v0.9 |
| redis/redis-stack:7.2.0-v9 |
| opea/reranking-tei:v0.9 |
| opea/tei-gaudi:v0.9 |
| opea/retriever-redis:v0.9 |
| opea/chatqna:v0.9 |

System Summary:
1-node, 2x Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz, 40 cores, 270W TDP, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 1024GB (32x32GB DDR4 3200 MT/s [3200 MT/s]), BIOS ETM02, microcode 0xd0003b9, 8x Habana Labs Ltd., 4x MT28800 Family [ConnectX-5 Ex], 4x 7T INTEL SSDPF2KX076TZ, 2x 894.3G SAMSUNG MZ1L2960HCJR-00A07, Ubuntu 22.04.3 LTS, 5.15.0-92-generic. Software: WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW. Test by Intel as of 08/20/24.

### Performance Data

| 1Node E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total Latency |
| :-------------------------: | :---------: | :---------: | :---: | :----: | :-------------: | :---------------: |
| OOB w/o Reranking | 1 | 128 | 128 | 128 | 5.597 | 7.59 |
| OOB w/ Reranking | 1 | 128 | 128 | 128 | 6.003 | 8.123 |

| 2Nodes E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total Latency |
| :--------------------------: | :---------: | :---------: | :---: | :----: | :-------------: | :---------------: |
| OOB w/o Reranking | 2 | 256 | 128 | 128 | 7.05 | 9.122 |
| OOB w/ Reranking | 2 | 256 | 128 | 128 | 7.26 | 9.239 |

| 4Nodes E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total Latency |
| :--------------------------: | :---------: | :---------: | :---: | :----: | :-------------: | :---------------: |
| OOB w/o Reranking | 4 | 512 | 128 | 128 | 16.293 | 21.169 |
| OOB w/ Reranking | 4 | 512 | 128 | 128 | 17.22 | 21.942 |

See the benchmark [README](./ChatQnA/benchmark/README.md) for steps to reproduce these results. Tuned performance data will be released soon.
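To read the tables above in relative terms, the added average latency from reranking can be computed from each pair of rows. This is a small sketch only: the function name `rerank_overhead` is hypothetical, and the inputs are the Average Latency values listed in the tables.

```bash
#!/usr/bin/env bash
# Hypothetical helper: percentage increase in average latency when reranking is enabled.
# $1 = average latency without reranking (s), $2 = average latency with reranking (s)
rerank_overhead() {
  awk -v base="$1" -v rerank="$2" \
    'BEGIN { printf "%.1f\n", (rerank - base) / base * 100 }'
}

rerank_overhead 5.597 6.003    # 1 node, concurrency 128
rerank_overhead 7.05 7.26      # 2 nodes, concurrency 256
rerank_overhead 16.293 17.22   # 4 nodes, concurrency 512
```

With the figures above, reranking adds about 7.3%, 3.0%, and 5.7% to average latency for the one-, two-, and four-node runs respectively.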

### Accuracy Data

| Test Case | Hits@10 | Hits@4 | MAP@10 | MRR@10 |
| :---------------------: | :-----: | :----: | :----: | :----: |
| Retrieval w/o Reranking | 66.16% | 49.80% | 17.62% | 39.75% |
| Retrieval w/ Reranking | 72.28% | 63.24% | 24.97% | 56.79% |

See the accuracy [README](https://github.com/opea-project/GenAIEval/tree/main/evals/evaluation/rag_eval#multihop-english-dataset) for steps to reproduce these results.