Commit

Add README for running OPEA ragas using HF endpoint on Gaudi (#137)
* minimized required fields/columns in user data

Signed-off-by: aasavari <[email protected]>

* add bench-target as the prefix of output folder (#133)

Signed-off-by: Yingchun Guo <[email protected]>
Signed-off-by: aasavari <[email protected]>

* remove examples. (#135)

Co-authored-by: root <[email protected]>
Signed-off-by: aasavari <[email protected]>

* minor naming correction to maintain consistency

Signed-off-by: aasavari <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: aasavari <[email protected]>

* Add hyperlinks and paths validation. (#132)

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: aasavari <[email protected]>

* adding README for OPEA ragas

Signed-off-by: aasavari <[email protected]>

* adding python3 syntax to README

Signed-off-by: aasavari <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: aasavari <[email protected]>
Signed-off-by: Yingchun Guo <[email protected]>
Signed-off-by: ZePan110 <[email protected]>
Co-authored-by: Ying Chun Guo <[email protected]>
Co-authored-by: lkk <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZePan110 <[email protected]>
6 people authored Sep 24, 2024
1 parent f1593ea commit 0dff0d3
Showing 2 changed files with 41 additions and 4 deletions.
39 changes: 39 additions & 0 deletions evals/metrics/ragas/README.md
@@ -0,0 +1,39 @@
# OPEA adaptation of ragas (LLM-as-a-judge evaluation of Retrieval Augmented Generation)
OPEA's adaptation of [ragas](https://github.com/explodinggradients/ragas) allows you to run ragas on Intel's Gaudi AI accelerator chips.

## User data
Please wrap your input data in the `datasets.Dataset` class.
```python3
from datasets import Dataset

example = {
    "question": "Who is the wife of Barack Obama?",
    "contexts": [
        "Michelle Obama, wife of Barack Obama (former President of the United States of America) is an attorney",
        "Barack and Michelle Obama have 2 daughters - Malia and Sasha",
    ],
    "answer": "Michelle Obama",
    "ground_truth": "Wife of Barack Obama is Michelle Obama",
}
dataset = Dataset.from_list([example])
```
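Before constructing the `Dataset`, it can help to check that every example carries the columns used above. The helper below is a hypothetical illustration, not part of the OPEA evals package:

```python
# Hypothetical helper (not part of OPEA): checks that an example dict
# has the columns used in the snippet above.
REQUIRED_COLUMNS = {"question", "contexts", "answer", "ground_truth"}


def missing_columns(example: dict) -> set:
    """Return the set of required columns absent from `example`."""
    return REQUIRED_COLUMNS - example.keys()
```

`missing_columns(example)` returns an empty set when the example is complete, and the names of any missing columns otherwise.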

## Launch a HuggingFace endpoint on Intel's Gaudi machines
Please follow the instructions in the [TGI Gaudi repo](https://github.com/huggingface/tgi-gaudi) with your desired LLM, such as `meta-llama/Meta-Llama-3.1-70B-Instruct`.
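As a rough sketch, launching the endpoint looks like the command below. The image tag, port, and flags are illustrative assumptions; consult the TGI Gaudi repo for the values that match your Gaudi driver and model.

```shell
# Illustrative only - check the TGI Gaudi repo for the current image tag
# and the flags recommended for your setup.
model=meta-llama/Meta-Llama-3.1-70B-Instruct
volume=$PWD/data   # model weights are cached here

docker run -p 8080:80 \
  --runtime=habana \
  -e HABANA_VISIBLE_DEVICES=all \
  -e HF_TOKEN=$HF_TOKEN \
  -v $volume:/data \
  --cap-add=sys_nice \
  --ipc=host \
  ghcr.io/huggingface/tgi-gaudi:2.0.5 \
  --model-id $model
```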

## Run the OPEA ragas pipeline with your desired list of metrics
```python3
# Note: the answer relevancy metric requires an embedding model,
# so set the embeddings parameter when using it.
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from ragas import RagasMetric

embeddings = HuggingFaceBgeEmbeddings(model_name="BAAI/bge-base-en-v1.5")

ragas_metric = RagasMetric(threshold=0.5, model="<set your HF endpoint URL>", embeddings=embeddings)
print(ragas_metric.measure(dataset))
```
That's it!

## Troubleshooting
Please allow a few minutes for the HuggingFace endpoint to download and load the model weights. Larger models may take a few minutes longer. For any other issue, please file an issue and we will get back to you.
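To check whether the endpoint has finished loading, you can send a small test request. This assumes the standard TGI `/generate` route and a server on `localhost:8080`; adjust the host and port to match your deployment.

```shell
# Assumes TGI is serving on localhost:8080; adjust as needed.
# A JSON response with "generated_text" means the model is ready.
curl http://localhost:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello", "parameters": {"max_new_tokens": 10}}'
```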
6 changes: 2 additions & 4 deletions evals/metrics/ragas/ragas.py
@@ -37,7 +37,7 @@ def __init__(
             "context_recall",
             "faithfulness",
             "context_utilization",
-            "reference_free_rubrics_score",
+            # "reference_free_rubrics_score",
         ]
 
     async def a_measure(self, test_case: Dict):
@@ -55,7 +55,6 @@ def measure(self, test_case: Dict):
                 context_recall,
                 context_utilization,
                 faithfulness,
-                reference_free_rubrics_score,
             )
         except ModuleNotFoundError:
             raise ModuleNotFoundError("Please install ragas to use this metric. `pip install ragas`.")
@@ -71,7 +70,7 @@ def measure(self, test_case: Dict):
             "context_recall": context_recall,
             "faithfulness": faithfulness,
             "context_utilization": context_utilization,
-            "reference_free_rubrics_score": reference_free_rubrics_score,
+            # "reference_free_rubrics_score": reference_free_rubrics_score,
         }
         # Set LLM model
         openai_key = os.getenv("OPENAI_API_KEY", None)
@@ -118,7 +117,6 @@ def measure(self, test_case: Dict):
         ]
         # Find necessary input fields using the given metrics
         _required_columns = set()
-        is_latest = faithfulness
         column_map = {  # this column maps new naming style in ragas to their old naming style
             "user_input": "question",
             "response": "answer",
