Showcase Haystack evaluations on an industry dataset #7407

mrm1001 · 2024-03-22T09:46:31Z

User story

I would like to learn how to apply Haystack core evaluations to improve my RAG pipeline, with an example of how to improve my retriever component on an industry dataset.

Sub-tasks:

Create an example of how to improve Haystack evaluation metrics by tweaking chunk size and/or the embedding model (e.g. context size), using semantic similarity on answers and the LLM-based context relevance metric.

More context here: https://www.notion.so/deepsetai/Evaluation-1521712b928d4142828232f2df136856?pvs=4
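To make the sub-task concrete, here is a minimal sketch of the kind of experiment loop it describes: split the documents at several chunk sizes, retrieve the best chunk per question, and compare an average score across settings. Everything in it is an assumption for illustration only. The helper names (`split_words`, `cosine`, `retrieve`, `sweep`) are hypothetical, and a bag-of-words cosine stands in for both the retriever and the semantic similarity metric, so it runs without any model. In a real run the scoring step would be replaced by Haystack's own evaluators (for example `SASEvaluator`) and the retriever by an embedding-based one.

```python
import math
from collections import Counter


def split_words(text, chunk_size):
    """Split text into consecutive chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def cosine(a, b):
    """Bag-of-words cosine similarity (a toy stand-in for semantic similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks):
    """Return the single chunk most similar to the question (toy retriever)."""
    return max(chunks, key=lambda chunk: cosine(question, chunk))


def sweep(documents, qa_pairs, chunk_sizes):
    """For each chunk size, average the similarity between the retrieved
    chunk and the ground-truth answer over all (question, answer) pairs."""
    results = {}
    for size in chunk_sizes:
        chunks = [c for doc in documents for c in split_words(doc, size)]
        scores = [cosine(retrieve(q, chunks), answer) for q, answer in qa_pairs]
        results[size] = sum(scores) / len(scores)
    return results


# Hypothetical toy data, not an industry dataset.
documents = [
    "refunds are processed within five days of the request",
    "shipping takes two weeks for international orders",
]
qa_pairs = [("how long do refunds take", "refunds are processed within five days")]

for size, score in sweep(documents, qa_pairs, chunk_sizes=[4, 8]).items():
    print(f"chunk_size={size}: mean score={score:.3f}")
```

The point of the sketch is the shape of the comparison (one score per configuration), not the metric itself; the same loop could also vary the embedding model instead of the chunk size.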

Tasks

find an industry dataset to showcase evaluation metrics #7438 (labels: P1, topic:eval)

Update tutorial on model-based evaluation with Haystack #6790 (labels: 2.x, P2, topic:eval)

mrm1001 added the topic:eval label Mar 22, 2024

masci added the P2 Medium priority, add to the next sprint if no P1 available label Mar 22, 2024

masci mentioned this issue Mar 22, 2024

Custom LLM-based evaluator in Haystack core #7022

Closed

mrm1001 added this to the 2.1.0 milestone Mar 25, 2024

mrm1001 added P1 High priority, add to the next sprint and removed P2 Medium priority, add to the next sprint if no P1 available labels Mar 27, 2024

masci added P2 Medium priority, add to the next sprint if no P1 available and removed P1 High priority, add to the next sprint labels Mar 28, 2024

masci mentioned this issue Apr 7, 2024

LLM Evaluation in Haystack #6786

Closed

masci removed this from the 2.1.0 milestone Apr 7, 2024

masci changed the title ~~Create example of using core Haystack evaluations on industry dataset~~ Showcase Haystack evaluations on an industry dataset Apr 7, 2024

masci added the epic label Apr 7, 2024

masci closed this as completed May 25, 2024

mrm1001 commented Mar 22, 2024 • edited by masci
