
BioLaySumm Evaluation Scripts

This repository contains the scripts used to evaluate submissions to the BioLaySumm 2024 Shared Task.

The scripts are configured to run on the validation data provided to participants.


Setup

Before running the scripts, you must download the AlignScore and LENS models. This can be done by running the get_models.sh script:

bash ./get_models.sh

Additionally, you will need to install the remaining dependencies; the easiest way to do this is to use the provided requirements.txt file:

pip install -r requirements.txt

Running Evaluation

Once setup is complete, you can run the evaluation script, evaluate.py, on your predicted summaries. Because the evaluation relies on model-based metrics, it is advisable to run it on a GPU. The script expects two positional arguments: the path to the directory containing the predicted summary text files (i.e., elife.txt and plos.txt), and the path to the directory containing the provided validation .jsonl files:

python evaluate.py /path/to/predicted/summaries /path/to/validation/data
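
For reference, the expected input layout looks roughly like the listing below. The .txt file names are those the script expects; the .jsonl file names shown are placeholders, so use the names of the validation files you were actually given:

/path/to/predicted/summaries/
    elife.txt           (predicted lay summaries for eLife)
    plos.txt            (predicted lay summaries for PLOS)
/path/to/validation/data/
    eLife_val.jsonl     (provided validation data; your file names may differ)
    PLOS_val.jsonl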

The script writes the evaluation results to .txt files in the current directory.

Note: Even when running the evaluation on a GPU, the runtime will be relatively long, due to 1) the large number of documents in the validation set (PLOS in particular) and 2) the longer runtime of model-based metrics (SummaC in particular). For a quicker proxy evaluation, we recommend editing evaluate.py to reduce the number of evaluation samples used for PLOS/SummaC, as in the sketch below.
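
As a rough illustration, a proxy run could slice the PLOS data before any metrics are computed. This is a minimal sketch, not actual code from evaluate.py; the variable names and the slicing point are assumptions about how the script is organised, so adapt them to the lists the script actually builds:

# Hypothetical edit inside evaluate.py: keep only the first N PLOS samples
# before scoring. All variable names here are illustrative.
N = 100  # size of the proxy sample

plos_preds = plos_preds[:N]  # predicted lay summaries
plos_refs = plos_refs[:N]    # reference lay summaries
plos_docs = plos_docs[:N]    # source articles (used by the factuality metrics)

The same slicing can be applied to the eLife data if an even faster run is needed; results from such a proxy run are indicative only and will not match the full evaluation exactly.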
