ipavlopoulos/padoc

PaDoc

License: CC BY 4.0

PaDoc is a dataset of transcriptions of 389 manuscripts, dated from the 3rd century BCE to the 7th century CE and originating mainly from Greco-Roman Egypt. The study that introduces it, "Dating Greek Papyri with Text Regression", will be presented as a long paper at ACL 2023 in Toronto, Canada. The abstract follows:

Dating Greek papyri accurately is crucial not only to edit their texts but also to understand numerous other aspects of ancient writing, document and book production and circulation, as well as various other aspects of administration, everyday life and intellectual history of antiquity. Although a substantial number of Greek papyri documents bear a date or other conclusive data as to their chronological placement, an even larger number can only be dated tentatively or in approximation, due to the lack of decisive evidence. By creating a dataset of 389 transcriptions of documentary Greek papyri, we train 389 regression models and we predict a date for the papyri with an average MAE of 54 years and an MSE of 1.17, outperforming image classifiers and other baselines.
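For readers unfamiliar with the two error measures named in the abstract, here is a minimal sketch of how MAE and MSE are computed for date predictions. It is not the authors' evaluation code: the function names and the three example dates are hypothetical, and the sketch assumes dates are encoded as signed years (negative for BCE), which may differ from the encoding or scaling used in the paper.

```python
def mean_absolute_error(true_dates, predicted_dates):
    """Average absolute gap between true and predicted dates (same unit as the inputs)."""
    pairs = list(zip(true_dates, predicted_dates))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def mean_squared_error(true_dates, predicted_dates):
    """Average squared gap; penalizes large misses more heavily than MAE does."""
    pairs = list(zip(true_dates, predicted_dates))
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)


# Hypothetical values for illustration only, NOT taken from PaDoc.
# Assumed encoding: signed years, negative = BCE.
true_dates = [-250, 150, 600]
predicted_dates = [-200, 100, 650]

mae = mean_absolute_error(true_dates, predicted_dates)
mse = mean_squared_error(true_dates, predicted_dates)
print(f"MAE: {mae} years, MSE: {mse} years^2")
```

Because MSE is expressed in squared units, its magnitude depends on how the target dates are scaled, so the two numbers are only comparable when computed on the same encoding.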

If you use this dataset or findings of this research, please cite us as:

@inproceedings{pavlopoulos-etal-2023,
  title={Dating Greek Papyri with Text Regression},
  author={Pavlopoulos, John and Konstantinidou, Maria and Paparigopoulou, Asimina and Essler, Holger and Marthot-Santaniello, Isabelle},
  booktitle={Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL'23; to appear)},
  year={2023}
}
