
Data Files Contributions

Shreeshrii edited this page Jun 23, 2018 · 16 revisions

This page lists repositories with 4.0.0-compatible tessdata contributions from the Tesseract community.

Such contributions should ideally document everything needed to reproduce the training process (fonts, images, ground truth, texts, scripts, documentation, and so on).


| Language Code | Language | Data File | Contributor | Info |
| --- | --- | --- | --- | --- |
| khmLimon | Khmer | best | OpenInstituteCambodia/phyrumsk | PR in tessdata_best |
| cop | Coptic | best | shreeshrii/tessdata_coptic | tesseract-ocr forum post |
| iast1 | Romanized Sanskrit | best | shreeshrii/tessdata_sanskrit/ | tesseract-ocr forum post |

As of 02/02/2020


These wiki pages are no longer maintained.

All pages were moved to tesseract-ocr/tessdoc.

The latest documentation is available at https://tesseract-ocr.github.io/.

