Converting the CROHME dataset from online handwriting recognition to offline handwriting recognition.

summerlvsong/offline-crohme

What is offline_CROHME?

offline_CROHME is a tool for converting the CROHME dataset into an offline handwritten text recognition dataset. It converts each InkML file into an image and a ground-truth LaTeX string.
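To make the input side concrete, here is a minimal sketch of reading the ground-truth LaTeX and the pen traces from an InkML document. It is not taken from extract.py. The function name `parse_inkml` and the `SAMPLE` document are hypothetical, and the sketch assumes the usual CROHME layout: the standard InkML namespace, an `annotation` element with `type="truth"`, and `trace` elements holding comma-separated samples.

```python
import xml.etree.ElementTree as ET

# Standard InkML namespace (assumed to be the one used by the CROHME files).
INKML_NS = {"ink": "http://www.w3.org/2003/InkML"}


def parse_inkml(xml_text):
    """Return (latex, traces) from an InkML document given as a string.

    Hypothetical helper: `traces` is a list of strokes, each stroke a list
    of (x, y) floats. Extra per-sample values (e.g. time) are ignored.
    """
    root = ET.fromstring(xml_text)

    # The ground truth is assumed to live in <annotation type="truth">.
    latex = None
    for ann in root.findall("ink:annotation", INKML_NS):
        if ann.get("type") == "truth":
            latex = (ann.text or "").strip()
            break

    traces = []
    for trace in root.findall("ink:trace", INKML_NS):
        points = []
        for sample in (trace.text or "").strip().split(","):
            coords = sample.split()
            if len(coords) >= 2:
                points.append((float(coords[0]), float(coords[1])))
        traces.append(points)
    return latex, traces


# Tiny made-up document for illustration only, not real CROHME data.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <annotation type="truth">$x+1$</annotation>
  <trace id="0">0 0, 10 10</trace>
  <trace id="1">10 0, 0 10</trace>
</ink>"""

latex, traces = parse_inkml(SAMPLE)
print(latex)
print(len(traces))
```

The traces keep the original device coordinates, so any scaling is left to the rendering step.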

What is CROHME?

The dataset provides more than 12 000 expressions handwritten by hundreds of writers from different countries, merging the data sets from 4 previous CROHME competitions and adding new resources. Writers were asked to copy printed expressions from a corpus of expressions. The corpus was designed to cover the diversity proposed by the different tasks, and was chosen from an existing math corpus and from expressions embedded in Wikipedia pages. Different devices were used (different digital pen technologies, a whiteboard input device, a tablet with a touch-sensitive screen), so the recordings have different scales and resolutions. The dataset provides only the online signal.

How to use?

The easiest way is to run the command below:

python extract.py

Output data is stored in the CROHME_processed folder. Each InkML file is extracted to an image file and a text file containing the ground-truth LaTeX string.
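The online-to-offline step amounts to drawing the pen strokes onto a pixel grid. The sketch below shows one way this could be done using only the standard library. It is not the code in extract.py. The names `render_traces` and `write_pgm`, the grid size, the padding, and the PGM output format are all assumptions made for illustration. It also assumes at least one stroke point is present and that y grows downward, as in screen coordinates.

```python
def render_traces(traces, size=64, pad=4):
    """Rasterize strokes into a size x size grid (0 = background, 1 = ink).

    Hypothetical helper: `traces` is a list of strokes, each a list of
    (x, y) points. Strokes are scaled uniformly to fit inside the padding,
    which removes the device-dependent scale of the recordings.
    """
    xs = [x for t in traces for x, _ in t]
    ys = [y for t in traces for _, y in t]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    scale = (size - 1 - 2 * pad) / span

    def to_pixel(p):
        return (round(pad + (p[0] - min_x) * scale),
                round(pad + (p[1] - min_y) * scale))

    grid = [[0] * size for _ in range(size)]
    for trace in traces:
        pixels = [to_pixel(p) for p in trace]
        if len(pixels) == 1:
            # A single-sample stroke (a dot) still leaves one ink pixel.
            grid[pixels[0][1]][pixels[0][0]] = 1
        # Connect consecutive samples with linearly interpolated pixels.
        for (x0, y0), (x1, y1) in zip(pixels, pixels[1:]):
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for i in range(steps + 1):
                x = round(x0 + (x1 - x0) * i / steps)
                y = round(y0 + (y1 - y0) * i / steps)
                grid[y][x] = 1
    return grid


def write_pgm(grid, path):
    """Save the grid as a binary PGM image with dark ink on white."""
    height, width = len(grid), len(grid[0])
    with open(path, "wb") as f:
        f.write(f"P5 {width} {height} 255\n".encode("ascii"))
        for row in grid:
            f.write(bytes(0 if v else 255 for v in row))


# One made-up diagonal stroke, rendered on a small grid.
grid = render_traces([[(0.0, 0.0), (10.0, 10.0)]], size=16, pad=2)
```

Calling `write_pgm(grid, "sample.pgm")` would then save the image under a hypothetical file name. A real pipeline would more likely use an imaging library with thicker, anti-aliased strokes and a standard format such as PNG.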
