HindiMonoCorp Extraction #2
Labels
GSSoC
Issues relating to GirlScript Summer of Code 2018
Intermediate
The intermediate difficulty level issues for GirlScript Summer of Code.
Pro
The harder difficulty level issues for GirlScript Summer of Code.
Extract HindiMonoCorp, as described in issue SangitaNLP/sangita#8.
Tasks include:
Extracting linguistic features as (word, tag) tuples.
Storing them in usable and importable formats.
Identifying the POS tagset used and converting its tags to the Penn Treebank tagset.
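
The three tasks above could be approached roughly as in the sketch below. It is only an illustration under stated assumptions: it assumes each corpus line holds whitespace-separated `word|tag` tokens, which may not match the actual HindiMonoCorp layout, and the `TAG_MAP` entries are hypothetical placeholders rather than a verified mapping to the Penn Treebank tagset. The function names (`parse_line`, `to_penn`, `dump_sentences`, `load_sentences`) are made up for this example.

```python
import json

# Hypothetical source-tag -> Penn Treebank mapping. These entries are
# placeholders for illustration; the real tagset still has to be identified.
TAG_MAP = {"NN": "NN", "VM": "VB", "PSP": "IN", "JJ": "JJ"}


def parse_line(line, sep="|"):
    """Turn a line of 'word|tag' tokens (assumed format) into (word, tag) tuples."""
    pairs = []
    for token in line.split():
        # Split on the last separator so words containing '|' stay intact.
        word, _, tag = token.rpartition(sep)
        if word and tag:  # skip tokens that lack a word or a tag
            pairs.append((word, tag))
    return pairs


def to_penn(pairs, tag_map=TAG_MAP):
    """Replace each tag via tag_map; tags without a mapping are left unchanged."""
    return [(word, tag_map.get(tag, tag)) for word, tag in pairs]


def dump_sentences(sentences, path):
    """Store a list of tagged sentences as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sentences, f, ensure_ascii=False)


def load_sentences(path):
    """Load sentences written by dump_sentences, restoring (word, tag) tuples."""
    with open(path, encoding="utf-8") as f:
        return [[tuple(pair) for pair in sentence] for sentence in json.load(f)]
```

JSON is used here only because it is easy to import from other tools; a pickle or a tab-separated file would serve equally well if the project prefers one of those.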