This repo tracks our work on improving the current NLP tools for Dutch. We are interested in the following features:
- semantic role labelling
- coreference resolution
- dependency parsing
This is a field of active research, in which much progress is being made with new machine learning techniques. Most of this research is done internationally and focuses on English. Our aim is to apply some of this work to Dutch.
These machine learning approaches depend crucially on the amount (and quality) of the available data. We will therefore also compile an inventory of the data sets available for supervised and unsupervised learning.
Note that:
- POS tagging and dependency parsing were the topics of the CoNLL2018 shared task. You can find reports and proceedings here.
- Semantic role labelling was the topic of the CoNLL2004 and CoNLL2005 shared tasks. You can find reports and proceedings here.
- Named entity recognition (NER) was the topic of the CoNLL2002 shared task. See here.