
UIMA Annotators available

Kateryna Tymoshenko edited this page Mar 14, 2017 · 2 revisions

Currently, all the Experiment modules use a preprocessing pipeline based on the Stanford CoreNLP toolkit and the Illinois chunker.

The pipeline currently uses a custom type system (see desc/PipelineTypeSystem.xml in the distribution folder). In the future, we plan to switch to the type systems used by other publicly available and widely used UIMA pipelines, e.g. DKPro.

Additionally, we provide two annotators specifically for question answering.

Question classifier

Currently, the system uses question classifiers trained on the UIUC question classification dataset with the SVMLight-TK implementation of the SST-bow kernel (see Learning Adaptable Patterns for Passage Reranking for details).

We provide two sets of question classifiers:

  • Coarse-grained question classifiers. This set assigns one of the following question classes: HUMan, LOCation, DESCription, ENTitY, ABBReviation, NUMber. Its accuracy on the UIUC test data is 87.20%.

  • Coarse-grained question classifiers with a refined NUMber class. This set assigns the same question classes as the first set, except that NUMber is split into QUANTITY, PERCENT, DATE, and DURATION. Its accuracy on the UIUC test data is 85.60%.
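
The relation between the two label sets can be sketched as follows. This is only an illustration and is not code from the distribution: the constant names and the to_coarse helper are hypothetical, and the short label strings (HUM, LOC, and so on) are assumed from the capitalized parts of the class names above, so the strings the annotator actually emits may differ.

```python
# Illustrative only: names and label strings are assumptions, not taken
# from the distribution.
COARSE_CLASSES = ("HUM", "LOC", "DESC", "ENTY", "ABBR", "NUM")

# Sub-classes that replace NUM in the second classifier set.
NUM_SUBCLASSES = ("QUANTITY", "PERCENT", "DATE", "DURATION")

# Label set of the second classifier set: the coarse classes with NUM
# swapped for its four sub-classes.
REFINED_CLASSES = tuple(c for c in COARSE_CLASSES if c != "NUM") + NUM_SUBCLASSES


def to_coarse(label: str) -> str:
    """Map a label from either set back to one of the six coarse classes."""
    if label in NUM_SUBCLASSES:
        return "NUM"
    if label in COARSE_CLASSES:
        return label
    raise ValueError(f"unknown question class: {label!r}")
```

A mapping like this can be handy when comparing the output of the two sets, since every prediction from the second set collapses to exactly one class of the first.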

Question focus annotator

Predicts a single word of the question to be the question focus.