HedgeTop

Background

As in previous years, the Conference on Natural Language Learning (CoNLL) defines a shared task, essentially an invitation to research groups to compete in solving a well-defined problem. The 2010 shared task is on finding so-called hedges (linguistic expressions indicating various degrees of uncertainty) and determining their scopes. This task is similar in some respects to Task 3 in the 2009 BioNLP shared task, and it seems likely that structural syntactic or semantic information will be useful in solving this problem.

The CoNLL shared task is sub-divided into two parts: Task 1, on classifying complete utterances as to whether or not they contain speculation, and Task 2 on determining the scope of actual hedges; the way Task 2 is set up, it also subsumes the task of finding the actual hedge cues. For the use of DELPH-IN technology (and other parsers), Task 2 appears most relevant. Although the CoNLL shared task description mentions some use of Wikipedia text, it appears that most of the training data (maybe all of it, for Task 2) is drawn from the bio-medical domain, specifically the BioScope corpus, a resource annotated for hedge cues and scopes.
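
To make the difference between the two sub-tasks concrete, here is a minimal Python sketch run on one invented sentence. The inline markup (an `xcope` element for the scope, a `cue` element with `type` and `ref` attributes for the hedge cue) reflects our understanding of BioScope-style annotation and should be treated as an assumption, as should the function names `is_speculative` and `cue_scopes`, which are purely illustrative. It is not the official task format or scorer.

```python
import xml.etree.ElementTree as ET

# Invented example sentence; the element and attribute names are assumed
# to resemble BioScope-style inline annotation and may differ in detail.
SAMPLE = (
    '<sentence id="S1">These results '
    '<xcope id="X1"><cue type="speculation" ref="X1">suggest</cue> '
    'that the protein binds DNA</xcope>.</sentence>'
)


def is_speculative(sentence):
    """Task 1: does the sentence contain at least one speculation cue?"""
    return any(c.get("type") == "speculation" for c in sentence.iter("cue"))


def cue_scopes(sentence):
    """Task 2: pair each speculation cue with the text of its scope."""
    # Map each scope id to the full text covered by that scope element.
    scopes = {x.get("id"): "".join(x.itertext()) for x in sentence.iter("xcope")}
    return [
        (c.text, scopes.get(c.get("ref")))
        for c in sentence.iter("cue")
        if c.get("type") == "speculation"
    ]


sentence = ET.fromstring(SAMPLE)
print(is_speculative(sentence))
print(cue_scopes(sentence))
```

Task 1 reduces to a yes/no decision per sentence, whereas Task 2 requires locating each cue and the exact span it governs, which is where parser output seems most likely to help.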

A sub-set of DELPH-IN members plan to participate in this shared task, possibly through one joint submission or maybe also as several submissions, by individual groups or sub-sets of people. At present, these include Lilja Øvrelid, Stephan Oepen, and Erik Velldal (at Oslo), Tim Baldwin, Andrew MacKinlay, and David Martinez (at Melbourne), Yi Zhang (at Saarbrücken), and Dan Flickinger (at Stanford). While we are just getting going (in mid-January 2010), more collaborators would be welcome; please contact Stephan for details.

Relevant Resources

BioScope corpus
GENIA project

Reading List

BioScope Annotation Guidelines (v.2.1)
Veronika Vincze, György Szarvas, Richárd Farkas, György Mora, and János Csirik (2008). The BioScope Corpus: Biomedical Texts Annotated for Uncertainty, Negation and their Scopes.
Andrew MacKinlay, David Martinez and Timothy Baldwin (2009). A Parser-based Approach to Detecting Modification of Biomedical Events.
Roser Morante and Walter Daelemans (2009). Learning the Scope of Hedge Cues in Biomedical Texts.
Viola Ganter and Michael Strube (2009). Finding Hedges by Chasing Weasels: Hedge Detection Using Wikipedia Tags and Shallow Linguistic Features.
Halil Kilicoglu and Sabine Bergler (2009). Syntactic Dependency Based Heuristics for Biological Event Extraction.
Halil Kilicoglu and Sabine Bergler (2008). Recognizing Speculative Language in Biomedical Research Articles: A Linguistically Motivated Perspective.
Ben Medlock and Ted Briscoe (2007). Weakly Supervised Learning for Hedge Classification in Scientific Literature.
