BerlinFrontiers
Discussion -- Pushing the Frontiers: Expanding our research to new languages and applications and setting new research goals - Strategies for Dissemination (Moderator: HansUszoreit)
Shared tasks and resources:
- common benchmark for base coverage: parallel corpora treebanks for the participating languages
- shared tasks for HPSG processing: a. abstract processing exercises, b. processing with respect to concrete applications
- shared tasks for applications
- cross-framework evaluations
Applications:
- information management (relation extraction, incl. event and opinion detection)
- machine translation (in combination with other checking methods)
- grammar checking (in combination with other checking methods)
- dialogue systems (e.g., for web agents and computer games)
- Others?
Steps towards applications:
HU: generation, exploitation of application semantics for getting to the meaning of applications;
AC: we should work on resource semantics for different applications
Steps towards shared tasks and resources:
Shared corpora:
- Europarl
- parallel corpora based on tourist brochures, guides, etc., which are already translated into many languages, but which will also have to be translated into many more languages
AC: we should start with setting up the necessary machinery, even with smaller treebanks, even of different kinds of texts
SO: a single coherent corpus
DF: we collect the corpus by picking up parts/sentences from different kinds of texts for the various participating languages/grammars
HU: collect the languages that should participate and the people who will be responsible for annotating/parsing the corpus; choose a corpus that is not too highly marked stylistically and has been translated into many other languages, e.g. city/region descriptions, cathedral essay (on Francis' suggestion, translated into all the languages we are working on, approx. 800 sentences), novels, Linux/technical documentation, everything; build the language matrix and see whether there would still be gaps
Tasks:
- languages: en (Stanford/Oslo), no (Trondheim), pt (Lisbon), es/ca (Barcelona), ja (Kyoto), de (Saarbrücken), el (Saarbrücken/Athens), sw (Linköping), fr (Toulouse), zh (Saarbrücken), ko (Seoul?)
- Saarbrücken builds the Wiki page by the 1st of September
- the groups mentioned above submit texts to the Wiki page by mid-October, accompanying each with a short description so that we know how the texts in the various translations correlate with each other
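A minimal sketch of the language matrix mentioned above, assuming it is simply a table of candidate texts against participating languages. The function names, the text labels, and the availability data are hypothetical and serve only as illustration; the real entries would come from what the groups submit to the Wiki page.

```python
from typing import Dict, List, Set, Tuple


def coverage_gaps(
    languages: List[str], available: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Return (text, language) pairs for which no translation is recorded."""
    gaps = []
    for text in sorted(available):
        for lang in languages:
            if lang not in available[text]:
                gaps.append((text, lang))
    return gaps


def format_matrix(languages: List[str], available: Dict[str, Set[str]]) -> str:
    """Render the matrix as plain text: 'x' = available, '.' = gap."""
    width = max((len(text) for text in available), default=0)
    lines = [" " * width + " " + " ".join(languages)]
    for text in sorted(available):
        cells = " ".join(
            ("x" if lang in available[text] else ".").ljust(len(lang))
            for lang in languages
        )
        lines.append(text.ljust(width) + " " + cells)
    return "\n".join(lines)


# Language codes as listed in the tasks above.
LANGUAGES = ["en", "no", "pt", "es", "ca", "ja", "de", "el", "sw", "fr", "zh", "ko"]

# Hypothetical availability, for illustration only (not actual submissions).
AVAILABLE = {
    "cathedral-essay": {"en", "no", "de", "ja"},
    "city-descriptions": {"en", "de", "pt", "es"},
}

if __name__ == "__main__":
    print(format_matrix(LANGUAGES, AVAILABLE))
    for text, lang in coverage_gaps(LANGUAGES, AVAILABLE):
        print(f"gap: {text} has no {lang} version")
```

Listing the gaps per text makes it easy to see which group would still need to supply or translate a text before the corpus is complete for all participating languages.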