Test data for the news translation task at WMT 2017 for the language pair English-Finnish. The test set contains 3,000 verified translations provided by professional translators. 1,500 sentences are translated from Finnish to English and 1,500 sentences from English to Finnish.
original/
- original documents
translated/
- translations provided by human translators
sgm/
- official WMT test sets in SGML format
txt/
- plain text versions of the official WMT test sets
mt/
- machine-translated texts
- 84 documents
- 1,500 sentences
- 16,385 space-separated tokens (without proper tokenization)
- 64 documents
- 1,500 sentences
- 29,506 space-separated tokens (without proper tokenization)
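The token counts above are plain space-separated counts, not the output of a tokenizer. As a minimal sketch of how such a count could be reproduced, the following Python function splits each line on whitespace and sums the pieces. The function name is hypothetical, and the exact procedure used to produce the figures above is not stated in this README, so small differences are possible.

```python
def count_space_separated_tokens(lines):
    """Count whitespace-separated tokens over an iterable of text lines.

    No real tokenization is applied: punctuation attached to a word
    stays part of that word, and empty lines contribute zero tokens.
    """
    return sum(len(line.split()) for line in lines)


# Small self-contained example (not taken from the test set).
sample = ["Hello world .", "", "Hyvää päivää"]
print(count_space_separated_tokens(sample))
```

To count a file from the txt/ directory, open it with encoding="utf-8" and pass the file object to the function; the file names are not listed here, so substitute the actual path.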
All 1,500 English sentences have been translated twice by independent translators. They can be used for multi-reference evaluation or tuning.
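To illustrate what a second reference adds, here is a minimal sketch of clipped unigram precision computed against several references, in the style of the modified precision used by BLEU. It is an illustration only, not an official WMT metric. The function name and the example sentences are hypothetical, and tokens are split on whitespace without proper tokenization.

```python
from collections import Counter


def clipped_unigram_precision(hypothesis, references):
    """Fraction of hypothesis tokens matched in the references.

    Each hypothesis token is credited at most as many times as it
    occurs in the single reference where it is most frequent.
    """
    hyp_counts = Counter(hypothesis.split())
    if not hyp_counts:
        return 0.0
    # For every token, keep its maximum count over all references.
    max_ref_counts = Counter()
    for ref in references:
        for tok, n in Counter(ref.split()).items():
            max_ref_counts[tok] = max(max_ref_counts[tok], n)
    matched = sum(min(n, max_ref_counts[tok]) for tok, n in hyp_counts.items())
    return matched / sum(hyp_counts.values())


hyp = "the cat sat on the mat"
ref_a = "the cat is on the mat"
ref_b = "a cat sat on a mat"
print(clipped_unigram_precision(hyp, [ref_a]))         # one reference
print(clipped_unigram_precision(hyp, [ref_a, ref_b]))  # two references
```

With only the first reference the word "sat" goes unmatched. Adding the second reference covers it, which is the kind of variation a second independent translation is meant to capture.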
The package is distributed under the Creative Commons Attribution 4.0 license.
The translation costs were covered by the Faculty of Arts at the University of Helsinki. We would like to thank Anna Missilä and Lingsoft for providing the translations and Maarit Koponen for selecting the Finnish documents to be translated.