A collection of links to Ukrainian language tools
- improved results for Finnish - Ukrainian
- LDC data sets
- New parallel data: MultiParaCrawl v9b and SUMMA including Ukrainian
- Added link to another list of Ukrainian NLP
- Updated translation models at Huggingface
- Bilingual pictograms in various languages
- big, base and tiny transformer models from OPUS-MT
- new translation tools with Ukrainian language support
- updated table of new OPUS-MT models with flores101 benchmark scores
- HelpUkraineBot - Latvia’s assistance to Ukraine
- OPUS-MT Telegram Translation Bot: https://t.me/opusmt_bot
- Ukrainian - Czech Telegram Translation Bot: http://t.me/uk_cs_translation_bot
- Ukrainian - Czech Messenger Translation Bot: https://m.me/uk.cs.translation.bot
- more about translation bots: https://github.com/martin-majlis/uk-cs-translation-bot
- Bergamot client-side browser-based translator
- UFAL translator for Czech - Ukrainian (project website)
- OPUS-MT web-interface
- Google translate
- Microsoft Bing translator
- AppTek translator
- Tilde translator
- Best OPUS-MT models for Ukrainian evaluated on the flores101 devtest benchmark
- Tiny Transformer OPUS-MT models for Ukrainian evaluated on the flores101 devtest benchmark
- Base Transformer OPUS-MT models for Ukrainian evaluated on the flores101 devtest benchmark
- Big Transformer OPUS-MT models for Ukrainian evaluated on the flores101 devtest benchmark
- Full list of OPUS-MT models
Computer-aided translation / interactive translation:
- OPUS-CAT: plugins for common translation tools, downloadable MT models from OPUS-MT
- OPUS-MT-app and translatelocally - speed-optimized local translation - now with models for Ukrainian; also available as client-side web browser app
Deployable MT server solutions, web apps, docker containers:
- OPUS-MT: basically all released OPUS-MT and Tatoeba MT models can be used
- OPUS-MT docker containers
Huggingface:
Example usage of Hugging Face models (Ukrainian - English):
```python
from transformers import pipeline

pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-zle-en")
print(pipe("Мене звати Вольфґанґ і я живу в Берліні."))
# [{'translation_text': 'My name is Wolfgang and I live in Berlin.'}]
```
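The example above translates into a single target language. Marian-style models that can produce several target languages expect a target-language token such as `>>ukr<<` at the start of each input sentence. The following is a minimal sketch of that step; the helper names and the model ID `Helsinki-NLP/opus-mt-tc-big-en-zle` are assumptions for illustration, so check the model card for the tokens a given model actually supports.

```python
def with_target_token(text: str, target_lang: str) -> str:
    """Prefix a sentence with a Marian-style target-language token, e.g. '>>ukr<<'."""
    return f">>{target_lang}<< {text.strip()}"


def translate_to_ukrainian(sentences, model_name="Helsinki-NLP/opus-mt-tc-big-en-zle"):
    """Translate English sentences to Ukrainian with a multi-target model.

    The default model name is an assumption; any multi-target OPUS-MT model
    that lists 'ukr' among its target tokens should be usable the same way.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("translation", model=model_name)
    inputs = [with_target_token(s, "ukr") for s in sentences]
    return [out["translation_text"] for out in pipe(inputs)]


if __name__ == "__main__":
    print(with_target_token("My name is Wolfgang and I live in Berlin.", "ukr"))
    # Uncomment to download the model and run the translation:
    # print(translate_to_ukrainian(["My name is Wolfgang and I live in Berlin."]))
```

Keeping the token handling in a separate helper makes it easy to switch the target language without touching the pipeline call.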
European Language Grid:
- OPUS, use OPUS-API to search for resources, for example all bitexts for English-Ukrainian
- MultiParaCrawl - Crawled parallel corpora including Ukrainian
- SUMMA - Ukrainian - English data
- Tatoeba MT Challenge data sets
- OPUS-MT-testsets, various sources
- Polish-Ukrainian Parallel Corpus at ELG
- Back-translated monolingual Wiki data
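The OPUS-API search mentioned in the list above can also be scripted. The sketch below builds a query for all English-Ukrainian bitexts and, optionally, fetches the result. The endpoint URL, the query parameter names, and the `corpora` key in the JSON response are assumptions based on the public OPUS-API and should be checked against its documentation; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the OPUS-API; verify against the OPUS documentation.
OPUS_API = "https://opus.nlpl.eu/opusapi/"


def opus_query_url(source: str, target: str,
                   preprocessing: str = "moses", version: str = "latest") -> str:
    """Build an OPUS-API query URL for one language pair."""
    params = {
        "source": source,
        "target": target,
        "preprocessing": preprocessing,
        "version": version,
    }
    return f"{OPUS_API}?{urlencode(params)}"


def list_bitexts(source: str, target: str) -> list:
    """Fetch the query result; assumes the response holds a 'corpora' list."""
    with urlopen(opus_query_url(source, target), timeout=30) as response:
        payload = json.load(response)
    return payload.get("corpora", [])


if __name__ == "__main__":
    print(opus_query_url("en", "uk"))
    # Uncomment to query the live service (requires network access):
    # for entry in list_bitexts("en", "uk"):
    #     print(entry)
```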
Tools to handle data:
- OPUS-tools - search and download OPUS data
- mt-data - download data sets from various sources
- OpusFilter - filter / cleanup data sets
- Vosk, an open-source realtime ASR system based on Kaldi including a Ukrainian model
- ASR for Ukrainian at ELG
- Links to ASR, TTS and speech resources for Ukrainian
- Models at huggingface
Resources from ELRA, may be provided free of charge. Please contact [email protected]:
- http://catalog.elra.info/en-us/repository/browse/ELRA-S0043/
- http://catalog.elra.info/en-us/repository/browse/ELRA-S0377/
- http://catalog.elra.info/en-us/repository/browse/ELRA-S0378/
- http://catalog.elra.info/en-us/repository/browse/ELRA-S0399/
- http://catalog.elra.info/en-us/repository/browse/ELRA-S0400/
- more info
Resources from ELG:
- pymorphy2 - morphological analyzer for Russian and Ukrainian