Stable release 3.1.1

Latest

tiberiu44 released this 09 Feb 06:45

This release includes support for tagging, parsing, tokenizing, sentence splitting, and lemmatizing raw text.

It was evaluated during the CoNLL Shared Task on Universal Dependencies Parsing and has pretrained language models for the entire UD corpus.

Features

Model store with pretrained models for selected languages
Training pipeline for building custom models
Supports multiple language models: transformer, fasttext, languasito, dummy (no embeddings)
Updated models with large improvements in the F-Score
Flavours: build a joint model from multiple treebanks at once, with language-code conditioning (increases performance in most cases)
