
Features

UER-py has the following features:

- **Reproducibility:** UER-py has been tested on many datasets and matches the performance of the original implementations of pre-training models such as BERT, GPT-2, ELMo, and T5.
- **Model modularity:** UER-py is divided into the following components: embedding, encoder, target embedding (optional), decoder (optional), and target. Many modules are implemented for each component, and clear, robust interfaces let users combine modules to construct pre-training models with as few restrictions as possible (see the sketch after this list).
- **Model training:** UER-py supports CPU mode, single-GPU mode, distributed training mode, and gigantic-model training with DeepSpeed.
- **Model zoo:** With the help of UER-py, we pre-train and release models with different properties. Choosing an appropriate pre-trained model is important for downstream task performance.
- **SOTA results:** UER-py supports comprehensive downstream tasks (e.g. classification and machine reading comprehension) and provides winning solutions to many NLP competitions.
- **Abundant functions:** UER-py provides abundant functions related to pre-training, such as feature extraction and text generation.
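
To make the modularity point concrete, below is a minimal sketch of the embedding → encoder → target composition, written against plain PyTorch. The class names and constructor arguments are illustrative assumptions for this sketch, not UER-py's actual API.

```python
import torch
import torch.nn as nn

class WordPosEmbedding(nn.Module):
    """Embedding component: token embedding plus learned position embedding."""
    def __init__(self, vocab_size, hidden_size, max_len=512):
        super().__init__()
        self.word = nn.Embedding(vocab_size, hidden_size)
        self.pos = nn.Embedding(max_len, hidden_size)

    def forward(self, ids):
        positions = torch.arange(ids.size(1), device=ids.device)
        return self.word(ids) + self.pos(positions)

class TransformerEncoder(nn.Module):
    """Encoder component: could be swapped for an RNN, CNN, etc."""
    def __init__(self, hidden_size, num_layers=2, num_heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(hidden_size, num_heads, batch_first=True)
        self.layers = nn.TransformerEncoder(layer, num_layers)

    def forward(self, emb):
        return self.layers(emb)

class MlmTarget(nn.Module):
    """Target component: maps hidden states back to vocabulary logits."""
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, hidden):
        return self.out(hidden)

class Model(nn.Module):
    """A pre-training model is the composition embedding -> encoder -> target."""
    def __init__(self, embedding, encoder, target):
        super().__init__()
        self.embedding = embedding
        self.encoder = encoder
        self.target = target

    def forward(self, ids):
        return self.target(self.encoder(self.embedding(ids)))

# Swapping any single component yields a different pre-training model.
model = Model(WordPosEmbedding(30000, 128), TransformerEncoder(128), MlmTarget(128, 30000))
logits = model(torch.randint(0, 30000, (2, 16)))  # shape: (batch, seq_len, vocab)
```

Because each component interacts with its neighbors only through tensor shapes, the encoder or target can be replaced independently of the rest of the model, which is the restriction-free combination the list above describes.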