# Tutorial 4: List of All Word Embeddings

This is not so much a tutorial, but rather a list of all embeddings that we currently support in Flair. Click on each embedding in the table below to get usage instructions. We assume that you're familiar with the base types of this library as well as standard word embeddings, in particular the StackedEmbeddings class.

## Overview

All word embedding classes inherit from the TokenEmbeddings class and implement the embed() method which you need to call to embed your text. This means that for most users of Flair, the complexity of different embeddings remains hidden behind this interface. Simply instantiate the embedding class you require and call embed() to embed your text.
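As a minimal sketch of this common interface, here is how you would instantiate one of the classes listed below (the classic GloVe `WordEmbeddings` are used here purely as an example) and embed a sentence with it:

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings

# instantiate any embedding class from the table below
glove_embedding = WordEmbeddings('glove')

# create a sentence and embed it
sentence = Sentence('The grass is green .')
glove_embedding.embed(sentence)

# each token now carries an embedding vector
for token in sentence:
    print(token, token.embedding.shape)
```

All other embedding classes are used in exactly the same way; only the constructor arguments differ.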

The following word embeddings are currently supported:

| Class | Type | Paper |
| --- | --- | --- |
| `BytePairEmbeddings` | Subword-level word embeddings | Heinzerling and Strube (2018) |
| `CharacterEmbeddings` | Task-trained character-level embeddings of words | Lample et al. (2016) |
| `ELMoEmbeddings` | Contextualized word-level embeddings | Peters et al. (2018) |
| `FastTextEmbeddings` | Word embeddings with subword features | Bojanowski et al. (2017) |
| `FlairEmbeddings` | Contextualized character-level embeddings | Akbik et al. (2018) |
| `OneHotEmbeddings` | Standard one-hot embeddings of text or tags | - |
| `PooledFlairEmbeddings` | Pooled variant of `FlairEmbeddings` | Akbik et al. (2019) |
| `TransformerWordEmbeddings` | Embeddings from pretrained transformers (BERT, XLM, GPT, RoBERTa, XLNet, DistilBERT etc.) | Devlin et al. (2018), Radford et al. (2018), Liu et al. (2019), Dai et al. (2019), Yang et al. (2019), Lample and Conneau (2019) |
| `WordEmbeddings` | Classic word embeddings | - |

## Combining BERT and Flair

You can very easily mix and match Flair, ELMo, BERT and classic word embeddings. All you need to do is instantiate each embedding you wish to combine and use them in a StackedEmbeddings.

For instance, let's say we want to combine the multilingual Flair and BERT embeddings to train a hyper-powerful multilingual downstream task model. First, instantiate the embeddings you wish to combine:

```python
from flair.embeddings import FlairEmbeddings, TransformerWordEmbeddings

# init Flair embeddings
flair_forward_embedding = FlairEmbeddings('multi-forward')
flair_backward_embedding = FlairEmbeddings('multi-backward')

# init multilingual BERT
bert_embedding = TransformerWordEmbeddings('bert-base-multilingual-cased')
```

Now instantiate the StackedEmbeddings class and pass it a list containing these three embeddings.

```python
from flair.embeddings import StackedEmbeddings

# now create the StackedEmbeddings object that combines all embeddings
stacked_embeddings = StackedEmbeddings(
    embeddings=[flair_forward_embedding, flair_backward_embedding, bert_embedding])
```

That's it! Now just use this embedding like all the other embeddings, i.e. call the embed() method over your sentences.

```python
from flair.data import Sentence

sentence = Sentence('The grass is green .')

# just embed a sentence using the StackedEmbeddings as you would with any single embedding.
stacked_embeddings.embed(sentence)

# now check out the embedded tokens.
for token in sentence:
    print(token)
    print(token.embedding)
```

Words are now embedded using a concatenation of three different embeddings. This means that the resulting embedding vector is still a single PyTorch vector.
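If you want to convince yourself that the stack really produces one concatenated vector, a small check like the following (reusing the objects from the snippets above) should show that each token's vector length equals the embedding length reported by the stack:

```python
# the stacked embedding length is the sum of the lengths of its parts
print(stacked_embeddings.embedding_length)

# each token's vector has exactly that many dimensions
for token in sentence:
    assert token.embedding.shape[0] == stacked_embeddings.embedding_length
```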

## Next

You can now either look into document embeddings to embed entire text passages with a single vector for tasks such as text classification, or go directly to the tutorial about loading your corpus, which is a prerequisite for training your own models.