Release 0.9 · flairNLP/flair

With release 0.9 we are refactoring Flair for simplicity and speed, to make Flair faster and more easily scale to new NLP tasks. The first new tasks included in this release are Relation Extraction (RE), support for GLUE benchmark tasks and Entity Linking - all in beta for early adopters! We're working towards a Flair 1.0 release that will span the whole suite of standard NLP tasks. Also included is a new approach for Zero-Shot Sequence Labeling based on TARS! This release also includes a wealth of new datasets for all these tasks and tons of other new features and bug fixes.

Zero-Shot Sequence Labeling with TARS (#2260)

We extend the TARS zero-shot learning approach to sequence labeling and ship a pre-trained model for English NER. Try defining some classes and see if the model can find them:

from flair.data import Sentence
from flair.models import TARSTagger

# 1. Load zero-shot NER tagger
tars = TARSTagger.load('tars-ner')

# 2. Prepare some test sentences
sentences = [
    Sentence("The Humboldt University of Berlin is situated near the Spree in Berlin, Germany"),
    Sentence("Bayern Munich played against Real Madrid"),
    Sentence("I flew with an Airbus A380 to Peru to pick up my Porsche Cayenne"),
    Sentence("Game of Thrones is my favorite series"),
]

# 3. Define some classes of named entities such as "soccer teams", "TV shows" and "rivers"
labels = ["Soccer Team", "University", "Vehicle", "River", "City", "Country", "Person", "Movie", "TV Show"]
tars.add_and_switch_to_new_task('task 1', labels, label_type='ner')

# 4. Predict for these classes and print results
for sentence in sentences:
    tars.predict(sentence)
    print(sentence.to_tagged_string("ner"))

This should print:

The Humboldt <B-University> University <I-University> of <I-University> Berlin <E-University> is situated near the Spree <S-River> in Berlin <S-City> , Germany <S-Country>

Bayern <B-Soccer Team> Munich <E-Soccer Team> played against Real <B-Soccer Team> Madrid <E-Soccer Team>

I flew with an Airbus <B-Vehicle> A380 <E-Vehicle> to Peru <S-City> to pick up my Porsche <B-Vehicle> Cayenne <E-Vehicle>

Game <B-TV Show> of <I-TV Show> Thrones <E-TV Show> is my favorite series

So in these examples, we are finding entity classes such as "TV show" (Game of Thrones), "vehicle" (Airbus A380 and Porsche Cayenne), "soccer team" (Bayern Munich and Real Madrid) and "river" (Spree), even though the model was never explicitly trained for this. Note that this is ongoing research and the examples are a bit cherry-picked. We expect the zero-shot model to improve quite a bit by the next release.
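The tagged output above marks each token with a B-, I-, E- or S- prefix (begin, inside, end, single). As a minimal sketch, assuming you have already collected (token, tag) pairs yourself, the helper below collapses such a sequence into entity spans. The function name `bioes_to_spans` is hypothetical and not part of Flair; it only illustrates how to read the tagging scheme.

```python
from typing import List, Optional, Tuple


def bioes_to_spans(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Collapse (token, tag) pairs in BIOES format into (entity text, label) spans.

    Hypothetical helper for illustration only, not a Flair function.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None
    for token, tag in tagged:
        if tag == "O":
            # Outside any entity: drop whatever was being collected.
            current_tokens, current_label = [], None
            continue
        # Split only on the first hyphen so labels like "Soccer Team" stay intact.
        prefix, label = tag.split("-", 1)
        if prefix == "S":
            spans.append((token, label))
            current_tokens, current_label = [], None
        elif prefix == "B":
            current_tokens, current_label = [token], label
        elif prefix in ("I", "E") and current_label == label:
            current_tokens.append(token)
            if prefix == "E":
                spans.append((" ".join(current_tokens), label))
                current_tokens, current_label = [], None
        # I-/E- tags that do not continue an open span are silently skipped.
    return spans


tagged = [
    ("Bayern", "B-Soccer Team"), ("Munich", "E-Soccer Team"),
    ("played", "O"), ("against", "O"),
    ("Real", "B-Soccer Team"), ("Madrid", "E-Soccer Team"),
]
print(bioes_to_spans(tagged))
# [('Bayern Munich', 'Soccer Team'), ('Real Madrid', 'Soccer Team')]
```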

New NLP Tasks and Datasets

This release adds prototype support for new tasks: the GLUE benchmark, Relation Extraction and Entity Linking. We ship the datasets and model classes you need to train your own models. However, we are still tweaking these methods, so we don't ship any pre-trained models yet.

GLUE Benchmark (#2149 #2363)

A standard benchmark to evaluate progress in language understanding, mostly consisting of single and pairwise sentence classification tasks.

New datasets in Flair:

'GLUE_COLA' - The Corpus of Linguistic Acceptability from the GLUE benchmark
'GLUE_MNLI' - The Multi-Genre Natural Language Inference Corpus from the GLUE benchmark
'GLUE_RTE' - The RTE task from the GLUE benchmark
'GLUE_QNLI' - The Stanford Question Answering Dataset formatted as NLI task from the GLUE benchmark
'GLUE_WNLI' - The Winograd Schema Challenge formatted as NLI task from the GLUE benchmark
'GLUE_MRPC' - The MRPC task from the GLUE benchmark
'GLUE_QQP' - The Quora Question Pairs dataset where the task is to determine whether a pair of questions are semantically equivalent

Initialize datasets like so:

from flair.datasets import GLUE_QNLI

# load corpus
corpus = GLUE_QNLI()

# print corpus
print(corpus)

# print first sentence-pair of training data split
print(corpus.train[0])

# print all labels in corpus
print(corpus.make_label_dictionary("entailment"))

Relation Extraction (#2333 #2352)

Relation extraction classifies if and which relationship holds between two entities in a text.

Model class: RelationExtractor

Datasets in Flair:

'RE_ENGLISH_CONLL04' - the CoNLL-04 Relation Extraction dataset (#2333)
'RE_ENGLISH_SEMEVAL2010' - the SemEval-2010 Task 8 dataset on Multi-Way Classification of Semantic Relations Between Pairs of Nominals (#2333)
'RE_ENGLISH_TACRED' - the TAC Relation Extraction Dataset (https://nlp.stanford.edu/projects/tacred/) with 41 relations (download required) (#2333)
'RE_ENGLISH_DRUGPROT' - the DrugProt corpus from Biocreative VII Track 1 on drug and chemical-protein interactions (#2340 #2352)

Initialize datasets like so:

from flair.datasets import RE_ENGLISH_CONLL04

# initialize CoNLL 04 corpus for Relation extraction
corpus = RE_ENGLISH_CONLL04()
print(corpus)

# print first sentence of training split with annotations
sentence = corpus.train[0]
print(sentence)

# print label dictionary
label_dict = corpus.make_label_dictionary("relation")
print(label_dict)

Entity Linking (#2375)

Entity Linking goes one step further than NER and uniquely links entities to knowledge bases such as Wikipedia.

Model class: EntityLinker

Datasets in Flair:

'NEL_ENGLISH_AIDA' - the AIDA CoNLL-YAGO Entity Linking corpus on the CoNLL-03 dataset for English
'NEL_ENGLISH_AQUAINT' - the Aquaint Entity Linking corpus introduced in Milne and Witten (2008)
'NEL_ENGLISH_IITB' - the IITB Entity Linking corpus introduced in Sayali et al. (2009)
'NEL_ENGLISH_REDDIT' - the Reddit Entity Linking corpus introduced in Botzer et al. (2021) (only gold annotations)
'NEL_ENGLISH_TWEEKI' - the Tweeki Entity Linking corpus introduced in Harandizadeh and Singh (2020)
'NEL_GERMAN_HIPE' - the HIPE Entity Linking corpus for historical German as a sentence-segmented version

from flair.datasets import NEL_ENGLISH_REDDIT

# load corpus
corpus = NEL_ENGLISH_REDDIT()

# print corpus
print(corpus)

# print a sentence of training data split
print(corpus.train[3])

New NER Datasets

'NER_ARABIC_ANER' - Arabic Named Entity Recognition Corpus 4-class NER (#2188)
'NER_ARABIC_AQMAR' - American and Qatari Modeling of Arabic 4-class NER (modified) (#2188)
'NER_ENGLISH_PERSON' - NER for person names (#2271)
'NER_ENGLISH_WEBPAGES' - 4-class NER on web pages from Ratinov and Roth (2009) (#2232)
'NER_GERMAN_POLITICS' - NEMGP corpus for German politics (#2341)
'NER_JAPANESE' - Japanese NER dataset automatically generated from Wikipedia (#2154)
'NER_MASAKHANE' - MasakhaNER: Named Entity Recognition for African Languages corpora (#2212, #2214, #2227, #2229, #2230, #2231, #2222, #2234, #2242, #2243)

Other datasets

'YAHOO_ANSWERS' - The 10 largest main categories from Yahoo! Answers (#2198)
Various Universal Dependencies datasets (#2211, #2216, #2219, #2221, #2244, #2245, #2246, #2247, #2223, #2248, #2235, #2236, #2239, #2226)

New Functionality

Support for Arabic NER (#2188)

Flair now supports NER and POS tagging for Arabic. To tag an Arabic sentence, just load the appropriate model:

from flair.data import Sentence
from flair.models import SequenceTagger

# load model
tagger = SequenceTagger.load('ar-ner')

# make Arabic sentence
sentence = Sentence("احب برلين")

# predict NER tags
tagger.predict(sentence)

# print sentence with predicted tags
for entity in sentence.get_labels('ner'):
    print(entity)

This should print:

LOC [برلين (2)] (0.9803)

More flexibility on main metric (#2161)

When training models, you can now choose any standard evaluation metric for model selection (previously it was fixed to micro F1). When calling the trainer, simply pass the desired metric as main_evaluation_metric like so:

trainer.train('resources/taggers/your_model',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=10,
              main_evaluation_metric=("macro avg", 'f1-score'),
              )

In this example, we now use macro F1 instead of the default micro F1.
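To see why the choice of metric matters, here is a small worked example with made-up counts (not taken from any Flair run). Micro F1 pools true positives, false positives and false negatives over all classes before computing F1, while macro F1 averages the per-class F1 scores, so a rare class that is predicted poorly pulls macro F1 down much more. The function names are hypothetical and this is not how Flair computes its scores internally.

```python
from typing import Dict, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float]:
    """counts maps class name -> (true positives, false positives, false negatives)."""
    # Micro: sum the counts over all classes, then compute a single F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Made-up counts: a frequent class tagged well and a rare class tagged poorly.
counts = {"PER": (90, 10, 10), "LOC": (1, 4, 4)}
micro, macro = micro_macro_f1(counts)
print(round(micro, 3), round(macro, 3))
# 0.867 0.55
```

With these counts the frequent class dominates micro F1, whereas macro F1 weights both classes equally.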

Add handling for mapping labels to 'O' (#2254)

In ColumnDataset, labels can be remapped to other labels. But sometimes you may not wish to use all label types in a given dataset.
You can now remap them to 'O' and so exclude them.

For instance, to load CoNLL-03 without MISC, do:

from flair.datasets import CONLL_03

corpus = CONLL_03(
    label_name_map={'MISC': 'O'}
)
print(corpus.make_label_dictionary('ner'))
print(corpus.train[0].to_tagged_string('ner'))
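The effect of such a mapping on a single tag sequence can be sketched in plain Python. This is only an illustration under the assumption that tags carry a prefix such as B- or I- in front of the label name; `remap_tags` is a hypothetical helper, not the code Flair runs inside ColumnDataset.

```python
from typing import Dict, List


def remap_tags(tags: List[str], label_name_map: Dict[str, str]) -> List[str]:
    """Apply a label map to prefixed tags; labels mapped to 'O' lose their prefix."""
    remapped = []
    for tag in tags:
        if "-" not in tag:
            # Unprefixed tags (including 'O') are looked up directly.
            remapped.append(label_name_map.get(tag, tag))
            continue
        prefix, label = tag.split("-", 1)
        new_label = label_name_map.get(label, label)
        # A label mapped to 'O' is excluded entirely, so no prefix is kept.
        remapped.append("O" if new_label == "O" else f"{prefix}-{new_label}")
    return remapped


tags = ["B-ORG", "O", "B-MISC", "I-MISC", "O", "B-PER"]
print(remap_tags(tags, {"MISC": "O"}))
# ['B-ORG', 'O', 'O', 'O', 'O', 'B-PER']
```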

Other

add per-label thresholds for prediction (#2366)
add support for Spanish clinical Flair embeddings (#2323)
added 'mean', 'max' pooling strategy for TransformerDocumentEmbeddings class (#2180)
new DocumentCNNEmbeddings class to embed text with a trainable CNN (#2141)
allow negative examples in ClassificationCorpus (#2233)
added new parameter to save model each k epochs during training (#2146)
log epoch of best model instead of printing it during training (#2286)
add option to exclude specific sentences from dataset (#2262)
improved tensorboard logging (#2164)
return predictions during evaluation (#2162)

Internal Refactorings

Refactor for simplicity and extensibility (#2333 #2351 #2356 #2377 #2379 #2382 #2184)

In order to accommodate all these new NLP task types (plus many more in the pipeline), we restructure the flair.nn.Model class such that most models now inherit from DefaultClassifier. This removes many redundancies as most models do classification and are really only different in what they classify and how they apply embeddings. Models that inherit from DefaultClassifier need only implement the method forward_pass, making each model class only a few lines of code.

Check for instance our implementation of the RelationExtractor class to see how easy it now is to add a new task!
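The general idea of this refactoring can be pictured with a toy example. The classes below are hypothetical stand-ins written for illustration; they do not reproduce Flair's DefaultClassifier or its method signatures. The sketch only shows the pattern: shared prediction logic sits in a base class, and a task-specific subclass supplies nothing but a forward_pass method.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ToyDefaultClassifier(ABC):
    """Hypothetical base class holding the logic shared by all classifiers."""

    def __init__(self, labels: List[str]):
        self.labels = labels

    @abstractmethod
    def forward_pass(self, texts: List[str]) -> List[List[float]]:
        """Return one score per label for each input text."""

    def predict(self, texts: List[str]) -> List[str]:
        # Shared logic: pick the best-scoring label for every input.
        scores = self.forward_pass(texts)
        return [self.labels[row.index(max(row))] for row in scores]


class ToyKeywordClassifier(ToyDefaultClassifier):
    """Task-specific subclass: only forward_pass is implemented."""

    def __init__(self, keywords: Dict[str, List[str]]):
        super().__init__(list(keywords))
        self.keywords = keywords

    def forward_pass(self, texts: List[str]) -> List[List[float]]:
        # Score each label by counting how many of its keywords occur in the text.
        return [
            [float(sum(word in text.lower() for word in self.keywords[label]))
             for label in self.labels]
            for text in texts
        ]


clf = ToyKeywordClassifier({"sports": ["match", "goal"], "travel": ["flight", "hotel"]})
print(clf.predict(["The match ended with a late goal", "Our flight was delayed"]))
# ['sports', 'travel']
```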

Refactor for speed

Flair models trained with transformers (such as the FLERT models) were previously not making use of mini-batching, greatly slowing down training and application of such models. We refactored the TransformerWordEmbeddings class, yielding significant speed-ups depending on the mini-batch size used. We observed speed-ups from 2x to 6x. (#2385 #2389 #2384)
Improve training speed of Flair embeddings (#2203)
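One way to picture the mini-batching mentioned above: instead of one model call per sentence, sentences are grouped and each group is processed in a single call. The sketch below only counts how many calls are needed for a given batch size. It is not the TransformerWordEmbeddings implementation, `mini_batches` is a hypothetical helper, and the actual speed-up depends on hardware and sentence lengths.

```python
from typing import Iterator, List, Sequence


def mini_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


sentences = [f"sentence {i}" for i in range(10)]
for size in (1, 4):
    batches = list(mini_batches(sentences, size))
    # Each batch corresponds to one model call in this simplified picture.
    print(size, len(batches))
# 1 10
# 4 3
```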

Bug fixes & improvements

fixed references to multi-X-fast Flair embedding models (#2150)
fixed serialization of DocumentRNNEmbeddings (#2155)
fixed separator in cross-attention mode (#2156)
fixed ID for Slovene word embeddings in the doc (#2166)
close log_handler after training is complete. (#2170)
fixed bug in IMDB dataset (#2172)
fixed IMDB data splitting logic (#2175)
fixed XLNet and Transformer-XL Execution (#2191)
remove unk token from Ner labeling (#2225)
fixed typo in property name (#2267)
fixed typos (#2303 #2373)
fixed parallel corpus (#2306)
fixed SegtokSentenceSplitter Incorrect Sentence Position Attributes (#2312)
fixed loading of old serialized models (#2322)
updated url for BioSemantics corpus (#2327)
updated requirements (#2346)
serialize multi_label_threshold for classification models (#2368)
small refactorings in ModelTrainer (#2184)
moving Path construction of flair.cache_root (#2241)
documentation improvement (#2304)
add model fit tests (#2378)
