PyTorch implementation of the independent model with BERT and SpanBERT proposed in BERT for Coreference Resolution: Baselines and Analysis and SpanBERT: Improving Pre-training by Representing and Predicting Spans.
This implementation contains additional scripts and configurations I used in the context of my master's thesis. That includes the use of other pre-trained language models, training the model on various German datasets, and improving its performance for low-resource languages by leveraging transfer learning. For the vanilla versions of the fundamental coreference resolution models, see the Repository Overview.
This implementation is based upon the original implementation by the papers' authors. The model and misc packages are written from scratch, whereas some scripts in the setup package and the entire eval package are borrowed with almost no changes from the original implementation. The optimization of mention pruning is inspired by Ethan Yang.
This repository contains three additional branches with different PyTorch models. These are reimplementations of models originally implemented in TensorFlow. The following table gives an overview of the branches, the corresponding papers, and the original implementations.
This project was written with Python 3.8.5 and PyTorch 1.7.1. For installation details regarding PyTorch, please visit the official website. Further requirements are listed in requirements.txt and can be installed via pip: pip install -r requirements.txt
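For example, setting up an isolated environment first (a minimal sketch; the environment name .venv is arbitrary):

```bash
# create and activate a Python 3.8 virtual environment, then install the requirements
python3.8 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```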
To obtain all the data needed for training, evaluation, and inference in English, run setup.sh with the path to the OntoNotes 5.0 folder (often named ontonotes-release-5.0), e.g.:

$ ./setup.sh /path/to/ontonotes-release-5.0

Hint: Run setup.sh in an environment with Python 2.7 so that the CoNLL-2012 scripts are executed by the correct interpreter.
Run python setup/bert_tokenize.py -c <conf> to tokenize and segment the data before training, evaluating or testing that specific configuration. See coref.conf for the available configurations.
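For example, to prepare the data for the bert-base configuration:

```bash
# tokenize and segment the data for the bert-base configuration from coref.conf
python setup/bert_tokenize.py -c bert-base
```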
The misc folder contains scripts to convert the German datasets TüBa-D/Z v10/11, SemEval-2010 and DIRNDL into the required CoNLL-2012 format. For SemEval-2010, make sure to remove the singletons in order to get comparable results. Use minimize.py and bert_tokenize.py to obtain the final files to train with.
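An end-to-end preparation could then look like the following sketch. The converter script name and all paths are hypothetical and must be adapted to the actual scripts in misc:

```bash
# hypothetical pipeline: convert the raw corpus to CoNLL-2012, build the
# jsonlines input, then tokenize/segment it for the multilingual configuration
python misc/convert_tueba.py /path/to/tueba-dz data/german.conll   # assumed converter name
python setup/minimize.py data/german.conll                         # arguments assumed
python setup/bert_tokenize.py -c bert-multilingual-base
```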
Run the training with python train.py -c <config> -f <folder> --cpu --amp --check --split. Select one of the four available configurations (bert-base, bert-large, spanbert-base, spanbert-large) with conf. The parameter folder names the folder the snapshots taken during training are saved to. If the given folder already exists and contains at least one snapshot, the training is resumed from the latest snapshot. The optional flags cpu and amp can be set to train exclusively on the CPU or to use automatic mixed precision training. Gradient checkpointing can be enabled with the option check to further reduce the GPU memory usage, or the model can even be split onto two GPUs with split.
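For example, to train the spanbert-large configuration with mixed precision and gradient checkpointing (the folder name is arbitrary; snapshots are assumed to end up under data/ckpt, as used by the evaluation):

```bash
# train spanbert-large with AMP and gradient checkpointing;
# rerunning the same command resumes from the latest snapshot in the folder
python train.py -c spanbert-large -f spanbert-large --amp --check
```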
To fine-tune multilingual models trained on the OntoNotes 5.0 dataset on the German datasets, adapt the configuration in coref.conf and place the latest snapshot into the folder the fine-tuned model should write its snapshots to. Then start the training as described above.
For easier parameter tuning, use train_fine.py and write a shell script that programmatically passes the learning rates and epochs into the training, as sketched below.
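A minimal sweep script could look like this. The --lr and --epochs flags are assumptions about the interface of train_fine.py and have to be checked against the script:

```bash
#!/bin/bash
# hypothetical grid search over learning rates and epochs for fine-tuning;
# the --lr and --epochs flags are assumed, not taken from the repository
for lr in 1e-5 2e-5 3e-5; do
  for epochs in 1 2 3; do
    python train_fine.py -c bert-multilingual-base -f "fine_lr${lr}_ep${epochs}" \
      --lr "$lr" --epochs "$epochs"
  done
done
```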
To redo the adversarial training described in the thesis, run python train_adv.py. The only configuration set up for this training is bert-multilingual-base. Make sure to have created the adv_data_file in addition to the English data before training.
To evaluate the trained model on German, comment in the desired dataset in coref.conf. To validate whether the training brought the English and German embeddings closer together as desired, use the analyze_emb_similarity.py script.
Run the evaluation with python evaluate.py -c <conf> -p <pattern> --cpu --amp --split. All snapshots in the data/ckpt folder that match the given pattern are evaluated. This works with simple snapshots (pt) as well as with snapshots with additional metadata (pt.tar). See Training for details on the remaining parameters.
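For example, to evaluate every spanbert-large snapshot that carries additional metadata (the pattern is assumed to be a glob matched inside data/ckpt; quote it so the shell does not expand it):

```bash
# evaluate all matching .pt.tar snapshots in data/ckpt
python evaluate.py -c spanbert-large -p "spanbert-large*.pt.tar"
```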
To evaluate predictions dumped earlier during training or evaluation, use the eval_dumped_preds.py script.