A list of Transformer papers
- Attention Is All You Need, NIPS 2017 (paper): the original Transformer paper
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context, ACL 2019 (paper) (pytorch & tensorflow code): segment-level recurrence and relative position encoding
- COMET: Commonsense Transformers for Automatic Knowledge Graph Construction, ACL 2019 (paper) (pytorch code): generates knowledge graph tuples
- Adaptive Attention Span in Transformers, ACL 2019 (paper) (pytorch code): adaptive attention span
- XLNet: Generalized Autoregressive Pretraining for Language Understanding, arXiv 2019 (paper) (tensorflow code): permutation language modeling
- Syntactically Supervised Transformers for Faster Neural Machine Translation, ACL 2019 (paper) (pytorch code): non-autoregressive decoding
- Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction, ACL 2019 (paper) (pytorch code): relation extraction
- Learning Deep Transformer Models for Machine Translation, ACL 2019 (paper) (pytorch code): pre-norm residual connections for training deep encoders
- Large Batch Optimization for Deep Learning: Training BERT in 76 Minutes, arXiv 2019 (paper): large-batch distributed training with the LAMB optimizer
- Universal Transformers, ICLR 2019 (paper) (tensorflow code): recurrent Transformer blocks
- Lattice Transformer for Speech Translation, ACL 2019 (paper): attention over lattice inputs (directed acyclic graphs, DAGs)
- ERNIE: Enhanced Language Representation with Informative Entities, ACL 2019 (paper) (code): incorporates knowledge graph entities into language representations
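All of the papers above build on the scaled dot-product attention introduced in "Attention Is All You Need". A minimal NumPy sketch of that core operation (illustrative only: a single head, no masking and no learned projections):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q: (seq_len_q, d_k), K: (seq_len_k, d_k), V: (seq_len_k, d_v).
    Returns attended values of shape (seq_len_q, d_v).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # (seq_len_q, seq_len_k)
    # numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# toy example: 3 query positions, 4 key/value positions, d_k = d_v = 8
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with the key dimension, which would otherwise push the softmax into regions with vanishing gradients; the entries above mostly vary what the attention is computed over (segments, lattices, adaptive spans) rather than this core formula.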