This Git repository accompanies the UKP seminar on Deep Learning for Natural Language Processing held at the University of Duisburg-Essen.
In contrast to other seminars, this seminar focuses on the practical use of deep learning methods. As programming infrastructure we use Python in combination with Theano and Keras. The published code uses Python 2.7, Theano 0.8.2 and Keras 1.1.1. Make sure you have the frameworks installed in exactly these versions (note: they change quickly).
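One way to check the installed versions against the ones listed above is sketched below. This is a minimal sketch and not part of the published code: the helper names `installed_version` and `find_mismatches` are our own, and we assume each package exposes its version as `__version__`.

```python
import importlib

# Versions named in this README; adjust if the repository is updated.
EXPECTED = {"theano": "0.8.2", "keras": "1.1.1"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Covers a missing package as well as a package that fails on import.
        return None
    return getattr(module, "__version__", None)


def find_mismatches(expected, lookup=installed_version):
    """Return {name: (expected, found)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(EXPECTED)` returns an empty dictionary if both frameworks match. The `lookup` argument only exists so the function can be tried out without the frameworks installed.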
This seminar is structured into 4 sessions:
- Feed-Forward Networks for Sequence Classification (e.g. POS, NER, Chunking)
- Convolutional Neural Network for Sentence / Text Classification (e.g. sentiment classification)
- Convolutional Neural Network for Relation Extraction (e.g. semantic relation extraction)
- Long Short-Term Memory (LSTM) Networks for Sequence Classification
The seminar follows an engineering mindset: the beautiful math and the complexity of the topic are sometimes set aside in favor of an easy-to-understand and easy-to-use approach to deep learning for NLP tasks (we use what works without providing a full background on every aspect).
At the end of the seminar you should understand the most important aspects of deep learning for NLP and be able to program and train your own deep neural networks.
In case of questions, feel free to approach Nils Reimers.
The following is a short list of good introductions to different aspects of deep learning.
- 2009, Yoshua Bengio, Learning Deep Architectures for AI
- 2013, Richard Socher and Christopher Manning, Deep Learning for Natural Language Processing (slides and recording from NAACL 2013)
- 2015, Yoshua Bengio et al., Deep Learning - MIT Press book in preparation
- CS224d: Deep Learning for Natural Language Processing
- 2016 videos
- 2015, Yoav Goldberg, A Primer on Neural Network Models for Natural Language Processing
Slides: pdf
The first theory lesson covers the fundamentals of deep learning.
Slides: pdf
The second lesson gives an overview of deep learning frameworks. Hint: Use Keras and have a look at Theano and TensorFlow.
Slides: pdf
Code: See folder Session 1 - SENNA
The first code session is about the SENNA architecture (Collobert et al., 2011, NLP (almost) from scratch). In the folder you can find Python code for the preprocessing as well as Keras code to train and evaluate a deep learning model. The folder contains an example for Part-of-Speech tagging, which requires the English word embeddings from Levy et al.
The folder also contains an example for German NER, based on the GermEval 2014 dataset. To run the German NER code, you need the German word embeddings from our website.
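To illustrate the kind of preprocessing a SENNA-style tagger needs, here is a minimal sketch of the window approach from Collobert et al.: each token is classified from a fixed-size window of its neighbours. The padding and unknown-word tokens, the window size and the function names are assumptions for illustration and are not taken from the code in the folder.

```python
PADDING = "PADDING"  # hypothetical token for positions outside the sentence
UNKNOWN = "UNKNOWN"  # hypothetical token for words without an embedding


def token_windows(tokens, window_size=2):
    """Return one context window per token, with window_size neighbours on each side."""
    padded = [PADDING] * window_size + list(tokens) + [PADDING] * window_size
    width = 2 * window_size + 1
    return [padded[i:i + width] for i in range(len(tokens))]


def windows_to_indices(windows, word2idx):
    """Map each word to its embedding row, falling back to the UNKNOWN row."""
    unknown_idx = word2idx[UNKNOWN]
    return [[word2idx.get(word, unknown_idx) for word in window]
            for window in windows]
```

Each index window becomes one training example for the feed-forward network, and the label is the tag of the token in the middle of the window.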
Recommended Readings:
Slides: pdf
This is an introduction to Convolutional Neural Networks.
Recommended Readings:
Slides: pdf
Code: See folder Session 2 - Sentence CNN
This is a Keras implementation of Kim, 2014, Convolutional Neural Networks for Sentence Classification. We use the same preprocessing that Kim provides in his GitHub repository and implement the rest in Keras.
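The core operation of the Kim architecture, a filter sliding over word windows followed by max-over-time pooling, can be sketched in plain Python. This is an illustration of the idea only, not the Keras code from the folder. The function name and the toy vectors are made up, and a real model applies many filters of several widths in parallel.

```python
def conv_max_over_time(sentence, filt, bias=0.0):
    """Apply one convolution filter to a sentence and max-pool over all positions.

    sentence: list of word vectors (lists of numbers), one per token
    filt:     list of h weight vectors, one per position in the filter window
    """
    h = len(filt)
    if len(sentence) < h:
        raise ValueError("sentence is shorter than the filter; pad it first")
    features = []
    for start in range(len(sentence) - h + 1):
        total = bias
        for offset in range(h):
            word_vector = sentence[start + offset]
            for x, w in zip(word_vector, filt[offset]):
                total += x * w
        features.append(max(0.0, total))  # ReLU non-linearity
    return max(features)  # max-over-time pooling: keep the strongest response


# Toy example: three tokens with 2-dimensional embeddings, one filter of width 2.
sentence = [[1, 0], [0, 1], [2, 2]]
filt = [[1, 1], [1, -1]]
strongest = conv_max_over_time(sentence, filt)
```

Because of the pooling step, every filter yields a single number regardless of sentence length, so sentences of different lengths map to a fixed-size feature vector.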
Slides: pdf
Code: See folder Session 3 - Relation CNN
This is an implementation for relation extraction. We use the SemEval 2010 - Task 8 dataset on semantic relations. We model the task as a pairwise classification task.
Recommended Readings:
- Zeng et al., 2014, Relation Classification via Convolutional Deep Neural Network
- dos Santos et al., 2015, Classifying Relations by Ranking with Convolutional Neural Networks
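Both recommended papers feed the network, in addition to the words, the relative position of each token to the two entities. Below is a minimal sketch of such position features. The function name, the clipping distance and the shift to non-negative indices are assumptions for illustration; the code in the folder may compute them differently.

```python
def relative_positions(num_tokens, entity_index, max_distance=30):
    """Return, for every token, its clipped distance to the entity as a non-negative index."""
    positions = []
    for i in range(num_tokens):
        distance = i - entity_index
        # Clip far-away tokens so the position vocabulary stays small.
        distance = max(-max_distance, min(max_distance, distance))
        # Shift by max_distance so the value can index an embedding matrix.
        positions.append(distance + max_distance)
    return positions
```

For a sentence with two marked entities, the function is called once per entity, which gives two position sequences that are embedded and concatenated with the word embeddings.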
Slides: pdf
Code: See folder Session 4 - LSTM Sequence Classification
LSTMs are powerful models and became very popular in 2015 / 2016.
Recommended Readings:
Slides: pdf
The folder contains a Keras implementation to perform sequence classification using LSTMs. We use the GermEval 2014 dataset for German NER, but you can easily adapt the code to any other sequence classification problem (POS, NER, Chunking etc.). Check the slides for more information.
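To make clear what an LSTM layer computes for every token, here is a minimal sketch of the standard LSTM update in plain Python. It is not the Keras code from the folder. The parameter layout (one weight row per hidden unit, applied to the concatenation of input and previous hidden state) and all names are assumptions chosen for readability.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, vector):
    return sum(w * v for w, v in zip(weights, vector))


def lstm_step(x, h_prev, c_prev, params):
    """Compute one LSTM time step and return the new hidden and cell state.

    params maps a gate name to (weight rows, biases); every weight row is
    multiplied with the concatenation [x, h_prev].
    """
    z = list(x) + list(h_prev)

    def gate(name, activation):
        rows, biases = params[name]
        return [activation(dot(row, z) + b) for row, b in zip(rows, biases)]

    i = gate("input", sigmoid)        # how much new information to write
    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # proposed new cell content
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


def lstm_sequence(inputs, hidden_size, params):
    """Run the LSTM over a token sequence and return one hidden state per token."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    outputs = []
    for x in inputs:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs
```

For sequence classification, each per-token hidden state is passed to a softmax layer that predicts the tag of that token.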