Some work done at ZJU during my internship, mostly about neural network models built with the TensorFlow framework.
The papers I read during this period are also recorded in this blog.
Python: 2.7 (partial support for Python 3.6)
TensorFlow: 1.0.1
A sequence generated under some rules.
Predict the next element of the sequence.
A traditional RNN model.
深度学习(07)_RNN-循环神经网络-02-Tensorflow中的实现
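As a rough illustration of this setup, here is a minimal sketch of my own (not the repository code), assuming TensorFlow 1.x APIs and a toy sine-wave sequence in place of the actual generated data: a plain RNN reads a window of the sequence and a linear readout predicts the next value.

```python
import numpy as np
import tensorflow as tf

T, input_dim, hidden_dim = 20, 1, 32

x = tf.placeholder(tf.float32, [None, T, input_dim])  # the observed sequence window
y = tf.placeholder(tf.float32, [None, input_dim])     # the next value to predict

cell = tf.contrib.rnn.BasicRNNCell(hidden_dim)         # "traditional" RNN cell
outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
last = outputs[:, -1, :]                               # hidden state at the last step

# linear readout from the last hidden state to the predicted next value
w = tf.get_variable("w_out", [hidden_dim, input_dim])
b = tf.get_variable("b_out", [input_dim])
pred = tf.matmul(last, w) + b

loss = tf.reduce_mean(tf.square(pred - y))             # simple regression loss
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

def make_batch(n=64):
    # toy rule: sine waves; the target is the point right after the window
    starts = np.random.uniform(0, 2 * np.pi, size=(n, 1))
    t = np.arange(T + 1) * 0.1
    seq = np.sin(starts + t)                           # shape (n, T+1)
    return seq[:, :T, np.newaxis], seq[:, T:]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        bx, by = make_batch()
        sess.run(train_op, {x: bx, y: by})
```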
MNIST images.
Classification of the digits 0 to 9.
Includes traditional LSTM, traditional GRU, multi-layer LSTM, multi-layer GRU, traditional bi-directional LSTM, and multi-layer BiLSTM models.
Implementation of the different models using the TensorFlow framework.
The goal is to become familiar with the specific implementation details.
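Below is a minimal sketch of the basic LSTM variant (my own simplification, not the repository code), which treats each 28x28 MNIST image as a sequence of 28 rows; the comments note where the multi-layer and bi-directional variants would differ.

```python
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

time_steps, row_dim, hidden_dim, num_classes = 28, 28, 128, 10

x = tf.placeholder(tf.float32, [None, time_steps, row_dim])  # 28 rows of 28 pixels
y = tf.placeholder(tf.float32, [None, num_classes])          # one-hot digit labels

cell = tf.contrib.rnn.BasicLSTMCell(hidden_dim)
# Multi-layer variants: wrap several cells with tf.contrib.rnn.MultiRNNCell([...]).
# Bi-directional variants: use tf.nn.bidirectional_dynamic_rnn with two cells.
outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

logits = tf.layers.dense(outputs[:, -1, :], num_classes)     # classify from the last step
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1)), tf.float32))

mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2000):
        bx, by = mnist.train.next_batch(128)
        sess.run(train_op, {x: bx.reshape(-1, time_steps, row_dim), y: by})
    test_x = mnist.test.images.reshape(-1, time_steps, row_dim)
    print(sess.run(accuracy, {x: test_x, y: mnist.test.labels}))
```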
The word sequence of a movie review.
The sentiment tendency, positive or negative.
embedding layer => multiple convolutional layers with max-pooling => dense layer => dense layer => softmax layer (see the sketch below)
- There are two versions of TextCNN; you can run
python run_textcnn_model_v1.py
or
python run_textcnn_model_v2.py
to get the different versions of TextCNN.
The difference between the two versions lies in their implementation details.
- The training data for the word embeddings can be downloaded here: http://mattmahoney.net/dc/text8.zip
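As a rough sketch of the pipeline above (my own simplification, not either of the two versions in this repo; all vocabulary sizes, lengths, and filter settings are assumptions):

```python
import tensorflow as tf

vocab_size, embed_dim, seq_len, num_classes = 20000, 128, 200, 2
filter_sizes, num_filters = [3, 4, 5], 64

x = tf.placeholder(tf.int32, [None, seq_len])   # word ids of a padded review
y = tf.placeholder(tf.int64, [None])            # 0 = negative, 1 = positive

# embedding layer (could be initialized from the pre-trained text8 vectors)
embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
embedded = tf.expand_dims(tf.nn.embedding_lookup(embedding, x), -1)  # (B, T, D, 1)

# several convolution widths in parallel, each followed by max-pooling over time
pooled = []
for size in filter_sizes:
    conv = tf.layers.conv2d(embedded, num_filters, [size, embed_dim],
                            activation=tf.nn.relu)    # (B, T-size+1, 1, F)
    pooled.append(tf.reduce_max(conv, axis=1))         # max over time -> (B, 1, F)
features = tf.reshape(tf.concat(pooled, axis=-1),
                      [-1, num_filters * len(filter_sizes)])

hidden = tf.layers.dense(features, 128, activation=tf.nn.relu)  # first dense layer
logits = tf.layers.dense(hidden, num_classes)                   # second dense layer -> softmax
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```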
TensorFlow implementation of an attention mechanism for text classification tasks.
Inspired by "Hierarchical Attention Networks for Document Classification", Zichao Yang et al. (http://www.aclweb.org/anthology/N16-1174).
This is a fork of someone else's project:
https://github.com/ilivans/tf-rnn-attention
I edited some of the code so that the project can run on Python 3.6.
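The core idea is a word-level additive attention over the RNN outputs, as in Yang et al. Below is my own simplified sketch of such a layer (not the forked repository's exact code; the variable names and the static-shape assumption are mine):

```python
import tensorflow as tf

def attention(rnn_outputs, attention_size):
    """rnn_outputs: (batch, time, hidden) with static time/hidden dimensions.
    Returns a (batch, hidden) summary vector and the attention weights."""
    time_steps = int(rnn_outputs.shape[1])
    hidden = int(rnn_outputs.shape[2])

    w = tf.get_variable("att_w", [hidden, attention_size])
    b = tf.get_variable("att_b", [attention_size])
    u = tf.get_variable("att_u", [attention_size, 1])        # context vector

    # score every time step, then normalize the scores over time with softmax
    flat = tf.reshape(rnn_outputs, [-1, hidden])             # (batch*time, hidden)
    v = tf.tanh(tf.matmul(flat, w) + b)                      # (batch*time, att)
    scores = tf.reshape(tf.matmul(v, u), [-1, time_steps])   # (batch, time)
    alphas = tf.nn.softmax(scores)

    # weighted sum of the RNN outputs, used as the sentence/document representation
    summary = tf.reduce_sum(rnn_outputs * tf.expand_dims(alphas, -1), axis=1)
    return summary, alphas
```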
- 深度解析注意力模型(attention model) --- image_caption的应用
- heuritech.com - ATTENTION MECHANISM
- 浅谈Attention-based Model【原理篇】
Match entities between Baidu Baike and Wikipedia.
Some elements of one Baidu Baike entity and 100 candidate Wikipedia entities.
The Wikipedia entity with the highest score.
- Uses triple-based training.
- embedding layer => BiLSTM layer => concat layer => TextCNN layer => dense layer => score (a rough sketch of this pipeline follows below)
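Since the actual model code cannot be released (see the note below the table), the following is only a rough, generic sketch of the pipeline shape described above; every name, dimension, and layer choice is an assumption of mine.

```python
import tensorflow as tf

def entity_score(baike_ids, wiki_ids, vocab_size=50000, embed_dim=100,
                 lstm_dim=128, num_filters=64, filter_size=3):
    """baike_ids, wiki_ids: (batch, seq_len) int32 word ids. Returns (batch,) scores."""
    embedding = tf.get_variable("embedding", [vocab_size, embed_dim])

    def bilstm(ids, scope):
        # embedding layer followed by a BiLSTM over the token sequence
        with tf.variable_scope(scope):
            emb = tf.nn.embedding_lookup(embedding, ids)
            fw = tf.contrib.rnn.BasicLSTMCell(lstm_dim)
            bw = tf.contrib.rnn.BasicLSTMCell(lstm_dim)
            (out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
                fw, bw, emb, dtype=tf.float32)
            return tf.concat([out_fw, out_bw], axis=-1)   # (batch, seq, 2*lstm_dim)

    # concat layer: join both sides along the time axis
    merged = tf.concat([bilstm(baike_ids, "baike"), bilstm(wiki_ids, "wiki")], axis=1)

    # TextCNN layer over the concatenated BiLSTM outputs
    conv_in = tf.expand_dims(merged, -1)                  # (batch, seq, 2*lstm_dim, 1)
    conv = tf.layers.conv2d(conv_in, num_filters, [filter_size, 2 * lstm_dim],
                            activation=tf.nn.relu)
    pooled = tf.reduce_max(conv, axis=[1, 2])             # (batch, num_filters)

    # dense layer down to one matching score per (Baike, Wikipedia) pair
    score = tf.layers.dense(pooled, 1)
    return tf.squeeze(score, axis=1)
```

With triple-based training, one would presumably compare the score of the correct Wikipedia candidate against wrong candidates using a ranking-style loss; that part is omitted here, as the sketch only covers the scoring network.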
Hyperparameters | Train accuracy | Val accuracy | Test accuracy |
---|---|---|---|
Filter number = 16, Batch size = 256 | 96.8% | Top1: 21.7%, Top10: 65.8% | Top1: 13.5%, Top10: 60% |
Filter number = 128, Batch size = 256 | 100% | Top1: 21.7%, Top10: 71.3% | Top1: 17%, Top10: 64.4% |
Filter number = 64, Batch size = 128 | 98.4% | Top1: 14.7%, Top10: 65.1% | Top1: 14.4%, Top10: 60.2% |
Filter number = 64, Batch size = 32 | 100% | Top1: 13.9%, Top10: 46.5% | Top1: 9.25%, Top10: 48.2% |
For privacy reasons, the model code cannot be made public. The data was provided by the DCD lab at Zhejiang University.
On top of the BiLSTM-TextCNN model, an attention-based layer was added before the BiLSTM, but the result was not good: compared with the previous model, the accuracy dropped by about 10%.
This is the TensorFlow implementation of "Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search", completed by my friend 陈璐 @ 中山大学.
This is the TensorFlow implementation of "Bilateral Multi-Perspective Matching for Natural Language Sentences", completed by my friend 郭悦 @ 中山大学. Based on this paper, my friend added a CNN layer, which improved the accuracy by about 8%.
Prof. 鲁伟明 @ 浙江大学
王鹏 @ 浙江大学
陈璐 @ 中南大学
郭悦 @ 中山大学