DrQA

A PyTorch implementation of the ACL 2017 paper Reading Wikipedia to Answer Open-Domain Questions (DrQA). The code is based on Runqi's implementation (https://github.com/hitvoice/DrQA).

Requirements

  • Python >= 3.5
  • PyTorch 0.2.0
  • numpy
  • pandas
  • msgpack
  • spaCy 1.x
  • cupy
  • pynvrtc

Quick Start

Setup

  • Make sure Python 3 and pip are installed.
  • Install the PyTorch build that matches your OS, Python version, and CUDA version.
  • Install the remaining requirements with pip install -r requirements.txt.
  • Download the SQuAD data, GloVe word vectors, and spaCy English language models by running bash download.sh.
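After setup, it can help to confirm that the packages listed under Requirements are importable. The following is a minimal sketch and not part of the repository: the helper name `find_missing` is hypothetical, and the module names assume each package is imported under its usual name (`torch` for PyTorch).

```python
import importlib.util

# Import names for the packages listed under Requirements.
# Assumption: PyTorch is imported as `torch`; the rest use their package names.
REQUIRED_MODULES = ["torch", "numpy", "pandas", "msgpack", "spacy", "cupy", "pynvrtc"]


def find_missing(modules):
    """Return the names in `modules` that the import system cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages were found.")
```

This only checks that each package can be located. It does not verify versions such as PyTorch 0.2.0 or spaCy 1.x.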

Train

# prepare the data
python prepro.py

# make sure CUDA lib path can be found, e.g.:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64

# specify the path to the SRU implementation, e.g.:
export PYTHONPATH=../../sru/

# train for 50 epochs with a batch size of 32
python train.py -e 50 -bs 32
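The two exported variables above are easy to forget in a fresh shell. As a rough sketch, a pre-flight check along the following lines could be run before training. The function name `check_training_env` is hypothetical and not part of the repository, and the check only confirms that the variables look plausible, not that the CUDA libraries or the SRU code actually load.

```python
import os


def check_training_env(environ):
    """Return a list of warnings about `environ`, a dict-like mapping of variables."""
    problems = []
    if not environ.get("LD_LIBRARY_PATH"):
        problems.append("LD_LIBRARY_PATH is not set; the CUDA libraries may not be found.")
    # PYTHONPATH entries are separated by os.pathsep (':' on Linux).
    entries = [p for p in environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    if not any(os.path.isdir(p) for p in entries):
        problems.append("PYTHONPATH contains no existing directory; "
                        "the SRU implementation may not be importable.")
    return problems


for problem in check_training_env(os.environ):
    print("warning: " + problem)
```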

Results

| Model | EM | F1 | Time used in RNN | Total time/epoch |
| --- | --- | --- | --- | --- |
| LSTM (original paper) | 69.5 | 78.8 | ~523s | ~700s |
| SRU (this version) | 70.3 | 79.5 | ~88s | ~200s |
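The timings in the table can be turned into speedup factors with simple arithmetic. The sketch below copies the approximate per-epoch times from the table, so the resulting ratios are approximate as well. The variable and function names are illustrative only.

```python
# Approximate timings in seconds per epoch, copied from the Results table.
timings = {
    "LSTM": {"rnn": 523, "total": 700},
    "SRU": {"rnn": 88, "total": 200},
}


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


rnn_speedup = speedup(timings["LSTM"]["rnn"], timings["SRU"]["rnn"])
total_speedup = speedup(timings["LSTM"]["total"], timings["SRU"]["total"])
print("RNN speedup: {:.1f}x, total speedup: {:.1f}x".format(rnn_speedup, total_speedup))
# prints "RNN speedup: 5.9x, total speedup: 3.5x"
```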

Tested on GeForce GTX 1070.
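EM and F1 in the table are presumably the standard SQuAD exact-match and token-level F1 metrics. The following simplified sketch follows the conventions of the SQuAD v1.1 evaluation (lowercasing, stripping punctuation and English articles, then comparing tokens). Whether the training script in this repository computes its scores in exactly this way is an assumption.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, ground_truth):
    """True if the two answers are identical after normalization."""
    return normalize_answer(prediction) == normalize_answer(ground_truth)


def f1_score(prediction, ground_truth):
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower!"))  # True
print(f1_score("in Paris France", "Paris"))              # 0.5
```

When a question has several reference answers, the SQuAD evaluation takes the maximum score over the references and then averages over all questions. The sketch scores a single prediction against a single reference.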

Credits

Author of the Document Reader model: Danqi Chen.

Author of the original PyTorch implementation: Runqi Yang.

Most of the PyTorch model code is borrowed from Facebook/ParlAI under a BSD-3 license.