Entity Tracking Improves Cloze-style Reading Comprehension

PyTorch implementation of the paper "Entity Tracking Improves Cloze-style Reading Comprehension" (credit: swiseman).

Tested environment:

TLDR:

After setting up the environment, run train.bash directly to:

  1. Download data
  2. Preprocess datasets into hdf5 format
  3. Train a model on the Lambada dataset

Data

The relevant datasets can be downloaded from here, which include:

Preprocessing

Preprocess the raw train/test/validation text files into .hdf5 files using:

python preprocess.py --data <base_dir> --glove <path_to_glove.6B.100d.txt> --train <train_file> --valid <valid_file> --test <test_file> --std_feats --ent_feats --disc_feats --speaker_feats --out_file <out_hdf5_file>
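For scripted runs, the same invocation can be assembled in Python. This is only a minimal sketch and not part of the repository: the helper name `build_preprocess_cmd` and every path in the example are hypothetical placeholders, and only the flags that appear in the command above are used.

```python
import shlex


def build_preprocess_cmd(base_dir, glove, train, valid, test, out_file,
                         feats=("std_feats", "ent_feats", "disc_feats", "speaker_feats")):
    """Return the preprocess.py call as an argument list (hypothetical helper)."""
    cmd = ["python", "preprocess.py",
           "--data", base_dir,
           "--glove", glove,
           "--train", train,
           "--valid", valid,
           "--test", test]
    # Feature switches take no value; each one is passed as a bare flag.
    cmd += ["--" + name for name in feats]
    cmd += ["--out_file", out_file]
    return cmd


# All paths below are placeholders; substitute your own files.
cmd = build_preprocess_cmd("data/lambada", "glove.6B.100d.txt",
                           "train.txt", "valid.txt", "test.txt",
                           "lambada.hdf5")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the call as a list rather than a single string avoids quoting problems when a path contains spaces.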

Training

A sample training command is given below. See train.py for parameter settings and descriptions.

python train.py -cuda -datafile <out_hdf5_file> -save <output_save_model_file.t7> -dropout 0.2 -bsz 64 -epochs 5 -rnn_size 100 -max_entities 5 -max_mentions 2 -clip 10 -beta1 0.7 -mt_coeff 1.5 -emb_size 100 -std_feats -speaker_feats -maxseqlen 1024 -mt_loss idx-loss -log_interval 1000
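When trying several hyperparameter settings, it can help to keep the options in one place and generate the command from them. The sketch below is illustrative only: `build_train_cmd` is a hypothetical helper, the file names are placeholders, and it assumes that `-datafile` takes the .hdf5 file produced by preprocessing. The flag names and values are copied from the sample command above.

```python
import shlex

# Valued options from the sample command; file names are placeholders.
options = {
    "datafile": "lambada.hdf5",  # assumption: the preprocessed .hdf5 file
    "save": "model.t7",
    "dropout": 0.2,
    "bsz": 64,
    "epochs": 5,
    "rnn_size": 100,
    "max_entities": 5,
    "max_mentions": 2,
    "clip": 10,
    "beta1": 0.7,
    "mt_coeff": 1.5,
    "emb_size": 100,
    "maxseqlen": 1024,
    "mt_loss": "idx-loss",
    "log_interval": 1000,
}
# Switches that are passed without a value in the sample command.
switches = ["cuda", "std_feats", "speaker_feats"]


def build_train_cmd(options, switches):
    """Return the train.py call as an argument list (hypothetical helper)."""
    cmd = ["python", "train.py"]
    cmd += ["-" + name for name in switches]
    for name, value in options.items():
        cmd += ["-" + name, str(value)]
    return cmd


train_cmd = build_train_cmd(options, switches)
print(shlex.join(train_cmd))
# To actually run it: subprocess.run(train_cmd, check=True)
```

Changing a single entry, for example `options["dropout"] = 0.3`, and rebuilding the command gives a variant run without editing a long shell line by hand.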

License

MIT