Policy Gradient
- To run policy gradient: python3 "rl(rnnorbert)+rnn.py" (quote the file name, since the shell treats unquoted parentheses as special characters)
- The main algorithm is in pg.py
Q Learning
- To run Q learning: python3 run_q.py
- The main algorithm is in q_seq2seq.py
Imitation Learning (i.e., no teacher forcing) or Supervised Learning (i.e., teacher forcing)
- To run either: python3 "(rnnorbert)+rnn.py"
Any file with (rnnorbert) in its name lets you choose either a BERT or an RNN-based encoder.
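For readers unfamiliar with the teacher forcing distinction above, here is a minimal sketch of the two decoding modes. It is not taken from this repository: `decode` and `toy_step` are hypothetical names, and the toy decoder is a stand-in for a trained model that makes one deliberate mistake so the difference is visible.

```python
from typing import Callable, List


def decode(step: Callable[[int], int], targets: List[int],
           start_token: int, teacher_forcing: bool) -> List[int]:
    """Run a decoder for len(targets) steps and return its predictions.

    `step` is a hypothetical function that maps the previous input token
    to the predicted next token.
    """
    predictions = []
    prev = start_token
    for gold in targets:
        pred = step(prev)
        predictions.append(pred)
        # With teacher forcing the next input is the ground-truth token;
        # without it, the decoder is fed its own prediction.
        prev = gold if teacher_forcing else pred
    return predictions


def toy_step(prev: int) -> int:
    # Toy stand-in for a trained decoder (assumption): predicts prev + 1,
    # but makes a deliberate mistake whenever the input token is 2.
    return 9 if prev == 2 else prev + 1


targets = [1, 2, 3, 4]
forced = decode(toy_step, targets, start_token=0, teacher_forcing=True)
free = decode(toy_step, targets, start_token=0, teacher_forcing=False)
print(forced)  # [1, 2, 9, 4]
print(free)    # [1, 2, 9, 10]
```

With teacher forcing the mistake at step three does not affect step four, because the ground-truth token is fed back in. Without it, the wrong token is fed back and the error carries forward.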
Data
- To convert an SGML file to CSV, refer to sgml2csv.py