Evaluating Comments for Constructiveness

This work was done as a part of CS685.
Author: Daivik Swarup

Download data from here

Preprocessing

Split the data into train, validation, and test sets:

python preprocess.py
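
As a rough illustration of what a three-way split involves, here is a minimal sketch. It is not the code in `preprocess.py`: the function name `split_dataset`, the 80/10/10 fractions, and the fixed seed are all assumptions made for the example.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and cut them into train, validation, and test lists.

    The fractions and seed are illustrative defaults, not values taken
    from this repository.
    """
    shuffled = list(items)
    # A seeded Random instance keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling once and slicing guarantees that the three sets are disjoint and together cover every item.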

For classification, create thresholded text files:

python preprocess_threshold <PATH-TO-TRAIN-DIR> train_80_20.txt   
python preprocess_threshold <PATH-TO-VAL-DIR> val_80_20.txt   
python preprocess_threshold <PATH-TO-TEST-DIR> test_80_20.txt   
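
The README does not describe how the threshold is chosen or what `80_20` in the file names refers to. Purely as an illustration of turning a continuous score into a binary label, the sketch below assumes each comment comes with a numeric score and uses a hypothetical cutoff of 0.5; `threshold_labels` is an invented name and is not part of this repository.

```python
def threshold_labels(scored_comments, cutoff=0.5):
    """Map (text, score) pairs to (text, label) pairs.

    label is 1 when score >= cutoff and 0 otherwise. The cutoff is a
    made-up default for the example.
    """
    return [(text, 1 if score >= cutoff else 0)
            for text, score in scored_comments]


examples = [("Thoughtful reply with sources.", 0.9), ("lol", 0.1)]
labeled = threshold_labels(examples)
print(labeled)
```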

Train classifiers

python binary_classification.py <VECTORIZER> output.pkl

<VECTORIZER> can be one of {'tfidf', 'count', 'tfidf_length', 'count_length', 'bert'}
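
The README does not define the `_length` variants. One plausible reading, and only an assumption here, is that they append the comment length to the bag-of-words vector. The sketch below shows that idea for raw counts using only the standard library; `build_vocab` and `count_length_features` are hypothetical helpers, not functions from `binary_classification.py`.

```python
from collections import Counter


def build_vocab(texts):
    """Assign each lowercased whitespace token a column index."""
    tokens = sorted({tok for t in texts for tok in t.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def count_length_features(text, vocab):
    """Return token counts over vocab, followed by the token count of text.

    Appending the length as a final feature is an assumed interpretation
    of 'count_length'.
    """
    counts = Counter(text.lower().split())
    vec = [0] * len(vocab)
    for tok, c in counts.items():
        if tok in vocab:  # tokens outside the vocabulary are ignored
            vec[vocab[tok]] = c
    vec.append(len(text.split()))
    return vec


corpus = ["good point good", "bad"]
vocab = build_vocab(corpus)
features = [count_length_features(t, vocab) for t in corpus]
print(features)
```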

For the LSTM classifier:

python train_lstm.py

Train rankers

python train_ranknet.py <VECTORIZER> model.pt

<VECTORIZER> can be one of {'tfidf', 'count', 'tfidf_length', 'count_length', 'bert'}
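
For readers unfamiliar with RankNet, the sketch below shows the standard pairwise logistic loss it is built on, for the case where one comment is known to rank above the other. It assumes the model outputs one real-valued score per comment; it is not taken from `train_ranknet.py`, and the name `ranknet_loss` is chosen for the example.

```python
import math


def ranknet_loss(score_preferred, score_other):
    """Pairwise loss log(1 + exp(-(s_preferred - s_other))).

    The loss is small when the preferred item is scored well above the
    other item and grows when the order is reversed.
    """
    diff = score_preferred - score_other
    # Branch on the sign of diff so exp() never sees a large positive value.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


print(ranknet_loss(2.0, 0.0))  # correctly ordered pair, lower loss
print(ranknet_loss(0.0, 2.0))  # reversed pair, higher loss
```

Because the loss depends only on the score difference, the ranker learns a relative ordering rather than calibrated absolute scores.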

For the LSTM ranker:

python train_ranknet_lstm.py

Misc

The scripts in the misc directory are self-explanatory.