Please see the corpus website for an introduction to this data set.
The version numbers in parentheses indicate the versions we used; other versions may or may not work.
We recommend a miniconda environment; a possible setup is sketched after the dependency list.
- python (3.5.2)
- numpy (1.12.0)
- scipy (0.18.1)
- gensim (0.13.4.1)
- scikit-learn (0.18.1)
- tensorflow (1.0.0)
- matplotlib (2.0.0)
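For convenience, here is one way such an environment could be created. This is a sketch, not part of the repository: the environment name `million-posts` is arbitrary, and these old package builds may no longer be available on every conda channel or platform.

```sh
# Sketch only: create an isolated environment with the versions pinned above.
# The name "million-posts" is arbitrary; availability of these old builds
# may vary by conda channel and platform.
conda create -n million-posts python=3.5.2 \
    numpy=1.12.0 scipy=0.18.1 scikit-learn=0.18.1 matplotlib=2.0.0
source activate million-posts  # "conda activate million-posts" on newer conda
# gensim and tensorflow at these exact versions are often easier to get
# via pip (use tensorflow-gpu==1.0.0 instead for GPU support):
pip install gensim==0.13.4.1 tensorflow==1.0.0
```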
To obtain reproducible results, parallel execution is disabled at several points in the code. Enabling it would make things run considerably faster, but the results would no longer be exactly reproducible. As it stands, all results are exactly reproducible, with the exception of the LSTM part.
Total duration: about 4.5 hours on a machine with the following specs:
- Intel Core i7-6900K, 8x 3.20GHz
- 64 GB RAM (4x16GB DDR4-2133)
- 1TB SSD
- NVIDIA Titan X (Pascal)
To run everything, simply execute

```sh
./run.sh
```
The script will ask whether you want to download the corpus (this requires `wget` or `curl`, and `bzip2`).
If you are interested only in certain parts, comment out what you don't need in `run.sh` and in `src/run_evaluation.py` (in particular, you can comment out entries from `methodmodules` if you want to run only certain methods; see the sketch below).
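For illustration only, here is a hypothetical sketch of what disabling a method via `methodmodules` might look like; the actual names and structure in `src/run_evaluation.py` may differ.

```python
# Hypothetical sketch; the real methodmodules in src/run_evaluation.py may
# be structured differently. Commenting out an entry skips that method.
methodmodules = [
    bow,      # bag-of-words
    mnb,      # multinomial naive Bayes
    # nbsvm,  # commented out: NBSVM is skipped in this run
    bocid,
    d2v,      # doc2vec
    lstm,
]
```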
The code produces some output in the directories `logs`, `plots` and `tables`. We have included our results here, so you can see what to expect. The code also creates a directory `models`, which will be about 500 MB in size. The entire `experiments` folder, including the downloaded corpus, will be almost 1 GB in size. The table below shows the results we obtained:
| Category | Measure | BOW | MNB | NBSVM | BOCID | D2V | LSTM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Negative Sentiment | Precision | 0.5521 | 0.5637 | 0.5660 | 0.5345 | 0.5842 | 0.5349 |
| | Recall | 0.5109 | 0.4867 | 0.4512 | 0.5452 | 0.5624 | 0.7197 |
| | F1 | 0.5307 | 0.5224 | 0.5021 | 0.5398 | 0.5731 | 0.6137 |
| Positive Sentiment | Precision | 0.1000 | 0.0000 | 0.2353 | 0.0662 | 0.0397 | 0.0000 |
| | Recall | 0.0698 | 0.0000 | 0.0930 | 0.2093 | 0.4651 | 0.0000 |
| | F1 | 0.0822 | 0.0000 | 0.1333 | 0.1006 | 0.0731 | 0.0000 |
| Off-topic | Precision | 0.2754 | 0.6190 | 0.3969 | 0.2252 | 0.2065 | 0.2742 |
| | Recall | 0.2379 | 0.0224 | 0.1328 | 0.5121 | 0.6241 | 0.2638 |
| | F1 | 0.2553 | 0.0433 | 0.1990 | 0.3128 | 0.3103 | 0.2689 |
| Inappropriate | Precision | 0.1627 | 0.0000 | 0.1765 | 0.1516 | 0.1340 | 0.1964 |
| | Recall | 0.1122 | 0.0000 | 0.0495 | 0.3993 | 0.5776 | 0.1089 |
| | F1 | 0.1328 | 0.0000 | 0.0773 | 0.2198 | 0.2175 | 0.1401 |
| Discriminating | Precision | 0.1847 | 0.0000 | 0.2683 | 0.1301 | 0.1111 | 0.1136 |
| | Recall | 0.1028 | 0.0000 | 0.0780 | 0.2943 | 0.3936 | 0.1418 |
| | F1 | 0.1321 | 0.0000 | 0.1209 | 0.1804 | 0.1733 | 0.1262 |
| Feedback | Precision | 0.6554 | 0.7465 | 0.7356 | 0.5094 | 0.5240 | 0.6307 |
| | Recall | 0.5803 | 0.4074 | 0.5219 | 0.6879 | 0.7056 | 0.6287 |
| | F1 | 0.6156 | 0.5271 | 0.6106 | 0.5853 | 0.6014 | 0.6297 |
| Personal Stories | Precision | 0.6981 | 0.5491 | 0.6916 | 0.5762 | 0.6247 | 0.6380 |
| | Recall | 0.5920 | 0.4578 | 0.4788 | 0.7120 | 0.8123 | 0.6658 |
| | F1 | 0.6407 | 0.4993 | 0.5658 | 0.6369 | 0.7063 | 0.6516 |
| Arguments Used | Precision | 0.6105 | 0.5086 | 0.6064 | 0.5642 | 0.5657 | 0.5685 |
| | Recall | 0.5215 | 0.3170 | 0.4628 | 0.6106 | 0.6614 | 0.6458 |
| | F1 | 0.5625 | 0.3906 | 0.5250 | 0.5865 | 0.6098 | 0.6047 |
| Wins | Precision | 2 | 2 | 2 | 0 | 1 | 1 |
| | Recall | 0 | 0 | 0 | 0 | 7 | 1 |
| | F1 | 0 | 0 | 1 | 3 | 2 | 2 |