###
All results are on the IMDb movie review test set (25,000 reviews, 12,500 of them positive)
###
0. Positive / Negative Word list - simple counting
True Positives: 11780
False Positives: 9665
False Negatives: 720
Precision: 0.549312
Recall: 0.942400
F1 Score: 0.694064
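
The precision, recall and F1 figures in runs 0 to 7 follow directly from the raw counts. A minimal sketch of that arithmetic, using the counts from run 0 (the function name `precision_recall_f1` is made up for illustration and is not taken from the project code):

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from run 0 (positive / negative word list).
p, r, f1 = precision_recall_f1(tp=11780, fp=9665, fn=720)
print(f"Precision: {p:.6f}")
print(f"Recall: {r:.6f}")
print(f"F1 Score: {f1:.6f}")
```

The same call with the counts of any other run reproduces that run's three scores. Note that run 0 gets its high recall by labelling most reviews positive, which is why its precision is close to 0.5.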
1. Without Negations, NB, Trained on movie reviews
True Positives: 8001
False Positives: 4589
False Negatives: 4499
Precision: 0.635504
Recall: 0.640080
F1 Score: 0.637784
2. With Negations, NB, Trained on movie reviews
True Positives: 10630
False Positives: 2537
False Negatives: 1870
Precision: 0.807321
Recall: 0.850400
F1 Score: 0.828301
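
These notes do not say how negations were handled in run 2. One common approach, shown here purely as an assumption, is to prefix every token that follows a negation word with `not_` until the next punctuation mark, so that "like" and "not_like" become separate Naive Bayes features. The word list, the `not_` prefix and the function name `mark_negation` are all illustrative choices:

```python
import re

# Assumed negation cues; the list actually used in the experiments is not known.
NEGATIONS = {"not", "no", "never", "cannot"}
PUNCTUATION = set(".,!?;:")


def mark_negation(text):
    """Tokenize text and prefix tokens inside a negation scope with 'not_'."""
    tokens = re.findall(r"[a-z']+|[.,!?;:]", text.lower())
    out = []
    negated = False
    for tok in tokens:
        if tok in PUNCTUATION:
            negated = False  # a punctuation mark closes the negation scope
            continue
        if tok in NEGATIONS or tok.endswith("n't"):
            negated = True  # open a negation scope, keep the cue itself as-is
            out.append(tok)
            continue
        out.append("not_" + tok if negated else tok)
    return out


print(mark_negation("I did not like this movie, but the cast was good."))
```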
3. Reduced training set (500 reviews)
True Positives: 7294
False Positives: 1635
False Negatives: 5206
Precision: 0.816889
Recall: 0.583520
F1 Score: 0.680760
4. Reduced training set (random samples)
True Positives: 11778
False Positives: 7497
False Negatives: 722
Precision: 0.611051
Recall: 0.942240
F1 Score: 0.741338
5. Reduced training set (1000 random samples)
True Positives: 9921
False Positives: 2837
False Negatives: 2579
Precision: 0.777630
Recall: 0.793680
F1 Score: 0.785573
6. Weights (1, 2, 3) + Random samples
True Positives: 8807
False Positives: 1435
False Negatives: 3693
Precision: 0.859891
Recall: 0.704560
F1 Score: 0.774514
7. Weights (6, 12, 18) + Random samples
True Positives: 8680
False Positives: 1491
False Negatives: 3820
Precision: 0.853407
Recall: 0.694400
F1 Score: 0.765736
8. After fine tuning
Precision: 83.66%
Recall: 83.64%
Accuracy: 83.65%
9. SVM with linear kernel
Done! 988618 iterations
CPU Time: 00:41:37
Accuracy: 86.26400% (21566/25000)
Precision: 86.85366% (10683/12300)
Recall: 85.46400% (10683/12500)
System/Answer p/p p/n n/p n/n: 10683 1617 1817 10883
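
The four numbers on the System/Answer line are the confusion matrix, read as predicted/actual: p/p are true positives, p/n false positives, n/p false negatives and n/n true negatives. That reading is inferred from the fractions printed above (10683/12300 for precision, 10683/12500 for recall). A small sketch that recomputes the reported percentages and also derives an F1 score, so the SVM run can be compared with runs 0 to 7 (the function name is hypothetical):

```python
def metrics_from_confusion(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts from the System/Answer line of run 9.
m = metrics_from_confusion(tp=10683, fp=1617, fn=1817, tn=10883)
for name, value in m.items():
    print(f"{name}: {value:.5%}")
```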
10. Mutual Information, Naive Bayes (1000 examples)
Accuracy - 85.9% at 6000 features, exponential position weighting 1 to e
Accuracy - 86% at 5200 features, exponential position weighting 1 to e**.7
Accuracy - 86.3% at 5600 features, no position weighting
Accuracy - 86.7% at 12000 features with bigrams
Accuracy - 87.5% at 19000 features with bigrams
11. Mutual Information, Naive Bayes (full set)
Accuracy - 84.9% at 6500 features, no position weighting
Accuracy - 84.672% at 7000 features, exponential position weighting 1 to e**.7
Accuracy - 87.66% at 16000 features with bigrams
Accuracy - 88.34% at 26000 features with trigrams
Accuracy - 88.80% at 32000 features with trigrams
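
Runs 10 and 11 select features by mutual information before training Naive Bayes. The exact scoring used is not recorded here, so the following is only a sketch of the textbook formulation: mutual information between a term's presence in a document and the document's class, computed from a 2x2 table of document counts, with the top k terms kept. The function names, the toy documents and the choice of presence/absence counts are assumptions, and the position weighting mentioned above is not modelled.

```python
import math
from collections import Counter


def mutual_information(n11, n10, n01, n00):
    """MI (in bits) between term presence and class membership.

    n11: positive docs containing the term   n10: negative docs containing it
    n01: positive docs without the term      n00: negative docs without it
    """
    n = n11 + n10 + n01 + n00
    cells = (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    )
    mi = 0.0
    for n_tc, n_t, n_c in cells:
        if n_tc:  # empty cells contribute nothing (0 * log 0 is taken as 0)
            mi += (n_tc / n) * math.log2(n * n_tc / (n_t * n_c))
    return mi


def top_k_features(docs, labels, k):
    """Rank terms by MI with the label (1 = positive) and keep the best k."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        # Document frequency: count each term once per document.
        (pos_df if label else neg_df).update(set(tokens))
    scores = {}
    for term in set(pos_df) | set(neg_df):
        n11, n10 = pos_df[term], neg_df[term]
        scores[term] = mutual_information(n11, n10, n_pos - n11, n_neg - n10)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data, not taken from the IMDb set.
docs = [["great", "movie"], ["great", "film"], ["awful", "movie"], ["awful", "film"]]
labels = [1, 1, 0, 0]
print(top_k_features(docs, labels, k=2))
```

With bigrams or trigrams, the same scoring applies if each n-gram is treated as one term, which is one plausible reason the feature counts above grow from thousands to tens of thousands.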