Blank strings and strings that contain none of the feature words are currently handled differently by the three closed-ended classifiers.
For form and target, such strings are predicted to belong to the most common class in the training set (rally/demonstration and domestic government, respectively). For issue, they are classified as none, which is not the most common class in the training set.
See page 19 of Alex's thesis chapter 2, and the following example code:
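```python
import pandas as pd
from mpeds.classify_protest import MPEDS

test_classifier = MPEDS()

# One blank string and one string containing none of the feature words
test_data = pd.Series(['', 'avocados and grapefruits'])

# Compare how each closed-ended classifier labels these inputs
test_classifier.getIssue(test_data)
test_classifier.getForm(test_data)
test_classifier.getTarget(test_data)
```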
erleholgersen changed the title from "Closed-ended classifiers: Inconsistency in handing blank strings" to "Closed-ended classifiers: Inconsistency in handling blank strings" on May 30, 2017
Oh, that seems bizarre. We should probably return None in this case and throw a warning that says something like "No words found in vectorizer."
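A minimal sketch of that suggestion, not MPEDS code: `predict_or_none`, `vocabulary`, and `fake_predict` are hypothetical names standing in for the real vectorizer vocabulary and classifier. It assumes simple whitespace tokenization and shows one way to return None with a warning when a string shares no words with the feature set.

```python
import warnings


def predict_or_none(texts, vocabulary, predict):
    """Apply predict to each text; return None for texts with no vocabulary words.

    All names here are hypothetical; this is not the MPEDS API.
    """
    results = []
    for text in texts:
        # Naive whitespace tokenization, assumed for illustration only
        tokens = set(text.lower().split())
        if tokens & vocabulary:
            results.append(predict(text))
        else:
            warnings.warn("No words found in vectorizer: %r" % text)
            results.append(None)
    return results


# Hypothetical feature words and a stand-in classifier
vocabulary = {"protest", "march", "police"}
fake_predict = lambda text: "rally/demonstration"

labels = predict_or_none(
    ["", "avocados and grapefruits", "students march downtown"],
    vocabulary,
    fake_predict,
)
```

Applying the same guard in front of all three classifiers would make blank or out-of-vocabulary input behave consistently, rather than falling back to a class that depends on each model.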