This is the final group project for CSCI 4502/5502 "Data Mining" at the University of Colorado Boulder.
Collaborators: Isabel Beaulieu, Isabel Eskay, Michelle Ramsahoye, Erin Richardson, Taisiia Sherstiukova
Topic: In this project, we aim to classify strings of text into predetermined emotion categories. We will use publicly available data from Kaggle (Kaggle Emotions Dataset for NLP) and measure success by classification accuracy. To assess the utility of the emotion labels in the dataset, we will apply a variety of clustering techniques to the unlabeled data and compare the resulting clusters to the provided labels. This work will provide insight into the emotions typically present in text strings and their salience, the validity of the various emotion labels, and the algorithms best suited to classifying individual emotions.
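
As a rough illustration of the two evaluation ideas described above, the sketch below computes classification accuracy and one possible way of comparing clusters to the provided labels (cluster purity). The function names, the choice of purity as the comparison metric, and the toy labels are assumptions made for this example; they are not taken from the project code or the dataset.

```python
from collections import Counter, defaultdict


def accuracy(true_labels, predicted_labels):
    """Fraction of texts whose predicted emotion matches the provided label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("need at least one example")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def cluster_purity(cluster_ids, true_labels):
    """Share of texts that carry the majority label of their own cluster.

    Purity is only one option for comparing clusters to labels (an assumption
    of this sketch); 1.0 means every cluster contains a single label.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster ids and labels must have the same length")
    if not true_labels:
        raise ValueError("need at least one example")
    # Count how often each label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts[cluster][label] += 1
    # For each cluster, keep only the count of its most common label.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(true_labels)


# Toy data, invented for illustration only.
labels = ["joy", "joy", "anger", "sadness", "anger", "joy"]
predicted = ["joy", "anger", "anger", "sadness", "anger", "joy"]
clusters = [0, 0, 1, 2, 1, 2]

print(f"accuracy: {accuracy(labels, predicted):.3f}")
print(f"purity:   {cluster_purity(clusters, labels):.3f}")
```

Purity rewards many small clusters, so in practice it would likely be read alongside a chance-corrected measure rather than on its own.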