YouTube Comments Analyzer is a Python tool for collecting and analyzing comments on YouTube videos (in Arabic). It provides sentiment analysis and topic modeling, depending on the arguments the user submits. All fetched comments are saved in a MongoDB database named "yt", inside a collection named "comments".
The sentiment analyzer is trained on 1000 positive-labeled and 1000 negative-labeled tweets, and reaches an accuracy of ~88% with an 80% training / 20% test split. Accuracy may be lower on comments and other free text because of dialect phrases.
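To make the 80/20 evaluation concrete, here is a minimal sketch of how such a split and accuracy figure could be computed. It is not the tool's actual training code: `train_test_split`, `accuracy`, `dummy_classify`, and the placeholder samples are all hypothetical names and data introduced for illustration, and the classifier is a trivial stand-in rather than the real model.

```python
import random


def train_test_split(samples, train_ratio=0.8, seed=42):
    """Shuffle labeled samples and split them into training and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def accuracy(classify, test_set):
    """Fraction of (text, label) pairs for which classify(text) == label."""
    if not test_set:
        return 0.0
    correct = sum(1 for text, label in test_set if classify(text) == label)
    return correct / len(test_set)


# Placeholder data standing in for the 1000 + 1000 labeled tweets (assumption).
samples = [("pos example %d" % i, "pos") for i in range(1000)]
samples += [("neg example %d" % i, "neg") for i in range(1000)]

train_set, test_set = train_test_split(samples)


def dummy_classify(text):
    """Trivial stand-in classifier; the real analyzer would be trained on train_set."""
    return "pos" if text.startswith("pos") else "neg"


print(len(train_set), len(test_set))
print(accuracy(dummy_classify, test_set))
```

With 2000 samples the split yields 1600 training and 400 test items. A fixed seed keeps the split reproducible, so accuracy figures can be compared after the training data changes.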
Sentiment analysis is a branch of artificial intelligence; here it is used to capture the overall sentiment trend of the comments.
GloVe (Global Vectors) embeddings are used rather than the conventional word vectors, as they provide better results.
- many_stop_words==0.2.2
- httplib2==0.11.3
- gensim==3.4.0
- Flask==1.0.2 (For web endpoint only)
- nltk==3.2.5
- google_api_python_client==1.6.7
Although accuracy is limited by dialectal language, there is always room for improvement. You can add new positive- and negative-labeled data to the pos.txt and neg.txt files, respectively, to improve results and cover more dialect phrases and words.
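As a rough illustration of extending the training data, the sketch below appends new phrases to a label file while skipping blanks and duplicates. It assumes one phrase per line in UTF-8, which is an assumption about the format of pos.txt and neg.txt, not something confirmed here. `add_labeled_phrases` is a hypothetical helper, and the demo writes to a temporary directory so the real files are not touched.

```python
import tempfile
from pathlib import Path


def add_labeled_phrases(path, phrases):
    """Append new phrases to a label file; return how many were added.

    Assumes one phrase per line, UTF-8 encoded (format is an assumption).
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = {line.strip() for line in text.splitlines()}

    new_phrases = []
    for phrase in phrases:
        phrase = phrase.strip()
        if phrase and phrase not in existing:
            existing.add(phrase)
            new_phrases.append(phrase)

    if new_phrases:
        with path.open("a", encoding="utf-8") as handle:
            # Avoid gluing the first new phrase onto an unterminated last line.
            if text and not text.endswith("\n"):
                handle.write("\n")
            for phrase in new_phrases:
                handle.write(phrase + "\n")
    return len(new_phrases)


# Demo against a temporary copy rather than the real pos.txt.
with tempfile.TemporaryDirectory() as workdir:
    demo_file = Path(workdir) / "pos.txt"
    added = add_labeled_phrases(demo_file, ["رائع جدا", "رائع جدا", "  "])
    print(added)
```

To update the real data, the same helper would be called as `add_labeled_phrases("pos.txt", [...])` or `add_labeled_phrases("neg.txt", [...])`, after which the analyzer would need to be retrained.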