What are the most pressing concerns regarding ‘Climate Change’ among tweeters according to Topic Modeling?
This project uses the ‘Climate Change Tweets’ dataset (Kaggle, 2022). The file lists top tweets containing the keyword ‘Climate Change’: 9050 tweets and 11 columns, including UserScreenName, UserName, Timestamp, Text, Embedded_text, Emojis, Comments, Likes, Retweets, and Image links, covering the period 1/01/2022 through 19/07/2022.
- Objective : This paper explores which aspects of climate change most concern the public, based on a tweet dataset, and analyzes whether the public neglects key threats while focusing on others, given the widespread impact of climate change at organizational, economic, and environmental levels.
- Procedure : Latent Dirichlet Allocation (LDA) is a generative probabilistic model for topic modeling that assumes each document is a mixture of topics and each topic is a distribution over words. It acts as a form of dimensionality reduction, representing each document by its topic proportions rather than its raw word counts. The hyperparameter alpha controls document-topic density (how many topics a document tends to contain), and beta controls topic-word density (how many words a topic tends to contain). Preprocessing quality and the choice of the number of topics are key factors for meaningful results.
- Conclusion : The topic modeling analysis indicates that industrial and organizational aspects of climate change receive more attention than political or environmental concerns, with environmental issues the least discussed. Most of the derived keywords carry no evident sentiment of concern; future analysis with lexicon-based sentiment tools could help assess the subjectivity and polarity of these topics.
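
To make the LDA procedure described above more concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the pipeline used in this project: the stop-word list, the four toy tweets, the function names (tokenize, lda_gibbs, top_words), and the default values of alpha, beta, and the iteration count are all assumptions chosen for illustration. In practice a library implementation would likely be used on the full dataset.

```python
import random
import re
from collections import Counter

# Hypothetical stop-word list and toy tweets, for illustration only.
STOP_WORDS = {"the", "a", "is", "are", "to", "of", "and", "on", "in", "for", "by", "as"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    alpha: document-topic prior (smaller values favor fewer topics per document).
    beta:  topic-word prior (smaller values favor fewer words per topic).
    Returns per-document topic counts and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]           # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]    # count of word w in topic k
    topic_total = [0] * num_topics                         # total tokens in topic k
    assign = []                                            # topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word

def top_words(topic_word, n=3):
    """Most frequent words per topic, ignoring words with zero count."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]

tweets = [
    "Oil companies cut emissions targets",
    "Industry emissions rise as oil demand grows",
    "Floods and drought threaten forests",
    "Drought and wildfire destroy forests",
]
docs = [tokenize(t) for t in tweets]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

Each row of doc_topic gives the topic mixture of one tweet, and each entry of topic_word gives the word counts of one topic, which mirrors the two assumptions of the model. On a corpus this small the resulting topics are not meaningful; the sketch only shows where alpha, beta, and the number of topics enter the computation.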