by https://github.com/nifedara
- Moving to a new place can be daunting, so we often want to know the condition of the place before we get there. This can guide our decision and help us work out what to prepare for.
- Local quality of life and the liveability of our environment have been a major concern of late, and we need policymakers to make urgent decisions that will improve citizens' living conditions. But how do we measure the quality of our environment in a country where getting data can be very tedious?
- extract Wikipedia articles about some Nigerian states
- rank the states by liveability, based on the characteristics of the articles, across 4 indicators (Education, Environment, Infrastructure and Health).
- determine whether Wikipedia articles can be used to estimate the development of regions of the world where the United Nations Sustainable Development Goals are hard to track due to a lack of data.
- Pandas - for data cleaning/manipulation
- spaCy and NLTK (Natural Language Toolkit) - for text analysis
- scikit-learn (sklearn) - for machine learning
- TextBlob - for sentiment analysis
- Matplotlib - for visualization
- WordCloud - for Word Cloud visualization
- GeoPandas - for working with geospatial data
Where are the least and most liveable places in Nigeria?
To answer this, I analysed over 55,474 Wikipedia articles that mention Nigeria.
The approach used to estimate the liveability of Nigerian states by mining Wikipedia is divided into the following phases:
- Getting Data - Wikipedia Extraction (XML Parsing)
- Named Entity Recognition
- Machine Learning Classification
- NLP Pre-processing
- Sentiment Analysis
- Liveability Scoring and Ranking
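
The phases above can be sketched end to end on toy data. This is a minimal illustration, not the project's actual code: the sample XML, the placeholder names `State A` and `State B`, the keyword sets and the tiny sentiment lexicon are all assumptions made up for the example. Keyword matching stands in for the machine learning classifier, substring matching stands in for Named Entity Recognition, and the hand-made lexicon stands in for TextBlob, so that the sketch runs with the standard library only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny in-memory stand-in for a Wikipedia XML export (hypothetical content;
# real dumps use XML namespaces and are far larger).
SAMPLE_DUMP = """<mediawiki>
  <page><title>Schools in State A</title>
    <revision><text>State A has good schools and an excellent university.</text></revision></page>
  <page><title>Roads in State B</title>
    <revision><text>State B has poor roads and bad electricity supply.</text></revision></page>
  <page><title>Hospitals in State A</title>
    <revision><text>A hospital in State A reported poor funding.</text></revision></page>
</mediawiki>"""

STATES = ["State A", "State B"]  # hypothetical placeholders, not real states

# Assumed keyword sets: a crude substitute for a trained classifier.
INDICATOR_KEYWORDS = {
    "Education": {"schools", "university"},
    "Environment": {"pollution", "flood"},
    "Infrastructure": {"roads", "electricity"},
    "Health": {"hospital", "clinic"},
}

# Assumed mini lexicon: a crude substitute for TextBlob polarity.
POSITIVE = {"good", "excellent"}
NEGATIVE = {"poor", "bad"}


def extract_pages(xml_text):
    """Yield (title, text) for every <page> element in the dump."""
    root = ET.fromstring(xml_text)
    for page in root.iter("page"):
        yield page.findtext("title"), page.findtext("revision/text") or ""


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(".,;:()").lower() for w in text.split()]


def polarity(tokens):
    """Return a score in [-1, 1]; 0.0 when no lexicon word is present."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def score_states(xml_text):
    """Average polarity per indicator, then average the indicators per state."""
    scores = defaultdict(lambda: defaultdict(list))
    for _title, text in extract_pages(xml_text):
        tokens = tokenize(text)
        page_polarity = polarity(tokens)
        for state in STATES:
            if state.lower() not in text.lower():
                continue  # the article does not mention this state
            for indicator, keywords in INDICATOR_KEYWORDS.items():
                if keywords & set(tokens):
                    scores[state][indicator].append(page_polarity)
    totals = {}
    for state in STATES:
        per_indicator = [sum(v) / len(v) for v in scores[state].values()]
        totals[state] = sum(per_indicator) / len(per_indicator) if per_indicator else 0.0
    return totals


def rank_states(totals):
    """Order states from most to least liveable by their total score."""
    return sorted(totals, key=totals.get, reverse=True)


if __name__ == "__main__":
    totals = score_states(SAMPLE_DUMP)
    print(rank_states(totals), totals)
```

Averaging within each indicator before averaging across indicators keeps an indicator with many articles from dominating the total; whether the project weights indicators this way is not stated, so treat that choice as an assumption too.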
From the results, the following deductions were made:
- Of the 10 Nigerian states considered, Lagos state ranked as the most liveable state.
- Imo state had the highest score in the infrastructure liveability category and ranked as the second most liveable state.
- Kano state ranked as the least liveable state of the 10 states considered.
- Kaduna state ranked second to last, one position above Kano.
The liveability ranking is 80% similar to the 2019 HDI (Human Development Index) ranking of Nigerian states.
Comparing the 2019 HDI (Human Development Index) ranking of Nigerian states (list by Radboud University) with my liveability ranking shows a close similarity. In my liveability ranking, Lagos ranked the highest, and Kaduna and Kano ranked the lowest, just as in the HDI ranking.
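
How the similarity figure was computed is not described here, so the following is only a sketch of two common ways to compare a pair of rankings: the share of items that sit in the same position in both lists, and Spearman's rank correlation. The function names and the five-item rankings (`A` to `E`) are hypothetical and do not reproduce the project's data.

```python
def positional_agreement(rank_a, rank_b):
    """Fraction of items that occupy the same position in both rankings."""
    if sorted(rank_a) != sorted(rank_b):
        raise ValueError("rankings must contain the same items")
    matches = sum(a == b for a, b in zip(rank_a, rank_b))
    return matches / len(rank_a)


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    if sorted(rank_a) != sorted(rank_b):
        raise ValueError("rankings must contain the same items")
    n = len(rank_a)
    position_in_b = {item: i for i, item in enumerate(rank_b)}
    # Sum of squared position differences across all items.
    d_squared = sum((i - position_in_b[item]) ** 2 for i, item in enumerate(rank_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Made-up rankings, best first; B and C are swapped in the second list.
    liveability = ["A", "B", "C", "D", "E"]
    hdi = ["A", "C", "B", "D", "E"]
    print(positional_agreement(liveability, hdi))
    print(spearman_rho(liveability, hdi))
```

Positional agreement is easy to read as a percentage but penalises a one-place swap as heavily as a large displacement; Spearman's correlation accounts for how far each item moved.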
Wikipedia articles can be used to estimate the development of regions of the world, such as Nigeria, where the United Nations Sustainable Development Goals are hard to track due to a lack of data.
- The output can be used in recommendation systems.
- It can also be used in reports on our progress towards achieving the SDGs.
The data used can be obtained by mining Wikipedia articles.