This project was made for the Helsinki University course Interactive Data Visualization.
The visualizations are based on the document topics in CD 2 of the Reuters Corpora. The articles are from 1997. I was inspired to explore the data more thoroughly through visualizations while using the corpus to develop a document classifier.
I did the data cleaning and exploratory data analysis in Jupyter notebooks. In some of the notebooks I also tried out different visualization techniques and ideas. Some parts of the notebooks are quite verbose and only lightly edited; in them I sketch out what might work and what does not. The parts that I decided to keep are used in the Dash application.
The online version of the application is hosted on Heroku.
To run the application locally, download the repository and install the dependencies. After that, you can start the application from the main folder:
python app.py
The project is described in more detail in the four learning diaries, which cover the four-week period of the project.