This project is a Python script that scrapes the top movies data from IMDb and performs data analysis and visualization on the scraped data. The script uses the requests, BeautifulSoup, pandas, seaborn, matplotlib, and plotly libraries to scrape, clean, and visualize the data.

deepanshubaghel/Imdb-Top-Rated-Movies-Scrapping-And-Visualization

IMDB Top Movies Scraper and Data Analysis

This project is a Python script that scrapes top-rated movie data from IMDb's chart page, then analyzes and visualizes it. The script uses requests and BeautifulSoup for scraping, pandas for cleaning, and seaborn, matplotlib, and plotly for visualization.

The script performs the following tasks:

Scrapes the top movies data from IMDb's chart page, including movie names, ratings, vote counts, and release years.

Cleans and prepares the scraped data for analysis by converting data types, removing missing values, and formatting the data.

Visualizes the distribution of ratings, vote counts, and release years using histograms and scatter plots.

Creates an interactive scatter plot of ratings vs. vote counts and a 3D scatter plot of ratings vs. movies vs. release year.
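As an illustration of the cleaning step listed above, here is a minimal sketch using pandas. It is not the script's actual code: the column names, the sample rows, and the helper names `parse_votes` and `clean` are all hypothetical, and the string formats (such as "2.9M" or "(1994)") are assumptions about what scraped text might look like.

```python
import pandas as pd

# Hypothetical raw rows, shaped like text pulled from a chart page.
# Movie names and values are placeholders, not real scraped data.
raw = pd.DataFrame({
    "name": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "rating": ["9.3", "9.2", None, "8.9"],
    "votes": ["2.9M", "2,000,000", "850K", "(1.1M)"],
    "year": ["1994", "1972", "2008", "(1994)"],
})


def parse_votes(text):
    """Convert strings like '2.9M', '850K', or '2,000,000' to an integer."""
    text = text.strip().strip("()").replace(",", "").upper()
    multiplier = 1
    if text.endswith("M"):
        multiplier, text = 1_000_000, text[:-1]
    elif text.endswith("K"):
        multiplier, text = 1_000, text[:-1]
    return int(round(float(text) * multiplier))


def clean(df):
    """Convert columns to numeric types and drop rows with missing values."""
    out = df.copy()
    # Unparseable or missing ratings become NaN instead of raising.
    out["rating"] = pd.to_numeric(out["rating"], errors="coerce")
    out["votes"] = out["votes"].map(parse_votes)
    # Pull the four-digit year out of strings such as "(1994)".
    year_text = out["year"].str.extract(r"(\d{4})")[0]
    out["year"] = pd.to_numeric(year_text, errors="coerce")
    out = out.dropna(subset=["rating", "year"])
    out["year"] = out["year"].astype(int)
    return out.reset_index(drop=True)


movies = clean(raw)
print(movies.dtypes)
print(movies)
```

Coercing with `errors="coerce"` and dropping afterwards keeps one malformed row from stopping the whole run, at the cost of silently discarding it; logging the dropped rows would be a reasonable addition.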

The script also saves the scraped data to a CSV file for future use.
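The save step can be sketched as follows. This is an assumption-laden example rather than the script's own code: it uses the standard library `csv` module so that it runs without extra dependencies, whereas a pandas-based script would more likely call `DataFrame.to_csv(path, index=False)`. The function name `save_movies`, the file name, and the sample rows are hypothetical.

```python
import csv
import os
import tempfile


def save_movies(rows, path):
    """Write a list of movie dicts to a CSV file; return True if it exists."""
    fieldnames = ["name", "rating", "votes", "year"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return os.path.exists(path)


# Placeholder rows standing in for cleaned scraped data.
rows = [
    {"name": "Movie A", "rating": 9.3, "votes": 2900000, "year": 1994},
    {"name": "Movie B", "rating": 9.2, "votes": 2000000, "year": 1972},
]

# Hypothetical output location; a temp directory avoids cluttering the repo.
csv_path = os.path.join(tempfile.gettempdir(), "imdb_top_movies_example.csv")
created = save_movies(rows, csv_path)
print("CSV file created" if created else "CSV file not created")
```

Writing with `newline=""` follows the `csv` module's recommendation and avoids blank lines between rows on Windows.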

To run the script, simply execute the Python file. The script will print the status of the CSV file creation and display the visualizations.

This project can be useful for data analysis enthusiasts, students, or anyone interested in learning web scraping and data visualization using Python. The script can be modified to scrape and analyze data from other websites or to perform different types of data analysis.

To learn more about the libraries used in this project, refer to their official documentation:

requests: https://docs.python-requests.org/en/master/

BeautifulSoup: https://www.crummy.com/software/BeautifulSoup/bs4/doc/

pandas: https://pandas.pydata.org/docs/

seaborn: https://seaborn.pydata.org/

matplotlib: https://matplotlib.org/stable/contents.html

plotly: https://plotly.com/python/
