Web Scraping Top 500 Book Details from Goodreads Using Python

With BeautifulSoup and Requests

Goodreads.com hosts a comprehensive list of top-rated books, as voted on by the general Goodreads community.

We will use Python, BeautifulSoup and Requests to scrape the first 5 pages of this list and build a list of the top 500 books, along with some interesting information about each one.

Outline of the Project:

Explore and scrape information from one page
Download a single page from goodreads.com and store it
Parse the stored page and extract the required data from it with BeautifulSoup
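
The single-page step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names `download_page` and `load_page`, the `User-Agent` header, and the timeout value are all assumptions, and the URL passed in is whatever Goodreads list page you want to explore.

```python
import requests
from bs4 import BeautifulSoup


def download_page(url, path):
    """Download one page and store its HTML in a local file (hypothetical helper)."""
    # A browser-like User-Agent is an assumption; some sites reject the default one.
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    with open(path, "w", encoding="utf-8") as f:
        f.write(response.text)
    return path


def load_page(path):
    """Read a stored HTML file and parse it into a BeautifulSoup document."""
    with open(path, encoding="utf-8") as f:
        return BeautifulSoup(f.read(), "html.parser")
```

Storing the page first means the exploration can be repeated on the local copy without requesting the site again.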

Create a dictionary to store the book information
Write separate functions that each scrape one piece of information from the BeautifulSoup document and add it to the dictionary
Repeat this for any number of pages by appending new items to the dictionary
Store the result in a pandas DataFrame
Save the DataFrame to a CSV file
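
The remaining steps could look something like the sketch below. It is illustrative only: the CSS classes `bookTitle` and `authorName`, the assumption that each book sits in its own table row, and all function names are guesses for the sake of the example, so check them against the page you actually downloaded. Two tiny inline HTML snippets stand in for the downloaded pages so the example runs without network access.

```python
import pandas as pd
from bs4 import BeautifulSoup


def get_book_rows(doc):
    """Return the table rows that contain a book (assumed markup)."""
    return [row for row in doc.find_all("tr") if row.select_one("a.bookTitle")]


def get_title(row):
    """Extract the book title from one row."""
    return row.select_one("a.bookTitle").get_text(strip=True)


def get_author(row):
    """Extract the author name from one row, or None if it is missing."""
    tag = row.select_one("a.authorName")
    return tag.get_text(strip=True) if tag else None


def add_page(books, doc):
    """Append every book found in one parsed page to the dictionary."""
    for row in get_book_rows(doc):
        books["title"].append(get_title(row))
        books["author"].append(get_author(row))


def build_dataframe(docs):
    """Scrape any number of parsed pages into a single DataFrame."""
    books = {"title": [], "author": []}
    for doc in docs:
        add_page(books, doc)
    return pd.DataFrame(books)


def save_books(df, path):
    """Write the DataFrame to a CSV file without the index column."""
    df.to_csv(path, index=False)


# Stand-in pages (made-up data) in place of the real downloaded HTML.
sample_pages = [
    "<table>"
    "<tr><td><a class='bookTitle'>Book A</a><a class='authorName'>Author A</a></td></tr>"
    "<tr><td><a class='bookTitle'>Book B</a><a class='authorName'>Author B</a></td></tr>"
    "</table>",
    "<table>"
    "<tr><td><a class='bookTitle'>Book C</a><a class='authorName'>Author C</a></td></tr>"
    "</table>",
]
docs = [BeautifulSoup(html, "html.parser") for html in sample_pages]
books_df = build_dataframe(docs)
```

Calling `save_books(books_df, "books.csv")` would then write the result to disk. Extracting values row by row keeps the title and author lists the same length, which `pd.DataFrame` requires when it is built from a dictionary of lists.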
