Comparazon is an open-source project aimed at scraping data from e-commerce websites like Amazon and Flipkart. It is designed to be used for price comparison and product research and is built with the Scrapy framework and Python.
At the heart of the project is Scrapy, a powerful and versatile web crawling framework designed for web scraping and data extraction. The project is written in Python, a language well known for its simplicity, flexibility, and scalability.
Comparazon's mission is to give users a website where they can compare data from e-commerce websites quickly and easily. The project targets price comparison and product research, two tasks that matter to e-commerce shoppers.
Secure authentication for users to access the website.
Authorization system to manage user access.
Efficient fetching and rendering of mobile phone data from the MongoDB database.
Automated database updates through an hourly scraping script.
User-friendly "Add to Favorites" feature for quick access.
Advanced filtering capabilities for better user experience.
Intuitive search feature to find desired content.
Comparison of the best deals from Amazon and Flipkart e-commerce websites.
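
The deal comparison listed above can be illustrated with a minimal sketch. It assumes each scraped listing carries a store name, a title, and a price; the `Offer` class, the `normalize` helper, and the title-matching rule are hypothetical and are not taken from the Comparazon code base.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    store: str    # e.g. "amazon" or "flipkart" (assumed labels)
    title: str    # product title as scraped
    price: float  # listed price


def normalize(title: str) -> str:
    """Lowercase a title and replace punctuation with spaces so similar names match."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def best_deals(offers: list[Offer]) -> dict[str, Offer]:
    """Return the cheapest offer for each normalized product title."""
    best: dict[str, Offer] = {}
    for offer in offers:
        key = normalize(offer.title)
        current = best.get(key)
        # Keep the offer if it is the first one seen or cheaper than the stored one.
        if current is None or offer.price < current.price:
            best[key] = offer
    return best
```

Matching on a normalized title is the simplest possible rule and will miss listings whose titles differ in wording; a real matcher would likely compare model numbers or other structured fields.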
## Scraping
Scraping around 300 phones in a matter of seconds
Using those phones as the reference set, scraping the same products on Flipkart
Updating the remote database
Scraper runs every hour on a Google Cloud Platform virtual machine instance via a cron job
Scraping jobs can be monitored via a dashboard service
Implemented rotating proxies to avoid blocking of the spider
Can also use Selenium to scrape dynamic websites
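
The rotating-proxy idea mentioned above can be sketched as a Scrapy-style downloader middleware. This is an illustration under assumptions, not the project's actual implementation: the class name and the round-robin strategy are hypothetical, and the proxy URLs passed in would be placeholders. The one Scrapy convention it relies on is that the proxy for a request is read from `request.meta["proxy"]`.

```python
from itertools import cycle


class RotatingProxyMiddleware:
    """Hypothetical downloader middleware that assigns proxies round-robin."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy URL is required")
        # cycle() repeats the proxy list endlessly, one entry per request.
        self._proxies = cycle(proxies)

    def process_request(self, request, spider):
        # Scrapy's built-in proxy handling reads the proxy URL from this meta key.
        request.meta["proxy"] = next(self._proxies)
        # Returning None lets Scrapy continue processing the request normally.
        return None
```

To take effect in a real project, a class like this would have to be registered in the `DOWNLOADER_MIDDLEWARES` setting and given its proxy list, for example from the project settings.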
## 🔮Future Scope
Browser extension that compares prices for the product open in the user's browser
Sending notifications to users when the price drops on favorite items
Scrape more e-commerce websites
Scrape more categories and check for data errors if any
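
As one possible shape for the planned price-drop notifications, the sketch below compares two price snapshots and reports favorites that became cheaper. The function name, the product IDs, and the snapshot layout (a mapping from product ID to price) are all assumptions made for illustration; how notifications would actually be delivered is left open.

```python
def price_drops(
    previous: dict[str, float],
    current: dict[str, float],
    favorites: set[str],
) -> dict[str, tuple[float, float]]:
    """Map each favorite product ID to (old_price, new_price) when the price fell."""
    drops: dict[str, tuple[float, float]] = {}
    for product_id in favorites:
        old = previous.get(product_id)
        new = current.get(product_id)
        # Skip products missing from either snapshot, and prices that did not fall.
        if old is not None and new is not None and new < old:
            drops[product_id] = (old, new)
    return drops
```

Run after each hourly scrape, a check like this would yield the items for which a user should be notified.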
## 💸Applications
Comparazon is an open-source project for collecting data from e-commerce websites such as Amazon and Flipkart. It is built with the Scrapy framework and Python, and it provides an easy-to-use API service for accessing the scraped data. The data can be exported in JSON format, so users can load and analyse it with the tools they prefer.
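
A minimal sketch of what a JSON export could look like, using only Python's standard `json` module. The field names (`title`, `store`, `price`) and the `to_json` helper are hypothetical; the actual schema of the scraped data is not described here.

```python
import json


def to_json(products: list[dict]) -> str:
    """Serialize scraped product records to a JSON string, cheapest first."""
    ordered = sorted(products, key=lambda item: item["price"])
    # ensure_ascii=False keeps non-ASCII characters (e.g. currency symbols) readable.
    return json.dumps(ordered, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    sample = [
        {"title": "Phone B", "store": "flipkart", "price": 15999},
        {"title": "Phone A", "store": "amazon", "price": 12999},
    ]
    print(to_json(sample))
```

Because the output is plain JSON, it can be loaded back with `json.loads` or read by most data analysis tools.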
Comparazon gives e-commerce researchers, data analysts, and businesses a tool for collecting up-to-date data from e-commerce websites, refreshed by the hourly scraper. This helps them make informed decisions about which products to buy and at what price, which can in turn improve their e-commerce strategy. In the future, the project intends to scrape and crawl more e-commerce websites and to cover more product categories, such as laptops, accessories, and footwear.
Comparazon's website gives users an easy-to-use interface for working with the collected data. It is designed so that users can access and analyse the data in a meaningful way: identifying trends, comparing prices, and making informed purchasing decisions.