EcomScraper is a Python project for scraping product data from various e-commerce platforms, including Shopify, Wix, and WooCommerce. The project is organized into modules for the scrapers, database management, utilities, and configuration.
## Project Structure

```
EcomScraper/
├── custom_scrappers/
│   ├── bobbi_brown_scrapper.py
│   └── scraper.py
├── db/
│   ├── database.py
│   └── images/
├── shopify_scraper/
│   └── scraper.py
├── utils/
│   └── utils.py
├── wix_scrapper/
│   └── scraper.py
├── woocommerce_scrapper/
│   └── scraper.py
├── .env
├── .env.sample
├── .gitignore
├── config.py
├── database.sqlite3
├── main.py
└── requirements.txt
```
### `custom_scrappers/`

This folder contains custom scrapers for specific websites.

- `bobbi_brown_scrapper.py`: A scraper tailored for the Bobbi Brown e-commerce site.
- `scraper.py`: General scraper script for custom websites.
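As an illustration of the kind of parsing a custom scraper performs, here is a minimal stdlib-only extractor. The project's actual scrapers may use third-party libraries such as `requests` or BeautifulSoup; this class and function are hypothetical, not taken from `scraper.py`:

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the page <title> text, as a stand-in for real product extraction."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Parse an HTML document and return its <title> contents."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()
```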
### `db/`

Database management module for handling and storing scraped data.

- `database.py`: Database connection and query functions.
- `images/`: Directory for downloaded product images.
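A minimal sketch of what the helpers in `database.py` might look like; the table name and columns here are assumptions, not the project's actual schema:

```python
import sqlite3


def initialize_db(db_path: str) -> sqlite3.Connection:
    """Create the products table if it does not exist and return a connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS products (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               url TEXT,
               title TEXT,
               price TEXT
           )"""
    )
    conn.commit()
    return conn


def save_product(conn: sqlite3.Connection, url: str, title: str, price: str) -> None:
    """Insert one scraped product row."""
    conn.execute(
        "INSERT INTO products (url, title, price) VALUES (?, ?, ?)",
        (url, title, price),
    )
    conn.commit()
```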
### `shopify_scraper/`

Contains scripts specific to scraping Shopify-based websites.

- `scraper.py`: Scrapes Shopify products, categories, and related data.
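Many Shopify storefronts expose a public `products.json` endpoint that scrapers typically paginate through. A minimal sketch of that approach (the function names are illustrative, not taken from the project's `scraper.py`):

```python
import json
from urllib.request import urlopen


def products_json_url(store_url: str, page: int = 1, limit: int = 250) -> str:
    """Build the URL for Shopify's public products.json endpoint."""
    return f"{store_url.rstrip('/')}/products.json?limit={limit}&page={page}"


def fetch_products(store_url: str, page: int = 1) -> list[dict]:
    """Fetch one page of products from a Shopify store (performs a network call)."""
    with urlopen(products_json_url(store_url, page)) as resp:
        return json.load(resp)["products"]
```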
### `utils/`

Utility functions used across the project for various helper tasks.

- `utils.py`: General utility functions to support scraping tasks.
### `wix_scrapper/`

Module designed for scraping data from Wix-based websites.

- `scraper.py`: Wix scraper for extracting products and related data.
### `woocommerce_scrapper/`

Contains scripts to scrape data from WooCommerce-based websites.

- `scraper.py`: WooCommerce scraper script.
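WooCommerce sites that enable the Store API expose a public products endpoint under `/wp-json/wc/store/v1/`. A hedged sketch of how such a scraper might build its request URLs; the function name is illustrative, and the project's actual scraper may instead parse the rendered HTML:

```python
def store_api_url(site_url: str, page: int = 1, per_page: int = 100) -> str:
    """Build a URL for WooCommerce's public Store API products endpoint.

    Assumes the Store API is enabled on the target site.
    """
    base = site_url.rstrip("/")
    return f"{base}/wp-json/wc/store/v1/products?per_page={per_page}&page={page}"
```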
### `main.py`

Main entry point of the project that initiates the scraping process. It includes the following key components:

- `initialize_db()`: Initializes the database for storing scraped data.
- `process_website()`: Determines the platform type (Shopify, WooCommerce, Wix, or custom) and calls the appropriate scraper function.
- Uses `concurrent.futures` for parallel processing of multiple websites.
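The dispatch-and-parallelize flow described above could be sketched like this. The function bodies are assumptions, not the project's actual code, and `detect_platform` is a deliberately crude URL-based heuristic (the real `process_website()` presumably inspects the page itself):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def detect_platform(url: str) -> str:
    """Rough platform guess from the URL alone (placeholder heuristic)."""
    if "myshopify.com" in url:
        return "shopify"
    if "wixsite.com" in url:
        return "wix"
    return "custom"


def process_website(url: str) -> str:
    """Dispatch to the scraper matching the detected platform."""
    platform = detect_platform(url)
    # In the real project this would call the scraper module for `platform`.
    return f"{url}: handled by {platform} scraper"


def run_all(urls: list[str], max_workers: int = 4) -> list[str]:
    """Scrape all configured websites in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process_website, u) for u in urls]
        return [f.result() for f in as_completed(futures)]
```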
### `config.py`

Configuration file that defines the list of websites to scrape and the database path.

- `WEBSITES`: List of URLs to be scraped. Add or remove URLs in this list as required.
- `DB_PATH`: Path to the SQLite database, fetched from the `SQLITE_DB_PATH` environment variable.
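Based on the description above, `config.py` likely has roughly this shape; the example URLs and the local fallback path are illustrative, not taken from the project:

```python
# config.py -- hypothetical shape based on the description above
import os

# URLs to be scraped; add or remove entries as required.
WEBSITES = [
    "https://example-shop.myshopify.com",
    "https://example-store.com",
]

# SQLite path is read from the environment (fallback path is an assumption).
DB_PATH = os.getenv("SQLITE_DB_PATH", "database.sqlite3")
```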
### Environment files

Environment configuration files:

- `.env`: Contains sensitive environment variables (e.g., `SQLITE_DB_PATH` for the database path).
- `.env.sample`: Sample environment file to set up the necessary variables.
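A minimal `.env` might contain just the one variable; the path shown here is illustrative:

```shell
SQLITE_DB_PATH=./database.sqlite3
```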
### `requirements.txt`

File listing all the Python packages required to run the project.
## Installation

- Clone the repository:
  ```bash
  git clone https://github.com/yourusername/EcomScraper.git
  ```
- Navigate to the project directory:
  ```bash
  cd EcomScraper
  ```
- Install the dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables:
  - Rename `.env.sample` to `.env`.
  - In the `.env` file, define `SQLITE_DB_PATH` with the path to your SQLite database.
- Update `config.py`:
  - Add URLs of e-commerce websites to the `WEBSITES` list.
- Run the main script to start scraping:
  ```bash
  python main.py
  ```
## Usage

To initiate scraping, run `main.py`. The script checks each website's platform and uses the appropriate scraper function. Scraping progress and any errors are displayed in the console output.
## License

This project is licensed under the MIT License.