Yelp-Crawler

This script scrapes information about businesses listed on the Yelp website.

The crawler accepts as input:

Category name, e.g. contractors
Location, e.g. San Francisco, CA

It returns a file of JSON objects, each representing one business from the search results.

Each business has the following data:

Business name
Business rating
Number of reviews
Business Yelp URL
Business website
List of the first 5 reviews; for each review:
- Reviewer name
- Reviewer location
- Review date
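
The record described above could be modelled roughly as follows. This is a minimal sketch, not the script's actual code: the class names, field names, and the `to_json_line` helper are hypothetical, and the sample values are placeholders rather than a real listing.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Review:
    # Field names are assumptions; the README only lists what is collected.
    reviewer_name: str
    reviewer_location: str
    review_date: str


@dataclass
class Business:
    name: str
    rating: float
    review_count: int
    yelp_url: str
    website: Optional[str] = None  # not every business lists a website
    reviews: List[Review] = field(default_factory=list)  # at most the first 5


def to_json_line(business: Business) -> str:
    """Serialize one business as a single JSON object."""
    # asdict() recurses into nested dataclasses, so reviews become plain dicts.
    return json.dumps(asdict(business), ensure_ascii=False)


# Placeholder data for illustration only.
example = Business(
    name="Example Contractors",
    rating=4.5,
    review_count=12,
    yelp_url="https://www.yelp.com/biz/example-contractors",
    reviews=[Review("Jane D.", "San Francisco, CA", "2023-01-15")],
)
print(to_json_line(example))
```

Using dataclasses keeps the expected fields in one place, so a missing or misspelled field fails at construction time instead of silently producing an incomplete JSON object.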

Installation

1. Get an API key from https://fusion.yelp.com/
2. Create a .env file in the project directory and put your API key in it in the following format:
API_KEY=YOUR_API_KEY
3. Create a virtual environment: python -m venv venv
4. Activate it: source venv/bin/activate (Linux and macOS) or venv\Scripts\activate (Windows)
5. Install the dependencies: pip install -r requirements.txt
6. Run the script: python main.py
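
To illustrate how the API key from step 2 might be used, here is a hedged sketch that parses the .env format and builds (but does not send) a search request against the Yelp Fusion business search endpoint. The helpers `parse_env` and `build_search_request` are hypothetical names, and mapping the category name to the `term` parameter is an assumption; main.py may load the key and query the API differently.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Yelp Fusion business search endpoint.
SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def build_search_request(api_key: str, category: str, location: str) -> Request:
    """Build a GET request; the key is sent as a Bearer token."""
    # Assumption: the category name is passed as the free-text search term.
    query = urlencode({"term": category, "location": location})
    return Request(
        f"{SEARCH_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


# In practice the text would come from open(".env").read().
env = parse_env("API_KEY=YOUR_API_KEY\n")
request = build_search_request(env["API_KEY"], "contractors", "San Francisco, CA")
print(request.full_url)
```

The request is only constructed here; passing it to `urllib.request.urlopen` would perform the actual call and requires a valid key.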

Please note

The script works without a proxy, but Yelp sometimes blocks requests, in which case not all of the information is parsed. If you have access to premium proxies, using them can make the results more complete.
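
One common way to use a set of proxies is to rotate through them, one per request. The sketch below assumes a proxy list with one `host:port` entry per line (the repository contains a proxy_list.txt, but its actual format is not documented here) and produces dictionaries in the shape accepted by the `proxies` argument of the `requests` library. The function names are hypothetical and the addresses are placeholders from a documentation-only IP range.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def parse_proxy_list(text: str) -> List[str]:
    """Return non-empty, non-comment lines as proxy addresses."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def proxy_rotation(proxies: List[str]) -> Iterator[Dict[str, str]]:
    """Yield proxy settings in round-robin order, indefinitely."""
    # Assumption: plain HTTP proxies without authentication.
    for address in cycle(proxies):
        yield {"http": f"http://{address}", "https": f"http://{address}"}


# In practice the text would come from open("proxy_list.txt").read().
sample = "203.0.113.10:8080\n\n203.0.113.11:8080\n"
rotation = proxy_rotation(parse_proxy_list(sample))

for _ in range(3):
    print(next(rotation))
```

Each yielded dictionary could then be passed along with a request, for example `requests.get(url, proxies=next(rotation))`, so that a blocked proxy only affects the requests routed through it.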


luxuriant777/Yelp-Crawler

Folders and files

Latest commit

History

Repository files navigation

Yelp-Crawler

Crawler accepts as an input:

Each business has the following data:

Installation

Please pay attention

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages