forked from zhuyingda/webster

a reliable high-level web crawling & scraping framework for Node.js.


kelaicai/webster

 
 


Webster


Overview

Webster is a powerful and extensible web crawling framework for Node.js applications. You can use it to crawl websites and extract structured data from their pages.

Unlike other crawling frameworks, Webster can scrape content that is rendered in the browser by client-side JavaScript and Ajax requests.

Docker quick start

Pull and run the example Docker image:

docker pull zhuyingda/webster-demo
docker run -it zhuyingda/webster-demo

Here is a simple demo crawler for a Baidu search results page:

node demo_producer.js
env MOD=debug node demo_consumer.js
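
The two demo commands suggest a producer/consumer split: one process enqueues crawl tasks and another takes them off the queue and processes them, with Redis (listed under Requirements) presumably serving as the shared queue. The sketch below illustrates that general pattern only. It is not Webster's API: `TaskQueue`, `produce`, `consume`, and `fakeFetch` are hypothetical names, and an in-memory array stands in for Redis.

```javascript
// Illustrative sketch only. None of these names come from Webster;
// an in-memory array stands in for a Redis-backed task queue.
class TaskQueue {
  constructor() {
    this.tasks = [];
  }
  push(task) {
    this.tasks.push(task);
  }
  pop() {
    // Returns undefined when the queue is empty.
    return this.tasks.shift();
  }
  get size() {
    return this.tasks.length;
  }
}

// Producer side: describe what to crawl and enqueue one task per URL.
function produce(queue, urls) {
  for (const url of urls) {
    queue.push({ url, createdAt: Date.now() });
  }
  return queue.size;
}

// Consumer side: drain the queue, passing each task to a handler in order.
async function consume(queue, handler) {
  const results = [];
  let task;
  while ((task = queue.pop()) !== undefined) {
    results.push(await handler(task));
  }
  return results;
}

// Hypothetical handler: a real consumer would load and scrape the page here.
async function fakeFetch(task) {
  return { url: task.url, status: 'done' };
}
```

In a real deployment the producer and consumer would run as separate processes, as the two demo scripts do, and communicate through the external queue rather than a shared object.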

Requirements

  • Node.js 10.x or later, Redis
  • Works on Linux and macOS

Alternatively, you can deploy with Docker.

Install

npm install webster

Architecture overview

Documentation

See the documentation for more details.

Contributors

Code Contributors

This project exists thanks to all the people who contribute. [Contribute].

Financial Contributors

Become a financial contributor and help us sustain our community. [Contribute]

Individuals

Organizations

Support this project with your organization. Your logo will show up here with a link to your website. [Contribute]

License

GPL-V3

Copyright (c) 2017-present, Yingda (Sugar) Zhu
