Najsztub/ScrapyDownload

Automatic Scrapinghub data downloader

In the beginning, Scrapinghub allowed jobs to be scheduled automatically on the server side. Some time ago they changed their policy and disabled automatic scheduling on the free plan. This short script uses Crontab and Python to schedule the scraper automatically and download the data from the server afterwards.

The downloaded data is then uploaded into a MongoDB database.

This configuration is specific to my needs, but I think it can be easily adapted.

Requirements

The script requires a couple of environment variables to be set:

  • SCH_API: Scrapinghub API key
  • SCH_PROJECT: The project identifier
  • SCH_SPIDER_NAME: Name of the spider to run
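
A minimal sketch of how these variables can drive the workflow, assuming the `scrapinghub` and `pymongo` client libraries; the database and collection names (`scrapy`, `items`) and the helper names are placeholders, not part of the original script:

```python
import os

REQUIRED_VARS = ("SCH_API", "SCH_PROJECT", "SCH_SPIDER_NAME")


def get_config():
    """Read the required environment variables, failing fast if any is missing."""
    missing = [v for v in REQUIRED_VARS if v not in os.environ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {v: os.environ[v] for v in REQUIRED_VARS}


def run_and_download(cfg):
    """Schedule the spider, fetch its items, and upload them to MongoDB."""
    # Third-party imports kept local so get_config() works without them installed.
    from scrapinghub import ScrapinghubClient  # pip install scrapinghub
    from pymongo import MongoClient            # pip install pymongo

    client = ScrapinghubClient(cfg["SCH_API"])
    project = client.get_project(int(cfg["SCH_PROJECT"]))
    job = project.jobs.run(cfg["SCH_SPIDER_NAME"])  # schedule the spider

    # ...wait for the job to finish before fetching items (omitted here)...
    items = list(job.items.iter())
    if items:
        # Placeholder database/collection names; adjust to your setup.
        MongoClient()["scrapy"]["items"].insert_many(items)


# Typical cron entry point:
# run_and_download(get_config())
```

In a crontab, the variables can be set inline before invoking the script, e.g. `SCH_API=... SCH_PROJECT=... SCH_SPIDER_NAME=... python run.py`.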

This software is released under the MIT license. Feel free to use it.
