h2ws/python3-crawler

 
 

Python 3 crawler: scrape data from 100 pages under the Python entry on Baidu Baike

Python 3.0. Modules used: urllib, BeautifulSoup4, re
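As a rough illustration of how `re` and `urllib` can work together in a crawler like this one, the sketch below pulls entry links out of an HTML string and resolves them against the page URL. It is not taken from the repository: the function name `extract_links`, the `/item/...` link pattern, and the sample markup are all assumptions, and the sketch sticks to the standard library even though the project also lists BeautifulSoup4 for parsing.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: assumes entry links look like href="/item/<name>".
# The real Baidu Baike markup may differ.
ITEM_LINK = re.compile(r'href="(/item/[^"#?]+)"')


def extract_links(page_url, html):
    """Return the set of absolute entry URLs found in an HTML string."""
    return {urljoin(page_url, path) for path in ITEM_LINK.findall(html)}


# Made-up sample markup, used only to exercise the function offline.
sample = (
    '<a href="/item/Guido">Guido</a> '
    '<a href="/view/1.htm">old</a> '
    '<a href="/item/CPython">CPython</a>'
)
links = extract_links("https://baike.baidu.com/item/Python", sample)
```

Returning a set means a link that appears several times on one page is only queued once.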

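Notes

The crawled data is stored in the generated output.html file, which can be viewed in a browser.

To set the number of pages to crawl, change the count value in spider_main.py: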
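One way to produce a browser-viewable `output.html` is to render the collected records as an HTML table. The sketch below is an assumption about how that could be done, not the repository's actual writer: the names `render_output` and `write_output` and the `(url, title, summary)` record shape are hypothetical.

```python
import html


def render_output(records):
    """Build an HTML page with one table row per (url, title, summary) record."""
    rows = []
    for url, title, summary in records:
        # Escape every field so scraped text cannot break the markup.
        rows.append(
            "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                html.escape(url), html.escape(title), html.escape(summary)
            )
        )
    return (
        '<html><head><meta charset="utf-8"></head><body><table>\n'
        + "\n".join(rows)
        + "\n</table></body></html>"
    )


def write_output(records, path="output.html"):
    """Write the rendered page to disk as UTF-8 so Chinese text displays correctly."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_output(records))
```

Calling `write_output(records)` after the crawl finishes would leave an `output.html` in the working directory that can be opened directly in a browser.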

```python
if count == 100:
    break
```
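To show where a `count` check like this might sit, here is a minimal sketch of a crawl loop that stops after a fixed number of pages. It is an illustration under assumptions, not the code in spider_main.py: the function `crawl`, its `fetch_links` callback, the `max_pages` parameter, and the `FAKE_SITE` link graph are all hypothetical, and the fake graph stands in for real downloads so the example runs offline.

```python
from collections import deque


def crawl(root_url, fetch_links, max_pages=100):
    """Visit pages breadth-first and stop once max_pages have been crawled."""
    pending = deque([root_url])
    seen = {root_url}
    crawled = []
    count = 0
    while pending:
        url = pending.popleft()
        crawled.append(url)
        count += 1
        # Same idea as the check above: stop at the configured limit.
        if count == max_pages:
            break
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return crawled


# Stand-in link graph so the sketch needs no network access.
FAKE_SITE = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
visited = crawl("a", lambda url: FAKE_SITE.get(url, []), max_pages=3)
```

The `seen` set keeps the crawler from revisiting a page when two entries link to each other, so the limit counts distinct pages.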

How to run

Run spider_main in an IDE (PyCharm recommended).

About

Python 3.5 crawler
