update README.md

topiccrawler · Oct 25, 2019 · c9b5c52
1 parent ebced74
Showing 2 changed files with 18 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -4,11 +4,15 @@
 
 **2019/10/24 update: added Bilibili albums**
 
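-A JK crawler written with Scrapy. Images are sourced from Bilibili, Tumblr, and Instagram, as well as Weibo and Twitter (to be completed)
+A JK crawler written with Scrapy. Images are sourced from Bilibili, Tumblr, and Instagram, as well as Weibo and Twitter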
 
-Start the crawler:
+## Install dependencies
 
-On Windows, run the following commands in PowerShell
+`pip install -r requirements.txt`
+
+## Start the crawler
+
+*On Windows, run the following commands in PowerShell*
 
 ```shell script
 scrapy crawl api.vc.bilibili -o data/api.vc.bilibili.jsonlines
@@ -20,3 +24,5 @@ scrapy crawl sscat-xyz.tumblr -o data/sscat-xyz.tumblr.jsonlines
 ```
 
 To resume progress the next time the crawler starts, append `-s JOBDIR=crawls/{spider_name}` to the command
+
+Downloaded images are saved in `data/full`, and the related information is in `{spider_name}.jsonlines`
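
The note above says each spider's scraped information ends up in a `.jsonlines` file. As a rough sketch of how that output might be inspected, the snippet below parses such a file one JSON object per line. The helper name `load_items` is hypothetical, and the item fields are not documented in this commit, so the code only counts records rather than reading specific keys.

```python
import json
from pathlib import Path


def load_items(path):
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    items = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                items.append(json.loads(line))
    return items


if __name__ == "__main__":
    # Path taken from the `-o` argument of the README's example command.
    feed = Path("data/api.vc.bilibili.jsonlines")
    if feed.exists():
        items = load_items(feed)
        print(f"{len(items)} items scraped")
```

Reading line by line keeps this usable on a feed that a resumed crawl is still appending to, since each completed line is a standalone JSON document.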
diff --git a/genREADME.py b/genREADME.py
@@ -5,11 +5,15 @@ def genREADME():
     head = [
         '# jkcrawler',
         '',
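-        'A JK crawler written with Scrapy. Images are sourced from Bilibili, Tumblr, and Instagram, as well as Weibo and Twitter (to be completed)',
+        'A JK crawler written with Scrapy. Images are sourced from Bilibili, Tumblr, and Instagram, as well as Weibo and Twitter',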
         '',
-        'Start the crawler:',
+        '## Install dependencies',
         '',
-        'On Windows, run the following commands in PowerShell',
+        '`pip install -r requirements.txt`',
+        '',
+        '## Start the crawler',
+        '',
+        '*On Windows, run the following commands in PowerShell*',
         '',
         '```shell script',
     ]
@@ -33,6 +37,8 @@ def genREADME():
         '',
         'To resume progress the next time the crawler starts, append `-s JOBDIR=crawls/{spider_name}` to the command',
         '',
+        'Downloaded images are saved in `data/full`, and the related information is in `{spider_name}.jsonlines`',
+        '',
     ]
 
     with open('README.md', 'w') as f: