Found a table-of-contents loading bug #604

Open
ladia opened this issue Jul 27, 2022 · 3 comments

ladia commented Jul 27, 2022

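This happens in both 2.0 and 3.0. The table-of-contents pagination on https://www.bzxsw.com changed slightly in the site's latest update: there is now a "jump to all" link, which actually jumps to page 2. Suppose the pagination rule is written as .btn-mulu@href||text.下一页@href. On this site the 下一页 (next page) link is still present after the real last page has been reached. For example, https://www.bzxsw.com/index/5983/4/ has a next-page link to https://www.bzxsw.com/index/5983/5/, which shows the first page of the table of contents again. The app's (阅读) table-of-contents algorithm then removes the duplicate entries, and when the book source is tested, the chapter shown as the first one is actually chapter 41, while chapters 1-40 end up at the end (in 3.0, the content test itself already shows chapter 41). The book source in question: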
{
"bookSourceGroup": "",
"bookSourceName": "百书楼",
"bookSourceUrl": "https://www.bzxsw.com",
"enable": true,
"httpUserAgent": "",
"loginUrl": "",
"ruleBookAuthor": "tag.p.0@a@text",
"ruleBookContent": "id.htmlContent@html",
"ruleBookContentReplace": "",
"ruleBookKind": "##:cate.{15}(.?)小?说?"(?:[^\"]"){19}(连载|[^\"])[^\\d]([^\\s])##$1,$2,$3###",
"ruleBookLastChapter": ".red@text",
"ruleBookName": "h1@text",
"ruleBookUrlPattern": ".*/go/\d+/",
"ruleChapterList": ".mulu_list@li@a",
"ruleChapterName": "text",
"ruleChapterUrl": "",
"ruleChapterUrlNext": ".btn-mulu@href||text.下一页@href",
"ruleContentUrl": "href",
"ruleContentUrlNext": "##="(\/yuedu\/\d+\/\d+\d+.html)"[^>]>下一章##$1###",
"ruleCoverUrl": "tag.img.0@src",
"ruleFindUrl": "",
"ruleIntroduce": "id.intro@text",
"ruleSearchAuthor": ".sp_4@text",
"ruleSearchCoverUrl": "a!1@href\nvar a=result.match(/\/go\/(\d+)\//)[1];\nvar b=Math.floor(a/1000);\n"https://www.bzxsw.com/files/article/image/\"+b+\"/\"+a+\"/\"+a+\"s.jpg\"\n",
"ruleSearchIntroduce": "",
"ruleSearchKind": ".sp_6@text",
"ruleSearchLastChapter": "a!0@text",
"ruleSearchList": ".gx_cont@li!0",
"ruleSearchName": "a!1@text",
"ruleSearchNoteUrl": "a@href",
"ruleSearchUrl": "/search.html@s=searchKey",
"serialNumber": 0,
"weight": 0
}
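
To make the reported ordering easier to follow, here is a minimal Python simulation of the wraparound described above. It is only a sketch under stated assumptions: the chapter URL pattern, the 40-chapters-per-page layout, and the "keep the last occurrence" dedupe are all hypothetical, chosen because they reproduce the symptom (chapter 41 first, chapters 1-40 at the end). It is not taken from the app's source. `crawl_guarded` shows one possible safeguard: stop when a page contributes no unseen chapter.

```python
def build_pages(page_count=4, per_page=40):
    """Hypothetical catalog: page p lists chapters (p-1)*per_page+1 .. p*per_page."""
    return {
        p: [f"/yuedu/5983/{n}.html"  # URL pattern is an assumption
            for n in range((p - 1) * per_page + 1, p * per_page + 1)]
        for p in range(1, page_count + 1)
    }


def fetch(pages, page_no):
    """Simulated site: any page past the real last one shows page 1 again."""
    return pages.get(page_no, pages[1])


def crawl_naive(pages, extra_pages=1):
    """Follow 'next page' one step past the last real page, then dedupe."""
    collected = []
    for page_no in range(1, len(pages) + 1 + extra_pages):
        collected.extend(fetch(pages, page_no))
    # Assumed dedupe strategy: keep the LAST occurrence of each URL.
    seen, result = set(), []
    for url in reversed(collected):
        if url not in seen:
            seen.add(url)
            result.append(url)
    result.reverse()
    return result


def crawl_guarded(pages, max_pages=50):
    """Stop as soon as a page contributes no unseen chapter URL."""
    seen, result = set(), []
    for page_no in range(1, max_pages + 1):
        fresh = [u for u in fetch(pages, page_no) if u not in seen]
        if not fresh:
            break
        seen.update(fresh)
        result.extend(fresh)
    return result


pages = build_pages()
naive = crawl_naive(pages)
guarded = crawl_guarded(pages)
```

With these assumptions, `naive` starts at chapter 41 and ends with chapters 1-40, while `guarded` keeps the natural order. Comparing page URLs alone would not catch the loop, since /5/ and /1/ are different URLs serving the same list.
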
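The full table of contents only loads when the pagination rule is written as .btn-mulu@href||option!0:1:2@value; otherwise TOC pages go missing. However, a page such as https://www.bzxsw.com/index/5983/1/ has only 5 option elements, so excluding just the original TOC page and the current page with option!0:1@href should be enough to fetch the remaining three TOC pages. In actual testing the last TOC page is missing. I am not sure whether this is a bug.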
The following source does load the full table of contents:
{
"bookSourceGroup": "",
"bookSourceName": "百书楼",
"bookSourceUrl": "https://www.bzxsw.com",
"enable": true,
"httpUserAgent": "",
"loginUrl": "",
"ruleBookAuthor": "tag.p.0@a@text",
"ruleBookContent": "id.htmlContent@html",
"ruleBookContentReplace": "",
"ruleBookKind": "##:cate.{15}(.?)小?说?"(?:[^\"]"){19}(连载|[^\"])[^\\d]([^\\s])##$1,$2,$3###",
"ruleBookLastChapter": ".red@text",
"ruleBookName": "h1@text",
"ruleBookUrlPattern": ".*/go/\d+/",
"ruleChapterList": ".mulu_list@li@a",
"ruleChapterName": "text",
"ruleChapterUrl": "",
"ruleChapterUrlNext": ".btn-mulu@href||option!0:1:2@value",
"ruleContentUrl": "href",
"ruleContentUrlNext": "##="(\/yuedu\/\d+\/\d+\d+.html)"[^>]*>下一章##$1###",
"ruleCoverUrl": "tag.img.0@src",
"ruleFindUrl": "",
"ruleIntroduce": "id.intro@text",
"ruleSearchAuthor": ".sp_4@text",
"ruleSearchCoverUrl": "a!1@href\nvar a=result.match(/\/go\/(\d+)\//)[1];\nvar b=Math.floor(a/1000);\n"https://www.bzxsw.com/files/article/image/\"+b+\"/\"+a+\"/\"+a+\"s.jpg\"\n",
"ruleSearchIntroduce": "",
"ruleSearchKind": ".sp_6@text",
"ruleSearchLastChapter": "a!0@text",
"ruleSearchList": ".gx_cont@li!0",
"ruleSearchName": "a!1@text",
"ruleSearchNoteUrl": "a@href",
"ruleSearchUrl": "/search.html@s=searchKey",
"serialNumber": 0,
"weight": 0
}

ladia commented Jul 27, 2022

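To expand on the case mentioned above: https://www.bzxsw.com/go/5983/ only loads the full table of contents with option!0:1:2, whereas https://www.bzxsw.com/go/77/ only loads every TOC page with option!0:1. If the two rules are swapped, each book loses its last TOC page, which is very strange.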
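
For reference, the expectation behind `option!0:1` can be written out as plain index exclusion. This is a sketch of how the `!` syntax reads in the comments above, not the app's actual parser; `exclude_indices` and the placeholder option values are hypothetical.

```python
def exclude_indices(items, spec):
    """Drop the positions listed in a colon-separated spec such as "0:1:2"."""
    dropped = {int(part) for part in spec.split(":")}
    return [item for i, item in enumerate(items) if i not in dropped]


# Placeholder values standing in for the 5 option elements on one TOC page.
options = ["opt0", "opt1", "opt2", "opt3", "opt4"]

kept_two_excluded = exclude_indices(options, "0:1")
kept_three_excluded = exclude_indices(options, "0:1:2")
```

Under this reading, `!0:1` keeps three options and `!0:1:2` keeps two, so a fixed exclusion list can only be right for one particular page layout. That may be related to why the two books need different rules, though the thread does not confirm the cause.
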

ladia commented Jul 27, 2022

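The built-in simplified syntax has this bug in both 2.0 and 3.0. Only when the pagination rule is written in XPath, as //*[@class="btn-mulu"]/@href||//option[contains(@selected,"selected")]/following-sibling::option/@value, do all pages load for both books.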
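
The option part of that XPath selects relative to the currently selected page rather than by fixed index. A rough Python equivalent is sketched below, using only the standard library (whose XPath subset has no following-sibling axis, so the siblings are walked by hand). The `<select>` markup and its values are hypothetical stand-ins for the site's page selector.

```python
import xml.etree.ElementTree as ET


def pages_after_selected(select_markup):
    """Return the value of every option that follows the selected one."""
    root = ET.fromstring(select_markup)
    options = root.findall("option")
    for i, opt in enumerate(options):
        # Mirrors contains(@selected, "selected") from the XPath rule.
        if "selected" in (opt.get("selected") or ""):
            return [o.get("value") for o in options[i + 1:]]
    return []


# Hypothetical selector markup; page 2 is the current page.
markup_page2 = (
    "<select>"
    '<option value="/index/5983/1/">1</option>'
    '<option value="/index/5983/2/" selected="selected">2</option>'
    '<option value="/index/5983/3/">3</option>'
    '<option value="/index/5983/4/">4</option>'
    "</select>"
)
# Same selector with the last page marked as current.
markup_page4 = markup_page2.replace(
    ' selected="selected"', ""
).replace('value="/index/5983/4/"', 'value="/index/5983/4/" selected="selected"')

remaining_from_page2 = pages_after_selected(markup_page2)
remaining_from_page4 = pages_after_selected(markup_page4)
```

Because the result depends only on which option is selected, the same rule fits books with different page counts, and on the last page it yields nothing further to follow.
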
