Fixed scheduler process_spider_output() to yield requests #254
Codecov Report
@@           Coverage Diff           @@
##           master    #254    +/-   ##
=======================================
  Coverage   70.15%   70.15%
=======================================
  Files          68       68
  Lines        4715     4715
  Branches      632      632
=======================================
  Hits         3308     3308
  Misses       1267     1267
  Partials      140      140
Continue to review full report at Codecov.
Hey @voith, sorry for the long absence. It looks absolutely fine. I hope your complex tests work. Ready for merge?
Hi @sibiryakov, this is ready for merge. You can see that the tests pass by looking at the builds.
Thank you! 🍺
This PR broke Frontera's behaviour. Now every yielded request ends up as a call to
Ping @sibiryakov
@isra17 I too had noticed that. Well, Frontera should have had a test case for this.
I've opened a PR to revert this change: #273
Don't worry about that; this is not the kind of issue that breaks vanilla Frontera in an obvious manner. I didn't see it until I had a middleware with some logic specific to links_extracted.
This PR probably fixes the problem: #261
There is no need to revert this change; it's a step in the right direction: we should allow other middlewares to operate on objects passed through Frontera middlewares.
Unless I'm missing something, won't #261 end up scheduling the requests twice?
@isra17 This will happen only if you have some middleware in Scrapy yielding all requests it gets. Normally, this shouldn't happen. |
@isra17 I spent more time looking into this, and I think you're right: we will get requests in two places, the temporary queue and the frontier. I'll release the fix soon.
See #276
fixes #253
Here's a screenshot using the same code discussed here.
Nothing seems to break when testing this change manually. The only test that was failing was wrong IMO because it passed a list of requests and items and was only expecting items in return. I have modified that test to make it compatible with this patch.
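To make the change in the test's expectation concrete, here is a rough sketch of the two behaviours. It is not the actual Frontera code: `Request`, `RecordingFrontier`, `items_only` and `pass_through` are hypothetical stand-ins, and the assumption that the old behaviour handed requests to the frontier is mine. The sketch does not model what the frontier receives after the patch.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical stand-in for a Scrapy request (not the real class)."""
    url: str


@dataclass
class RecordingFrontier:
    """Hypothetical stand-in for the frontier; it only records the links it is given."""
    extracted: list = field(default_factory=list)

    def links_extracted(self, links):
        self.extracted.extend(links)


def items_only(result, frontier):
    """Old expectation: requests are held back (assumed: sent to the frontier), items are yielded."""
    links = []
    for element in result:
        if isinstance(element, Request):
            links.append(element)
        else:
            yield element
    # Runs once the generator has been fully consumed.
    frontier.links_extracted(links)


def pass_through(result):
    """New expectation: requests are yielded alongside items."""
    for element in result:
        yield element


# A mixed spider output, like the input the old test used.
mixed = [Request("http://example.com/a"), {"title": "item"}, Request("http://example.com/b")]

frontier = RecordingFrontier()
old_output = list(items_only(mixed, frontier))  # only the item comes back
new_output = list(pass_through(mixed))          # requests and the item come back
```

With the old expectation a test that passes `mixed` would assert on the single item; with the new one it has to expect the two requests as well.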
I've split this PR into three commits:
A note about the tests added:
The tests might be a little difficult to understand at first sight. I would recommend reading the following code in order to understand the tests:
I have simulated the code discussed above in order to write the tests.