MarleySpoon: add precautionary check for unexpected API URLs. #1069
Merged
Commits (15):
- 5518ae2 (jayaddison) MarleySpoon: add precautionary check for unexpected API URLs.
- 06e5bf8 (jayaddison) Fixup: linting: remove unused variable.
- cf9c059 (jayaddison) Fixup: linting: use isort to re-order imports.
- c25b0e3 (jayaddison) Fixup: linting: apply pyupgrade (py3.8+) to test module.
- 5f2a6bd (jayaddison) MarleySpoon: remove use of variable shadowing that introduce a change…
- 39cc788 (jayaddison) MarleySpoon: tests: rename test case.
- ca2154f (jayaddison) MarleySpoon: tests: add coverage relative-URL API host case.
- c24cc7b (jayaddison) MarleySpoon: tests: brevity: rename 'valid_url' to 'url'.
- eb286cb (jayaddison) MarleySpoon: adjustment: use is-same-scraper condition to decide whet…
- b06eec9 (jayaddison) MarleySpoon: exception handling: include link from raised-exception t…
- 9c94ee9 (jayaddison) MarleySpoon: fixup: add missing SCRAPERS import (localised; not ideal…
- 7561de5 (jayaddison) MarleySpoon: reduce constraint: allow less-precise matches on partial…
- 2a5e003 (jayaddison) MarleySpoon: linting: adjust code to comply with black code style rec…
- ff02a0c (jayaddison) MarleySpoon: refactor: adjust domain-climbing logic.
- 1dfd79b (jayaddison) MarleySpoon: cleanup: remove unused import.
New test fixture (the path referenced from the test module below is tests/legacy/test_data/faulty.testhtml), +6 lines:

```html
<!DOCTYPE html>
<html>
<script>
gon.current_brand="test_invalid"; gon.current_country="XX"; gon.api_token=" ".trim() || null; gon.api_host="http://api.marlarkey.invalid";
</script>
</html>
```
New test fixture (the path referenced from the test module below is tests/legacy/test_data/relative_url.testhtml), +6 lines:

```html
<!DOCTYPE html>
<html>
<script>
gon.current_brand="test_invalid"; gon.current_country="XX"; gon.api_token=" ".trim() || null; gon.api_host="relative_path/unexpected.js";
</script>
</html>
```
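Both fixtures embed `gon.*` assignments in an inline script, from which the scraper presumably reads values such as `api_host`. As a minimal sketch of that kind of extraction (an illustrative regex-based approach, not the project's actual parsing code; `extract_gon_value` is a hypothetical helper name):

```python
import re

# Script content taken from the relative_url.testhtml fixture above.
SCRIPT = 'gon.current_brand="test_invalid"; gon.current_country="XX"; gon.api_token=" ".trim() || null; gon.api_host="relative_path/unexpected.js";'

def extract_gon_value(script: str, key: str):
    """Return the double-quoted string assigned to gon.<key>, or None if absent."""
    match = re.search(rf'gon\.{re.escape(key)}="([^"]*)"', script)
    return match.group(1) if match else None

print(extract_gon_value(SCRIPT, "api_host"))  # -> relative_path/unexpected.js
```

A value extracted this way is exactly what the precautionary check in this PR has to validate before issuing a follow-up request.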
New test module, +41 lines:

```python
import unittest

import responses

from recipe_scrapers._exceptions import RecipeScrapersExceptions
from recipe_scrapers.marleyspoon import MarleySpoon


class TestFaultyAPIURLResponse(unittest.TestCase):

    @responses.activate
    def test_faulty_response(self):
        url = "https://marleyspoon.de/menu/113813-glasierte-veggie-burger-mit-roestkartoffeln-und-apfel-gurken-salat"
        with open("tests/legacy/test_data/faulty.testhtml") as faulty_data:
            faulty_response = faulty_data.read()

        responses.add(
            method=responses.GET,
            url=url,
            body=faulty_response,
        )

        with self.assertRaises(RecipeScrapersExceptions):
            MarleySpoon(url=url)

    @responses.activate
    def test_relative_api_url(self):
        url = "https://marleyspoon.de/menu/113813-glasierte-veggie-burger-mit-roestkartoffeln-und-apfel-gurken-salat"
        with open("tests/legacy/test_data/relative_url.testhtml") as relative_url_data:
            relative_url_response = relative_url_data.read()

        responses.add(
            method=responses.GET,
            url=url,
            body=relative_url_response,
        )

        with self.assertRaises(Exception):
            MarleySpoon(url=url)  # currently this raises a requests.exceptions.MissingSchema exception
```
My attempt to translate this code into a natural-language description:
When scraping a website, ensure that any additional page requests are to hosts that belong to the set of domains supported by the scraper and its subclasses.
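That description can be sketched roughly as follows. The `SCRAPERS` mapping, the function name, and the hostnames here are illustrative assumptions for this sketch, not the library's actual API; the "domain-climbing" loop mirrors the idea named in commit ff02a0c:

```python
from urllib.parse import urlparse

# Hypothetical registry: hostname -> scraper name. The real project keeps a
# SCRAPERS mapping (see commit 9c94ee9); these entries are illustrative only.
SCRAPERS = {
    "marleyspoon.com": "MarleySpoon",
    "marleyspoon.de": "MarleySpoon",
}

def check_api_host(api_url: str, expected_scraper: str) -> None:
    """Raise ValueError if api_url is relative, or if no parent domain of its
    host maps to the expected scraper."""
    parsed = urlparse(api_url)
    if not parsed.scheme or not parsed.netloc:
        # A relative api_host like "relative_path/unexpected.js" lands here,
        # instead of surfacing later as requests.exceptions.MissingSchema.
        raise ValueError(f"relative or schemeless API URL: {api_url!r}")
    host = parsed.netloc
    # Climb the domain hierarchy: api.marleyspoon.de -> marleyspoon.de -> de
    while host:
        if SCRAPERS.get(host) == expected_scraper:
            return
        _, _, host = host.partition(".")
    raise ValueError(f"unexpected API host in {api_url!r}")
```

Under these assumptions, `check_api_host("https://api.marleyspoon.de/recipes/1", "MarleySpoon")` passes, while both fixture values above ("http://api.marlarkey.invalid" and "relative_path/unexpected.js") are rejected before any follow-up request is made.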