Laundry update #105
base: master
ymann commented Mar 17, 2018
- Updated scraper to scrape homepage instead of different pages for each hall
- Added option to request multiple halls
- Removed broken code
penn/laundry.py
Outdated
detailed = []

rows = soup.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    if len(cols) > 1:
    if len(cols) == 1 and len(cols[0].find_all('center')) == 1 and len(cols[0].find_all('center')[0].find_all('h2')) == 1: # Title element
Split this condition across 2 lines to stay within the 80-character limit for style purposes.
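One way to keep the title-row check under 80 characters is to move it into a small helper. The sketch below is only an illustration: `is_title_row` is a hypothetical name that does not appear in this PR, and it assumes `cols` is the list returned by `row.find_all('td')`.

```python
def is_title_row(cols):
    """Return True if `cols` looks like a hall title row.

    Mirrors the original one-line condition: exactly one <td>, holding
    exactly one <center>, which in turn holds exactly one <h2>.
    `is_title_row` is a hypothetical helper, not part of the PR.
    """
    if len(cols) != 1:
        return False
    centers = cols[0].find_all('center')
    if len(centers) != 1:
        return False
    return len(centers[0].find_all('h2')) == 1
```

The loop body could then read `if is_title_row(cols):`, which also avoids calling `find_all('center')` twice on the same cell.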
penn/laundry.py
Outdated
detailed = []

rows = soup.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    if len(cols) > 1:
    if len(cols) == 1 and len(cols[0].find_all('center')) == 1 and len(cols[0].find_all('center')[0].find_all('h2')) == 1: # Title element
        if(cols[0].find_all('center')[0].find_all('h2')[0].find_all('a')[0].getText() == hall): # Check if found correct hall
if-else clause can be shortened: found_hall =
Make some style changes to respect the 80-character limit.
The endpoint is still slow even though the data is now retrieved only once. Retrieving multiple halls is currently O(n^2), but could be refactored to be O(n). HTML parsing is likely a relatively expensive operation, which may explain why the endpoint is still slow.
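A rough sketch of the single-pass idea, under stated assumptions: the parsed table rows are represented here as plain `("title", hall_name)` and `("machine", data)` tuples, which is a hypothetical stand-in for the `<tr>` elements, and `group_rows_by_hall` is an illustrative name rather than anything in `penn/laundry.py`. The point is that every requested hall is filled in during one walk over the rows, instead of rescanning the rows once per hall.

```python
def group_rows_by_hall(rows, wanted):
    """Collect machine rows under each requested hall in one pass.

    `rows` is an iterable of ("title", hall_name) or ("machine", data)
    tuples (a hypothetical simplification of the parsed HTML rows).
    `wanted` is an iterable of hall names to keep.
    """
    wanted = set(wanted)  # constant-time membership tests
    result = {hall: [] for hall in wanted}
    current = None  # requested hall whose section we are inside, if any
    for kind, value in rows:
        if kind == "title":
            # A title row starts a new section; track it only if requested.
            current = value if value in wanted else None
        elif current is not None:
            result[current].append(value)
    return result
```

With this shape, the cost grows with the number of rows rather than with rows times halls, provided the page is parsed only once before the loop.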