Horoscope plugin's HTML scraping no longer matches the page we scrape #199

Open
nasonfish opened this issue Oct 15, 2015 · 4 comments

@nasonfish
Member

We scrape HTML from a site, and that site changed its markup so that the classes we look for in horoscope.py no longer match. As a result we can never find a sign, and the command returns an error no matter what.
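To illustrate the failure mode, here is a minimal sketch of a check that reports which expected tag/class pairs are missing from a page. It uses only the standard library. The helper name `missing_selectors`, the sample markup, and the class names in `CHANGED_PAGE` are hypothetical; the class names in `REQUIRED` are the ones the updated plugin in this thread looks for.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every (tag, class) pair that appears in a document."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # A class attribute may hold several space-separated class names.
            if name == "class" and value:
                for cls in value.split():
                    self.seen.add((tag, cls))


def missing_selectors(html, required):
    """Return the (tag, class) pairs from `required` that are absent from `html`."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return [pair for pair in required if pair not in collector.seen]


# Hypothetical sample markup, not copied from the real site.
MATCHING_PAGE = '<h1 class="f40">Aries</h1><div class="block-horoscope-text">text</div>'
CHANGED_PAGE = '<h1 class="title">Aries</h1><div class="horoscope-body">text</div>'

REQUIRED = [("h1", "f40"), ("div", "block-horoscope-text")]
```

Running a check like this against a known-good sign would let the plugin report "page layout changed" instead of a generic error, though it does nothing to reduce the maintenance burden of scraping itself.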

@nasonfish nasonfish changed the title Horoscope plugin doesn't work Horoscope plugin's HTML scraping no longer matches the page we scrape Oct 15, 2015
@edwardslabs
Member

I think we should switch away from HTML scraping if possible. It looks like there are a few APIs available. This was the top Google hit, and it seems like it could work: https://github.com/tapasweni-pathak/Horoscope-API

@dmptrluke
Member

That's a web frontend for https://testpypi.python.org/pypi/horoscope, which
rips from Ganeshaspeaks... sooo, not really better.

@edwardslabs
Member

If there is no free API, maybe horoscope gets dropped, since maintaining an HTML scraping plugin can be pretty burdensome.

@nasonfish
Member Author

For what it's worth, here's an updated version of the plugin. Where to go from here is debatable, though: should we keep supporting this site or not?

# Plugin by Infinity - <https://github.com/infinitylabs/UguuBot>

import requests
from bs4 import BeautifulSoup

from cloudbot import hook
from cloudbot.util import formatting


@hook.on_start()
def init(db):
    db.execute("create table if not exists horoscope(nick primary key, sign)")
    db.commit()


@hook.command(autohelp=False)
def horoscope(text, db, bot, notice, nick):
    """<sign> - get your horoscope"""

    headers = {'User-Agent': bot.user_agent}

    # check if the user asked us not to save their details
    dontsave = text.endswith(" dontsave")
    if dontsave:
        sign = text[:-9].strip().lower()
    else:
        # normalise here too, so the URL and the saved sign are always lowercase
        sign = text.strip().lower()

    if not sign:
        sign = db.execute("select sign from horoscope where "
                          "nick=lower(:nick)", {'nick': nick}).fetchone()
        if not sign:
            notice("horoscope <sign> -- Get your horoscope")
            return
        sign = sign[0]

    url = "http://my.horoscope.com/astrology/free-daily-horoscope-{}.html".format(sign)

    try:
        request = requests.get(url, headers=headers)
        request.raise_for_status()
    except (requests.exceptions.HTTPError, requests.exceptions.ConnectionError) as e:
        return "Could not get horoscope: {}.".format(e)

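    # name the parser explicitly so the result does not depend on which parsers are installed
    soup = BeautifulSoup(request.text, 'html.parser')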

    title = soup.find('h1', {'class': 'f40'})
    body = soup.find('div', {'class': 'block-horoscope-text'})
    # either element is missing for an unknown sign, or if the page layout changes again
    if title is None or body is None:
        return "Could not get the horoscope for {}.".format(sign)

    result = "\x02{}\x02 {}".format(title.text.strip(), body.text.strip())
    result = formatting.strip_html(result)

    if text and not dontsave:
        db.execute("insert or replace into horoscope(nick, sign) values (:nick, :sign)",
                   {'nick': nick.lower(), 'sign': sign})
        db.commit()

    return result
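
Since the sign is interpolated straight into the URL, it may be worth validating it first. The following is a rough sketch of how the argument handling could be pulled out into a small function that can be tested without the bot or the network. The names `parse_args` and `ZODIAC_SIGNS` are hypothetical and not part of the plugin, and unlike the code above it treats a bare "dontsave" as the flag with no sign.

```python
# Hypothetical helper, not part of the plugin above.
ZODIAC_SIGNS = frozenset({
    "aries", "taurus", "gemini", "cancer", "leo", "virgo",
    "libra", "scorpio", "sagittarius", "capricorn", "aquarius", "pisces",
})


def parse_args(text):
    """Split the command text into (sign, dontsave).

    `sign` is None when the user gave no sign, which is the case where the
    plugin falls back to the saved sign. Raises ValueError for anything that
    is not one of the twelve zodiac signs.
    """
    words = text.lower().split()

    # A trailing "dontsave" is a flag, not part of the sign.
    dontsave = bool(words) and words[-1] == "dontsave"
    if dontsave:
        words = words[:-1]

    if not words:
        return None, dontsave

    sign = " ".join(words)
    if sign not in ZODIAC_SIGNS:
        raise ValueError("unknown sign: {}".format(sign))
    return sign, dontsave
```

With something like this, a typo such as "lео" or arbitrary text never reaches the URL, and the command can answer with a usage hint instead of a failed request.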

linuxdaemon pushed a commit to linuxdaemon/CloudBot that referenced this issue Nov 14, 2017
Switch all table creation to new sqlalchemy table structure
linuxdaemon pushed a commit to linuxdaemon/CloudBot that referenced this issue Feb 9, 2018
…stock-plugin

Rewrite stock.py to use AlphaVantage API