Merge pull request #2 from panditanvita/dev

Dev
panditanvita · Jul 26, 2015 · 073c8cd
2 parents 6136014 + 7965209
commit 073c8cd

Showing 6 changed files with 186 additions and 39 deletions.
diff --git a/README.md b/README.md
@@ -1,18 +1,20 @@
 # MovieBot
 natural language movie requests for Magic Tiger
 
-<</in progress/>>
+--in progress--
 
 ***
 ###to play:
+
 run Bot file on python interpreter
 
-'''
+```
 bot = Bot()
+
 bot.run()
-'''
+```
 
-interact with bot on the console
+Interact with bot on the console
 ask for a movie that's in theatres. ask for 4 tickets. suggest a time of day.
 or suggest an exact time. or ask for a theatre and suggest a time of day.
 
@@ -21,16 +23,21 @@ Finish by either fulfilling a request, or if you input "bye"
 
 ***
 ###Purpose:
+
 MovieBot gives a valid response to every line of movie-related input
+
 Each bot corresponds to a single conversation, as the final product of a successful conversation should be a single
 completed movie request
+
 Bot has access to unchangeable dictionaries of movie names and theatres, which come from the knowledge base
 each bot instance keeps track of all its conversations.
 
 Two options for running:
-1. with a debug flag (in which case, call bot.run() to play with the features,
+
+1. With a debug flag (in which case, call bot.run() to play with the features,
 and the bot will interact using System.in and System.out
-2. without the debug flag, in which case the bot will keep track of its state in the MovieRequest and conversation
+
+2. Without the debug flag, in which case the bot will keep track of its state in the MovieRequest and conversation
 objects created at instantiation, and you must call the sleek_get_response(message) function to get the bot's
 response to a particular input
 
@@ -39,7 +46,18 @@ response to a particular input
 ***
 ###Design:
 
-idea of an 'expert system' with a knowledge base and logical rule-set. keeps track of state and can respond to certain movie-related inputs
+idea of an 'expert system' with a knowledge base and logical rule-set.
+
+keeps track of state in a State object and can respond to certain movie-related inputs
+
+The State object has a question and a list of options.
+The question corresponds to the attribute the bot is expecting to hear about, and is used in the
+tagging functions to favor entities indicated by the question. For example,
+if question is 0, then the tagging functions try harder to find a valid movie title.
+If the bot response involves multiple options, we want to make it easier for the customer
+to choose a specific one. The options field keeps track of the options given, for example
+when the bot gave a list of valid theatres or movies.
+
 
 ###Knowledge : scraping and parsing information from the internet/stored files
 
@@ -61,21 +79,22 @@ Speed: takes about five seconds to finish
 ###Tokeniser: tokenises and tags information from the customer
 
 ####tokeniser.py
-tokenizing, categorizing and tagging words done in here
+Tokenizing, categorizing and tagging words done in here
 
-tokeniser splits up incoming string into valid words, attempts to correct for slang,
+Tokeniser splits up incoming string into valid words, attempts to correct for slang,
 and tries to keep times and phone numbers as one token
 
-tagging done in tag_tokens_num (which looks for ticket numbers and times)
+Tagging is done in tag_tokens_num (which looks for ticket numbers and times)
 and tag_tokens_movies (which looks for movie titles and theatres).
 
-idea is to allow for some typos using the typo() function for all string comparisons
+The idea is to allow for some typos using the typo() function for all string comparisons
 
-currently theatre name is the hardest to select for, because the full theatre name is
+Currently the theatre name is the hardest to select for, because the full theatre name is
 never used - people will mention several keywords out of order like 'pvr koramangala' or
-'sri srinivasa', and those keywords may even match to multiple theatres. current
-implementation attempts to look for a subset of matching keywords , and narrows
-down the total space as far as possible
+'sri srinivasa', and those keywords may even match to multiple theatres.
+
+The current implementation looks for a subset of matching keywords, and narrows
+down the total space as far as possible. It returns all of the best-matching options.
 ***
 
 ***
@@ -84,23 +103,28 @@ down the total space as far as possible
 ####logic.py
 Logic is thought of in terms of cases: given a limited set of total
 attributes to fulfill, we want to fit in as many as possible while also
-making sure it is all mutually compatible. case 1: we have one movie, case 1.1: we have a movie
-and a theatre, is this movie playing at this theatre? case 2: we have two movies and so on..
-order of attempting-to-fit is movies - theatres - time. So it might input a movie
+making sure it is all mutually compatible.
+
+There are many cases and sub-cases. For example:
+Case 1: we have one movie in the list of tags.
+Case 2: we have a movie and a theatre. Is this movie playing at this theatre?
+Case 3: we have multiple movies and one theatre. Which movies are playing at the theatre?
+Case 4: we have one movie and one time of day. Which theatres can we return?
+And so on.
+The order of attempting to fit multiple options is movies - theatres - time. So it might input a movie
 and a theatre and then fail at selecting the chosen time.
 
-narrow() takes in tokeniser output - the tagged movie/theatre/time/day entities
-it subcontracts the work. there is a submodule for each attribute, each function
-makes sure that what it inputs into the movie request is
-correct given all the other information it knows.
-each submodule updates the request object based on what it knows. each one creates a
+narrow() takes in tokeniser output - the tagged movie/theatre/time/day entities.
+It subcontracts the work to specific subfunctions for each kind of entity.
+Each function makes sure that what it inputs into the movie request is
+correct given all the other information it knows. Each function creates a
 potential return message for the customer.
 
-output of narrow() is given to eval()
-finally, decisions: do we have enough information?
-which questions must we ask to get more information?
-maybe the selected movie is not playing in the selected theatre?
-give alternate showtimes
+The output of narrow() is given to eval().
+Finally, the bot must make decisions: do we have enough information?
+Which questions must we ask to get more information?
+Maybe the selected movie is not playing in the selected theatre? What
+alternative options are there?
 eval() chooses which output to return.
 
 Note that eval() re-evaluates based on every time narrow() is called on a set
@@ -111,10 +135,13 @@ saved to the request object is lost.
 ***
 ####Further improvements:
 
-- google scraping doesn't return a lot of valid showtimes - need the book my show api
+(most important)
+- Google scraping doesn't return many of the valid showtimes - need the BookMyShow API
+- save all past information returned by the narrow() sub-functions in some sort of State
+object, which should keep track of both the question and the narrowed down options
+
 
-- save all past information returned by the narrow() sub-functions
-- long if/else cases in logic are awkward (but it seems to work)
+- long if/else cases in logic are awkward (but it seems to work). What are the alternatives?
 - options for choosing numbered answers still needs to be done
 - timeout for repeating the same question
 - options for choosing a different day (will need to scrape theatres for the

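The keyword-subset matching that the README describes for theatre names could look roughly like the sketch below. It is not the code in tokeniser.py: the function name `best_theatre_matches` and the theatre name "PVR Orion Mall" are made up for illustration, and it compares keywords exactly rather than going through typo(), whose signature is not shown in this diff.

```python
def best_theatre_matches(tokens, theatre_names):
    """Return every theatre name that shares the most keywords with tokens.

    Hypothetical helper: exact, case-insensitive keyword comparison only.
    """
    wanted = {t.lower() for t in tokens}
    # Score each theatre by how many of the customer's keywords it contains.
    scores = {
        name: len(wanted & set(name.lower().split()))
        for name in theatre_names
    }
    best = max(scores.values(), default=0)
    if best == 0:
        return []  # nothing matched at all
    # Keep all theatres tied for the best score, in their original order.
    return [name for name in theatre_names if scores[name] == best]


theatres = ["PVR Koramangala", "PVR Orion Mall", "Sri Srinivasa"]
print(best_theatre_matches(["koramangala", "pvr"], theatres))
print(best_theatre_matches(["pvr"], theatres))
```

Keyword order does not matter because both sides are compared as sets, and a vague input such as a lone 'pvr' returns every tied theatre so the bot can list them as options.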
diff --git a/classes.py b/classes.py
@@ -54,7 +54,7 @@ def __init__(self, bms_name, address, company):
         Theatre.theatres.append(self)
 
     # String movie
-    # String[] timings ex: "10:30am"
+    # Time[] timings, e.g. [Time('10:30am')]
     def put(self, movie, timings):
         self.check()
         self.movies[movie.lower()] = timings
@@ -106,13 +106,20 @@ def getAgentChat(self):
 
 '''
 classes for bot
+'''
+
+'''
+MovieRequest: object that keeps track of information that we are completely
+sure of, and which fits in with all the other information that we have
+stored in the object
 
 Title is String movie title, cased
 num_tickets is Integer number of tickets
 Theatre is String Theatre.bms_name, cased
 date
 time is instance of Time, time of showing
 payment_method is 0 for COD, 1 for online
+
 (currently nothing to support payment_method or comments)
 '''
 class MovieRequest:
@@ -167,3 +174,32 @@ def readout(self):
                               self.title, self.theatre,t, self.date)
         return readout
 
+
+'''
+Keeping track of what we are learning.
+
+Int question: corresponds to index of attribute in request.done. Initialised as 0,
+which means the initial question is about the movie.
+
+Options keeps track of a list of options, whether of movies or theatres,
+where the option number is i+1, for index i of the item in the list
+
+list of keys
+
+The options list is used in logic.py: if we are given a tagged number, and there are
+multiple items in state.options (indicating that the last thing the bot said
+was a list of options), and the question isn't looking for time or  - there can be
+multiple showtimes that the bot returns as possible examples, but people will
+use the time value itself to refer to them, not the number - then
+we should use that number to look up the correspondingly numbered item in the options,
+pick out that item, treat it as equivalent to the case where tag_theats or
+tag_movs had a single item, and rewrite the option list, either to [] or to a new
+list.
+Hence it must be re-created every time the logic module runs.
+
+'''
+class State:
+    def __init__(self):
+        self.question = 0
+        self.options = []
+        self.option_type = 0  # 0 for theatres, 1 for movies
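One way the numbered-option lookup described in the State docstring might work is sketched below. The helpers `format_options` and `pick_option` are hypothetical and do not appear in this commit. The check on which attribute the question refers to is left out, since the diff only shows that index 0 of request.done is the movie.

```python
class State:
    def __init__(self):
        self.question = 0
        self.options = []
        self.option_type = 0  # 0 for theatres, 1 for movies


def format_options(options):
    """Number the options from 1, so option number is i+1 for list index i."""
    return '\n'.join('{}. {}'.format(n, name) for n, name in enumerate(options, 1))


def pick_option(state, number):
    """Return the item the customer chose by number, or None if not applicable.

    Hypothetical helper: only applies when the bot last offered several options.
    """
    if len(state.options) < 2:
        return None  # no list was offered, so the number means something else
    if not 1 <= number <= len(state.options):
        return None  # number is outside the offered range
    choice = state.options[number - 1]
    state.options = []  # the list is consumed once a choice is made
    return choice


state = State()
state.options = ["t1", "t2"]
print(format_options(state.options))
print(pick_option(state, 2))
```

Clearing the list after a successful pick follows the docstring's note that the option list is rewritten, either to [] or to a new list, each time the logic module runs.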
diff --git a/knowledge.py b/knowledge.py
@@ -312,5 +312,11 @@ def f(i):
 
         url = startUrl + "&start=" + str(len(theatreList))
 
+    # add all theatres to the dictionary, even those without any movies for today
+    # that way, we can always recognise when a theatre is mentioned
+    for t in Theatre.theatres:
+        if t.bms_name.lower() not in namesToTheatres:
+            namesToTheatres[t.bms_name.lower()] = t
+
     print("Knowledge base loaded")
     return namesToMovies, namesToTheatres, theatreList
diff --git a/logic.py b/logic.py
@@ -16,7 +16,7 @@
 '''
 def narrow_movies(req,tag_movs,ntm):
     r1 = 0, "Which movie?"
-    if req.done[0] != 1:
+    if req.done[0] != 1: # doesn't re-write if a movie is already selected
         if len(tag_movs) == 1:
             m_nice = ntm[tag_movs[0]].title
             req.add_title(m_nice)
@@ -61,13 +61,14 @@ def narrow_theatres(req,tag_theats,ntt):
         if req.done[0]:
             ft = [t for t in tag_theats if len(ntt[t].movies.get(mk, [])) > 0]
             if len(ft) == 0:
-                statement = "{} isn't playing there today".format(req.title)
+                statement = "{} isn't playing at any of those locations today".format(req.title)
             else:
                 ft_nice = [ntt[t].bms_name for t in ft]
-                statement = "{} is playing in: ".format(req.title) + '\n'.join(ft_nice)
-                # ['{}. {}'.format(i, t) for i, t in enumerate(ft)]
-                # not using because cannot support user choosing numbers
-                # but it would be nice
+                statement = "{} is playing in: ".format(req.title) \
+                            + '\n'.join(['{}. {}'.format(i, t) for i, t in enumerate(ft_nice, 1)])
+                # options are numbered from 1, matching the i+1 numbering
+                # described for State.options
+                # TODO: support user choosing numbers!
 
                 # ['{}. {}'.format(i, t) for i, t in enumerate(tag_theats)]
             r2 = 2, statement
@@ -160,7 +161,7 @@ def get_options(time):
                             req.add_time(time1)
                             r4 = 1,""
             else:
-                #list of movies and theatres
+                #list of movies and theatres, cut off because it can get long
                 r4 = 2, statement[:400] + '...'
         else:
             # no movie, no theatre either

diff --git a/tests/knowledge.py b/tests/knowledge.py
@@ -0,0 +1,35 @@
+__author__ = 'V'
+
+from MovieBot.classes import *
+from MovieBot.showtime import *
+
+# for testing, we want known ntm, ntt dictionaries
+def get_info():
+    req1 = MovieRequest("test")
+    req2 = MovieRequest("test")
+
+    ntt, ntm = {}, {}
+
+    times1 = [Time('6pm'), Time('630pm'), Time('730pm')]
+    times2 = [Time('9am'), Time('1030am'), Time('530pm'), Time('10pm')]
+
+    #title, description, theatres
+    m1 = Movie("Zabod")
+    m2 = Movie("Interesting Short Stories")
+
+    #bms_name, address, company
+    t1 = Theatre("t1", ['outer', 'ring', 'road'], 'pvr')
+    t1.put('zabod',times1)
+    t2 = Theatre("t2",['outer','koramangala'], 'cinemax')
+    t2.put('zabod',times2)
+    t3 = Theatre("t3",['marathalli'], 'innovative multiplex')
+    t3.put('interesting short stories', times2)
+    ntt['t1'] = t1
+    ntt['t2'] = t2
+    ntt['t3'] = t3
+
+    ntm['zabod'] = m1
+    ntm['interesting short stories'] = m2
+
+    req1.add_title('zabod')
+    return req1, req2, ntm, ntt
diff --git a/tests/test_narrow.py b/tests/test_narrow.py
@@ -0,0 +1,42 @@
+__author__ = 'V'
+
+import unittest
+
+from MovieBot.logic import *
+
+from knowledge import get_info
+
+class Test_narrow(unittest.TestCase):
+    req1, req2, ntm, ntt = get_info()
+
+    def test_narrow_movies(self):
+        tag_movs = []
+        r1 = narrow_movies(self.req1,tag_movs,self.ntm)
+        self.assertEqual(r1, (0, "Which movie?"))
+
+        tag_movs = ['zabod']
+        r1_ = narrow_movies(self.req1, tag_movs, self.ntm)
+        self.assertEqual(r1_, (0, "Which movie?"))
+
+        r1_1 = narrow_movies(self.req2, tag_movs, self.ntm)
+        self.assertEqual(r1_1, (1,""))
+
+
+    def test_narrow_theatres(self):
+        tag_theats = ['t1']
+        r2 = narrow_theatres(self.req1,tag_theats,self.ntt)
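+        # NOTE: assertTrue treats its second argument as a failure message, so this only checks that r2 is truthy
+        self.assertTrue(r2, (0, "At which theatre?"))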
+
+
+    def test_narrow_num(self):
+        self.assertTrue(True)
+
+
+
+def main():
+    unittest.main()
+
+
+if __name__ == '__main__':
+    main()
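A note on the assertion in test_narrow_theatres: unittest's assertTrue(expr, msg) uses its second argument only as the failure message, so that test passes for any non-empty tuple. The standalone sketch below shows the difference from assertEqual. The class name `AssertDemo` and the sample tuple are illustrative and are not the real return value of narrow_theatres.

```python
import unittest


class AssertDemo(unittest.TestCase):
    def test_assert_true_ignores_expected(self):
        result = (2, "some statement")  # stand-in value, not real bot output
        # Passes: a non-empty tuple is truthy, and the second argument is
        # only used as the message if the assertion fails.
        self.assertTrue(result, (0, "At which theatre?"))

    def test_assert_equal_compares(self):
        result = (2, "some statement")
        # assertEqual does compare the two values, so the mismatch is caught.
        with self.assertRaises(AssertionError):
            self.assertEqual(result, (0, "At which theatre?"))
```

Saved to a file, this can be run with `python -m unittest`. Both tests pass, which shows that only assertEqual actually compares against the expected tuple.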