Sheriff Chase - sheriff.py
Implementations of value iteration, policy iteration, and Q-learning, three algorithms that search for the optimal policy. The game is a simple, deterministic OpenAI Gym environment extended to a Markov Decision Process.
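As a rough illustration of how value iteration finds an optimal policy in a deterministic setting, here is a minimal sketch. It does not reproduce sheriff.py: the 5-cell corridor, the reward of 1.0 for reaching the right end, the discount factor, and the names `step`, `value_iteration`, and `greedy_policy` are all assumptions made for this example.

```python
# Hypothetical 1-D corridor (not the actual sheriff.py environment):
# states 0..4, the goal is the rightmost cell, actions are L and R.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = {"L": -1, "R": +1}
GAMMA = 0.9  # assumed discount factor


def step(state, action):
    """Deterministic transition: return (next_state, reward)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def action_value(state, action, values):
    """One-step lookahead: r + gamma * V(s')."""
    next_state, reward = step(state, action)
    return reward + GAMMA * values[next_state]


def value_iteration(theta=1e-8):
    """Apply the Bellman optimality backup until values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(GOAL):  # the goal is terminal, its value stays 0
            best = max(action_value(s, a, values) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < theta:
            return values


def greedy_policy(values):
    """Pick the action with the highest one-step lookahead in each state."""
    return {s: max(ACTIONS, key=lambda a: action_value(s, a, values))
            for s in range(GOAL)}


values = value_iteration()
policy = greedy_policy(values)
```

In this toy corridor the greedy policy moves right in every non-terminal state, and the value of each state is the discount factor raised to the number of steps remaining before the final move into the goal.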
How to play: Type L to move left or R to move right and have fun :)
Details: Markov Decision Process, Bellman Equation, Q-Learning
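To make the Q-learning update concrete, the following is a minimal tabular sketch, not the code in sheriff.py. The corridor environment, the reward, the hyperparameters, and the names `step` and `q_learning` are assumptions for illustration. For simplicity the agent explores with a uniformly random behavior policy rather than epsilon-greedy; Q-learning is off-policy, so it can still learn the greedy action values this way.

```python
import random

# Hypothetical 1-D corridor (not the actual sheriff.py environment):
# states 0..4, the goal is the rightmost cell, actions are L and R.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = {"L": -1, "R": +1}


def step(state, action):
    """Deterministic transition: return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(GOAL)  # random non-terminal start state
        done = False
        while not done:
            action = rng.choice(list(ACTIONS))
            next_state, reward, done = step(state, action)
            # Bellman target: r + gamma * max_a' Q(s', a').
            # The goal row is never updated, so it contributes 0.
            target = reward + gamma * max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy read off the learned table, one action per non-terminal state.
policy = [max(q_table[s], key=q_table[s].get) for s in range(GOAL)]
```

Because the environment is deterministic, a large learning rate such as the assumed 0.5 is workable here; a stochastic environment would usually call for a smaller or decaying value.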