Exercise 10

After looking at approximate prediction in exercise 09, we will now analyze how well function approximators work for control, using the MountainCar environment. In this setting, the state space is continuous while the action space remains discrete.

Tasks:

semi-gradient Sarsa control using artificial neural networks (and its limitations)
feature engineering for semi-gradient Sarsa control using radial basis functions
feature engineering for least squares policy iteration (LSPI) with recursive least squares Sarsa (RLS-Sarsa)
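
To make the second task more concrete, here is a minimal sketch of radial basis function features combined with a single semi-gradient Sarsa update for a linear action-value function. It is not the exercise's reference solution. The function names, the grid of centers, the RBF width, and the step size are all illustrative assumptions; the state bounds and the three discrete actions are those of the standard MountainCar-v0 task.

```python
import numpy as np

# State bounds of the standard MountainCar-v0 task: (position, velocity).
STATE_LOW = np.array([-1.2, -0.07])
STATE_HIGH = np.array([0.6, 0.07])
N_ACTIONS = 3  # push left, no push, push right


def make_rbf_centers(n_per_dim):
    """Place RBF centers on a regular grid in the normalized unit square."""
    grid = np.linspace(0.0, 1.0, n_per_dim)
    return np.array([[p, v] for p in grid for v in grid])


def rbf_features(state, centers, width):
    """Map a raw state to Gaussian RBF activations, one per center."""
    normalized = (np.asarray(state) - STATE_LOW) / (STATE_HIGH - STATE_LOW)
    sq_dist = np.sum((centers - normalized) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * width ** 2))


def sarsa_update(weights, x, action, reward, x_next, action_next, done,
                 alpha, gamma):
    """One semi-gradient Sarsa step; the bootstrapped target is held constant."""
    target = reward
    if not done:
        target += gamma * weights[action_next] @ x_next
    td_error = target - weights[action] @ x
    # For a linear q(s, a) = w_a^T x(s), the gradient w.r.t. w_a is x(s).
    weights[action] += alpha * td_error * x
    return td_error


# Hypothetical single transition, only to show how the pieces fit together.
centers = make_rbf_centers(5)                   # 5 x 5 = 25 features (assumed)
weights = np.zeros((N_ACTIONS, len(centers)))   # one weight vector per action
x = rbf_features([-0.5, 0.0], centers, width=0.25)
x_next = rbf_features([-0.49, 0.01], centers, width=0.25)
delta = sarsa_update(weights, x, 2, -1.0, x_next, 2, False,
                     alpha=0.1, gamma=1.0)
```

Only the weight vector of the action actually taken is changed by the update. In a full agent, this step would run inside an episode loop with an epsilon-greedy policy over `weights @ x`.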