This exercise investigates planning, that is, learning from previously collected experience in addition to direct interaction with the environment. The inverted pendulum is revisited for this task.
- Q-learning with integrated planning from experience: Dyna-Q
- Dyna-Q with integrated planning from a simulation model
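
To make the idea behind the first item concrete, the following is a minimal sketch of tabular Dyna-Q. It is not the exercise's solution: the environment `chain_step` is a hypothetical toy chain used in place of the inverted pendulum, and all names and hyperparameters (`dyna_q`, `n_planning`, `alpha`, `gamma`, `epsilon`) are illustrative assumptions.

```python
import random
from collections import defaultdict


def chain_step(state, action, n_states=6):
    """Hypothetical toy environment (not the pendulum): a 1-D chain.

    Action 0 moves left, action 1 moves right. Reaching the rightmost
    state gives reward 1 and ends the episode.
    """
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(n_states - 1, state + 1)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def dyna_q(n_episodes=50, n_planning=10, alpha=0.5, gamma=0.95,
           epsilon=0.1, max_steps=1000, seed=0):
    """Tabular Dyna-Q sketch: direct Q-learning plus planning from a learned model."""
    rng = random.Random(seed)
    n_actions = 2
    q = defaultdict(lambda: [0.0] * n_actions)  # action values per state
    model = {}  # (state, action) -> (reward, next_state, done)

    def q_update(s, a, r, s2, done):
        # One-step Q-learning update; no bootstrapping from terminal states.
        target = r if done else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(n_episodes):
        state, done, steps = 0, False, 0
        while not done and steps < max_steps:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])

            next_state, reward, done = chain_step(state, action)

            # (1) Direct RL: learn from the real transition.
            q_update(state, action, reward, next_state, done)
            # (2) Model learning: remember what this transition did.
            model[(state, action)] = (reward, next_state, done)
            # (3) Planning: replay transitions sampled from the learned model.
            for _ in range(n_planning):
                (s, a), (r, s2, d) = rng.choice(list(model.items()))
                q_update(s, a, r, s2, d)

            state = next_state
            steps += 1
    return q


if __name__ == "__main__":
    q_values = dyna_q()
    for s in range(5):
        print(s, [round(v, 3) for v in q_values[s]])
```

In this sketch, planning reuses stored real transitions. For the second item, the lookup in step (3) would presumably be replaced by queries to a simulation model, so that planning is no longer limited to state-action pairs the agent has already visited.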