https://arxiv.org/abs/1604.06057
Models:
- Tabular Hierarchical Q-Learning
- h-DQN for discrete action spaces
- h-DQN for continuous action spaces
- Policy Gradient Methods
- Hierarchical Policy Gradients
Environments:
- Simple MDP
- FetchReach
- InvertedDoublePendulum
- Reacher
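
As a rough illustration of the first model in the list, here is a minimal sketch of tabular hierarchical Q-learning with two Q-tables: a meta-controller that picks goals and a controller that picks primitive actions to reach them. It is not this repository's implementation. The 6-state chain, the `step` dynamics, the reward values, and all names (`train`, `eps_greedy`, `q_meta`, `q_ctrl`) are assumptions made up for the example.

```python
import random
from collections import defaultdict

# Hypothetical toy environment (an assumption, not the repo's Simple MDP):
# a chain of states 0..5, start at 0, episode ends on reaching state 5.
N_STATES = 6
ACTIONS = (-1, 1)                 # move left / move right
GOALS = tuple(range(N_STATES))    # every state can serve as a goal


def step(state, action):
    """Deterministic chain dynamics with an extrinsic reward at the last state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def eps_greedy(q, key, options, eps, rng):
    """Pick a random option with probability eps, else the highest-valued one."""
    if rng.random() < eps:
        return rng.choice(options)
    return max(options, key=lambda o: q[(key, o)])


def train(episodes=300, alpha=0.1, gamma=0.95, eps=0.2, seed=0, max_steps=50):
    rng = random.Random(seed)
    q_meta = defaultdict(float)   # meta-controller: Q[(state, goal)]
    q_ctrl = defaultdict(float)   # controller: Q[((state, goal), action)]
    for _ in range(episodes):
        state, done, t = 0, False, 0
        while not done and t < max_steps:
            # Meta-controller chooses a goal different from the current state.
            goals = [g for g in GOALS if g != state]
            goal = eps_greedy(q_meta, state, goals, eps, rng)
            start, extrinsic = state, 0.0
            # Controller acts until the goal is reached or the episode ends.
            while True:
                action = eps_greedy(q_ctrl, (state, goal), ACTIONS, eps, rng)
                nxt, reward, done = step(state, action)
                reached = nxt == goal
                intrinsic = 1.0 if reached else 0.0
                best_next = 0.0 if (reached or done) else max(
                    q_ctrl[((nxt, goal), a)] for a in ACTIONS)
                key = ((state, goal), action)
                q_ctrl[key] += alpha * (intrinsic + gamma * best_next - q_ctrl[key])
                extrinsic += reward
                state, t = nxt, t + 1
                if reached or done or t >= max_steps:
                    break
            # Meta-controller learns from the extrinsic reward collected
            # while the controller was pursuing the goal.
            best_meta = 0.0 if done else max(
                q_meta[(state, g)] for g in GOALS if g != state)
            mkey = (start, goal)
            q_meta[mkey] += alpha * (extrinsic + gamma * best_meta - q_meta[mkey])
    return q_meta, q_ctrl


if __name__ == "__main__":
    q_meta, q_ctrl = train()
    print(len(q_meta), "meta entries,", len(q_ctrl), "controller entries")
```

The controller is trained only on the intrinsic reward (1 when the chosen goal is reached), while the meta-controller is trained only on the extrinsic reward from the environment. The deep variants listed above replace the two tables with function approximators.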