In this additional exercise we will implement two state-of-the-art algorithms, deep deterministic policy gradient (DDPG) and proximal policy optimization (PPO), in PyTorch and investigate their performance on the Goddard rocket problem. In this problem, the optimal thrust profile of a vertically ascending rocket has to be found such that the rocket reaches the maximum possible altitude. Finally, we will highlight examples using a common RL toolbox, e.g. Stable-Baselines3.
- Write a DDPG algorithm using PyTorch
- Write a PPO algorithm using PyTorch
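
As a starting point for both tasks, the following minimal sketch shows one characteristic building block of each algorithm: the soft (Polyak) target-network update typically used in DDPG and the clipped surrogate loss of PPO. The function names `soft_update` and `ppo_clip_loss`, their signatures, and the default `clip_eps=0.2` are assumptions made for illustration and are not prescribed by this exercise. A full solution additionally needs actor and critic networks, a replay buffer (DDPG), and advantage estimation (PPO).

```python
import torch
import torch.nn as nn


def soft_update(target: nn.Module, source: nn.Module, tau: float) -> None:
    """DDPG-style Polyak averaging (hypothetical helper).

    Updates in place: target <- tau * source + (1 - tau) * target.
    """
    with torch.no_grad():
        for t_param, s_param in zip(target.parameters(), source.parameters()):
            t_param.mul_(1.0 - tau).add_(tau * s_param)


def ppo_clip_loss(
    log_prob_new: torch.Tensor,
    log_prob_old: torch.Tensor,
    advantage: torch.Tensor,
    clip_eps: float = 0.2,  # assumed default, tune for the task at hand
) -> torch.Tensor:
    """Negative PPO clipped surrogate objective, averaged over the batch."""
    # Probability ratio between the new and the old policy
    ratio = torch.exp(log_prob_new - log_prob_old)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # Take the pessimistic bound; negate because optimizers minimize
    return -torch.min(unclipped, clipped).mean()
```

In a training loop, `soft_update` would be called on the target actor and target critic after each gradient step, while `ppo_clip_loss` would be evaluated on mini-batches of collected rollouts using log-probabilities stored at collection time.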