Implementation of "Goal-GAN" algorithm for DDPG agent in Pytorch

SanteriHei/GoalGAN-DDPG

Automatic intermediate goal-generation for DDPG

Implementation for the BSc thesis "Intermediate goal generation for off-policy reinforcement learning methods"
Presentation slides | Thesis

This repository contains the code for reproducing the results of the Bachelor's thesis "Intermediate goal generation for off-policy reinforcement learning methods". More specifically, this work adapts the method of Florensa et al. ("Automatic goal generation for reinforcement learning agents") to DDPG instead of the originally used TRPO.

Unfortunately, the proposed approach did not work; the algorithm diverged instead. The plots below show a possible outcome of the training:

[Figure: four training plots, labelled Iteration 1, Iteration 6, Iteration 21, and Iteration 100]

Installation

  • First, install MuJoCo. The installation instructions can be found on its GitHub page (NOTE: at the time of writing, only MuJoCo version 2.1.0 is supported).

  • Then, follow the installation instructions from here. This tutorial shows step by step how to set the correct environment variables (for Linux).

  • Note that mujoco-py supports only Linux and macOS.

  • Pip installation:

    • Create a virtual environment using venv
    python -m venv path/to/virtual/environment
    • Activate the environment, with bash/zsh
    source <venv>/bin/activate

    where <venv> is the path to the newly created virtual environment (see this for the command for e.g. csh or Windows)

    • Install the needed packages
    pip install -r requirements.txt
  • Conda installation:

    • Create virtual environment from the environment.yml file
    conda env create -f environment.yml
    • Activate the virtual environment
    conda activate <name-of-the-env>
  • Additionally, PyTorch is required. Refer to PyTorch's own documentation for an installation guide. The code was written targeting v1.10.1.
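
Once the packages are installed, a quick check along the following lines can confirm that the environment is usable. This is a minimal sketch and not part of the repository: the helper name `check_environment` is hypothetical, and it assumes the packages are importable under the module names `torch` and `mujoco_py`.

```python
import importlib.util


def check_environment(modules=("torch", "mujoco_py")):
    """Return a mapping from module name to whether it can be located.

    importlib.util.find_spec only looks the module up; it does not import it,
    so the check stays fast even if the first real import would be slow.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

A module reported as missing usually means the virtual environment is not activated or the corresponding installation step was skipped.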

Reproducing the results

To reproduce the results from the thesis, run

python main.py train --goal-count=150 --episode-count=100 --gan-iter-count=150 --buffer-size=10000 --actor-batch-norm --critic-batch-norm --save-after=10 --gan-save-path=path/where/models/saved/gan.tar --agent-save-path=path/where/models/saved/agent_model_name.tar
  • This will train the agent using exactly the same settings as in the thesis. The script automatically appends the iteration count to the saved model name, so none of the saved models will be overwritten during training. Note that this process is quite time-consuming.

  • The CLI exposes many parameters for altering the training without touching the underlying code. To see all the options, run

python main.py train --help
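
For scripted or repeated runs, the training command above can also be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_train_command` and `run_training` are hypothetical, and only the flags shown in the command above are used.

```python
import subprocess
import sys


def build_train_command(save_dir, python=sys.executable):
    """Assemble the thesis training command as an argument list."""
    return [
        python, "main.py", "train",
        "--goal-count=150",
        "--episode-count=100",
        "--gan-iter-count=150",
        "--buffer-size=10000",
        "--actor-batch-norm",
        "--critic-batch-norm",
        "--save-after=10",
        f"--gan-save-path={save_dir}/gan.tar",
        f"--agent-save-path={save_dir}/agent_model_name.tar",
    ]


def run_training(save_dir):
    """Launch training; must be called from the repository root."""
    # check=True raises CalledProcessError if main.py exits with an error.
    subprocess.run(build_train_command(save_dir), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the save directory contains spaces.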

Evaluating the performance of the trained Agent

To evaluate the performance of the trained agent, run

python main.py eval <path/to/the/model.tar> --buffer-size=10000 --critic-batch-norm --actor-batch-norm --render  
  • Note that the DDPG hyperparameters used during training must also be set here; otherwise, loading the model will fail because the network structures do not match.
  • The eval script also contains a few options to alter its functionality, so check the help to see all possible options.
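
A structure mismatch of this kind arises because PyTorch's `load_state_dict` is strict by default: it raises an error when the parameter names in the checkpoint differ from those of the freshly constructed network. The pure-Python sketch below only mimics that comparison. The function name and the parameter names are hypothetical and are not taken from the repository's networks.

```python
def compare_state_keys(model_keys, checkpoint_keys):
    """Mimic a strict key check: report missing and unexpected entries."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # model expects, file lacks
    unexpected = sorted(checkpoint_keys - model_keys)  # file has, model lacks
    return missing, unexpected


# Hypothetical actor that was trained WITH batch norm (illustrative names only).
saved = ["fc1.weight", "fc1.bias", "bn1.weight", "bn1.bias"]
# Hypothetical actor built at evaluation time WITHOUT --actor-batch-norm.
built = ["fc1.weight", "fc1.bias"]

missing, unexpected = compare_state_keys(built, saved)
print("missing:", missing)
print("unexpected:", unexpected)
```

In this illustration the checkpoint carries batch-norm entries that the evaluation-time network does not have, which is the situation that omitting `--actor-batch-norm` or `--critic-batch-norm` at evaluation would create.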

Acknowledgements

This project takes heavy inspiration from the work of Florensa et al. Thus, if you use this code, please cite the original work:

  @InProceedings{pmlr-v80-florensa18a,
    title = 	  {Automatic Goal Generation for Reinforcement Learning Agents},
    author =    {Florensa, Carlos and Held, David and Geng, Xinyang and Abbeel, Pieter},
    booktitle = {Proceedings of the 35th International Conference on Machine Learning},
    pages = 	  {1515--1528},
    year = 	    {2018},
    editor = 	  {Dy, Jennifer and Krause, Andreas},
    volume = 	  {80},
    series = 	  {Proceedings of Machine Learning Research},
    month = 	  {10--15 Jul},
    publisher = {PMLR},
    url = 	    {https://proceedings.mlr.press/v80/florensa18a.html},
  }
