PBRL

Introduction

This is a PyTorch implementation of our paper:

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning, ICLR 2022.

Installation and Usage

Install the rlkit package (bundled under d4rl) with:

cd d4rl
pip install -e .
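
To verify the installation, you can load one of the offline datasets directly. This is standard d4rl usage, not a script from this repository:

import gym
import d4rl  # registers the offline environments with gym

env = gym.make("walker2d-medium-v2")
dataset = env.get_dataset()  # dict with 'observations', 'actions', 'rewards', ...
print(dataset["observations"].shape)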

To run PBRL on the MuJoCo environments:

python examples/pevi_mujoco.py --env walker2d-medium-v2 --gpu 0

To run PBRL-Prior on the MuJoCo environments:

python examples/pevi_mujoco.py --env walker2d-medium-v2 --prior --gpu 0

To run PBRL on the Adroit environments:

python examples/pevi_adroit.py --env pen-cloned-v0 --gpu 0

To run PBRL-Prior on the Adroit environments:

python examples/pevi_adroit.py --env pen-cloned-v0 --prior --gpu 0

The core implementation is given in d4rl/rlkit/torch/sac/pevi.py.
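
At a high level, PBRL trains a bootstrapped ensemble of Q-networks and penalizes the Bellman target by the ensemble's disagreement, so uncertain state-action pairs receive pessimistic value estimates. The sketch below illustrates this idea only; the names (QEnsemble, pessimistic_target, beta) are illustrative assumptions, and the actual implementation is in pevi.py as noted above.

import torch
import torch.nn as nn

class QEnsemble(nn.Module):
    """An ensemble of independent Q-networks over (state, action) pairs."""
    def __init__(self, state_dim, action_dim, num_heads=10, hidden=256):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )
            for _ in range(num_heads)
        ])

    def forward(self, state, action):
        x = torch.cat([state, action], dim=-1)
        # Returns shape (num_heads, batch, 1).
        return torch.stack([head(x) for head in self.heads], dim=0)

def pessimistic_target(q_target, next_state, next_action, reward, done,
                       gamma=0.99, beta=1.0):
    """Penalize the bootstrapped target by the ensemble's standard
    deviation, so uncertain state-actions get lower target values."""
    with torch.no_grad():
        q_next = q_target(next_state, next_action)        # (heads, batch, 1)
        uncertainty = q_next.std(dim=0)                   # (batch, 1)
        q_pess = q_next.mean(dim=0) - beta * uncertainty  # (batch, 1)
        return reward + gamma * (1.0 - done) * q_pess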

Execution

Data for each run is stored on disk under the result directory, in a subdirectory named <env-id>-<timestamp>/<seed>/. Each run directory contains:

  • debug.log: records the epoch, Q-values, uncertainty values, and evaluation scores.
  • progress.csv: the same data as debug.log, in CSV format.
  • variant.json: the hyperparameters used in training.
  • models: the final actor-critic networks.

The evaluation/d4rl score logged in debug.log and progress.csv is the normalized score reported in our paper.
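
For example, you could collect the final normalized score of each run from progress.csv like this (a minimal sketch; the result/ layout and the 'evaluation/d4rl' column name follow the description above and may differ in your setup):

import glob
import pandas as pd

# Print the final normalized score from every seed of one environment.
for path in sorted(glob.glob("result/walker2d-medium-v2-*/*/progress.csv")):
    scores = pd.read_csv(path)
    print(path, scores["evaluation/d4rl"].iloc[-1])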

If you have any questions, bug reports, suggestions, or improvements, please feel free to open an issue.