Releases: rlworkgroup/garage

2020.06.0

23 Jun 23:01

The Reinforcement Learning Working Group is proud to announce the 2020.06 release of garage.

As always, we are actively seeking new contributors. If you use garage, please consider submitting a PR with your algorithm or improvements to the framework.

Summary

Please see the CHANGELOG for detailed information on the changes in this release.

This release focused primarily on adding first-class support for meta-RL and multi-task RL. To achieve this, we rewrote the sampling API and subsystem completely, adding a Sampler API which is now multi-environment and multi-agent aware. We also added a library of baseline meta-RL and multi-task algorithms which reach state-of-the-art performance: MAML, PEARL, RL2, MTPPO, MTTRPO, MTSAC, and Task Embeddings.

Highlights in this release:

  • First-class support for meta-RL and multi-task RL, demonstrated using the MetaWorld benchmark
  • More PyTorch algorithms, including MAML, SAC, MTSAC, PEARL, PPO, and TRPO (97% test coverage)
  • More TensorFlow meta-RL algorithms, including RL2 and Task Embeddings (95% test coverage)
  • All-new Sampler API, with first-class support for multiple agents and environments
  • All-new experiment definition decorator @wrap_experiment, which replaces the old run_experiment function
  • Continued improvements to quality and test coverage. Garage now has 90% overall test coverage
  • Simplified and updated the Docker containers, adding better support for CUDA/nvidia-docker2 and removing the complex docker-compose based system

Read below for more information on what's new in this release. See Looking forward for more information on what to expect in the next release.

First-class support for meta-RL and MTRL

We added first-class support for meta-RL and multi-task RL, including state-of-the-art performing versions of the following baseline algorithms: MAML, PEARL, RL2, MTPPO, MTTRPO, MTSAC, and Task Embeddings.

We also added explicit support for meta-task sampling and evaluation.

New Sampler API

The new Sampler API allows you to define a custom worker or rollout function for your algorithm, to control the algorithm's sampling behavior. These Workers are agnostic of the sampling parallelization backend used. This makes it easy to customize sampling behavior without forcing you to write your own sampler.

For example, you can define one Worker and use it to collect samples inside the local process, or alternatively use it to collect many samples in parallel using multiprocessing, without ever having to interact with multiprocessing code and synchronization. Both RL2 and PEARL define custom workers, which allow them to implement the special sampling procedure necessary for these meta-RL algorithms.

The sampler is also aware of multiple policies and environments, allowing you to customize it for use with multi-task/meta-RL or multi-agent RL.

Currently-available sampling backends are:

  • LocalSampler - collects samples serially within the main optimization process
  • MultiprocessingSampler - collects samples in parallel across multiple processors using the Python standard library's multiprocessing library
  • RaySampler - collects samples in parallel using a ray cluster (that cluster can just be your local machine, of course)

The API for defining a new Sampler backend is small and well-defined. If you have a new bright idea for a parallel sampler backend, send us a PR!
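
The separation between workers and backends described above can be illustrated with a minimal, self-contained sketch. This is not garage's Sampler API: the names rollout_worker, sample_serially, and sample_in_parallel are hypothetical, the "environment" is just a seeded random number generator, and a thread pool stands in for a process pool or ray cluster to keep the example portable.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def rollout_worker(seed, max_steps=5):
    """Hypothetical worker: collect one episode of (step, reward) samples."""
    rng = random.Random(seed)  # per-worker RNG keeps episodes reproducible
    return [(step, rng.random()) for step in range(max_steps)]


def sample_serially(worker, seeds):
    """Backend 1: run the worker in the calling process, one seed at a time."""
    return [worker(seed) for seed in seeds]


def sample_in_parallel(worker, seeds, executor):
    """Backend 2: fan the same worker out over an executor's pool."""
    return list(executor.map(worker, seeds))


seeds = [0, 1, 2, 3]
serial_batches = sample_serially(rollout_worker, seeds)
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel_batches = sample_in_parallel(rollout_worker, seeds, pool)

# The worker never learns which backend ran it, so both collect the same data.
print(serial_batches == parallel_batches)
```

The worker contains only sampling logic, so customizing it (for example, to run several episodes per task) would not require touching either backend function.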

New Experiment Definition API

We added the @wrap_experiment decorator, which defines the new standard way of declaring an experiment and its hyperparameters in garage. In short, an experiment is a function, and its hyperparameters are the arguments to that function. You can wrap your experiment function with @wrap_experiment to set experiment meta-data such as snapshot schedules and log directories.

Calling your experiment function runs the experiment.

wrap_experiment has features such as saving the current git context, automatically naming experiments, and automatically saving the hyperparameters of any experiment function it decorates. Take a look at the examples/ directory for hands-on examples of how to use it.
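
To make the "an experiment is a function" idea concrete, here is a minimal sketch of a decorator that names a run after its function and saves the arguments it was called with. It is an illustration only, not garage's implementation of wrap_experiment: the decorator name wrap_experiment_sketch, the hyperparameters.json file name, and the directory layout are all assumptions made for this example.

```python
import functools
import inspect
import json
import os
import tempfile


def wrap_experiment_sketch(log_dir):
    """Hypothetical decorator: record an experiment's hyperparameters."""
    def decorator(experiment_fn):
        signature = inspect.signature(experiment_fn)

        @functools.wraps(experiment_fn)
        def wrapper(*args, **kwargs):
            # Resolve positional, keyword, and default arguments by name.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            # Name the run directory after the experiment function.
            run_dir = os.path.join(log_dir, experiment_fn.__name__)
            os.makedirs(run_dir, exist_ok=True)
            path = os.path.join(run_dir, 'hyperparameters.json')
            with open(path, 'w') as f:
                json.dump(dict(bound.arguments), f)
            return experiment_fn(*args, **kwargs)

        return wrapper
    return decorator


log_dir = tempfile.mkdtemp()


@wrap_experiment_sketch(log_dir)
def toy_experiment(learning_rate=0.01, n_epochs=3):
    """Stand-in for a training loop; returns a fake final score."""
    return learning_rate * n_epochs


# Calling the function runs the experiment and records its arguments.
result = toy_experiment(n_epochs=10)
with open(os.path.join(log_dir, 'toy_experiment',
                       'hyperparameters.json')) as f:
    saved = json.load(f)
print(saved)
```

Because the hyperparameters are ordinary function arguments, the saved file includes defaults that the caller did not override.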

Improvements to quality and test coverage

Overall test coverage increased from 85% to 90% since v2019.10, and we expect this to keep climbing. We also now define standard benchmarks for all algorithms in the separate benchmarks directory.

Why we skipped 2020.02

Our focus on adding meta- and multi-task RL support required changing and generalizing many APIs in garage. Around January 2020, this support existed, and we were in the process of polishing it for the February 2020 release. Around this time, our development was impacted by the COVID-19 pandemic, which forced many members of the garage core maintainers team to socially isolate in their homes, slowing down communication and the overall development of garage. Rather than rushing to release the software during stressful times, the team decided to skip the February 2020 release and put together a much more polished version for this release milestone.

We intend to return to our regularly-scheduled release cadence for 2020.09.

Who should use this release, and how

Users who want to base a project on a semi-stable version of this software, and who are not interested in bleeding-edge features, should use the release branch and tags.

Platform support

This release has been tested extensively on Ubuntu 18.04 and 20.04. We have also used it successfully on Ubuntu 16.04 and macOS 10.13, 10.14, and 10.15.

Maintenance Plan

We plan on supporting this branch until at least February 2021. Our support will come mostly in the form of attempting to reproduce and fix critical user-reported bugs, conducting quality control on user-contributed PRs to the release branch, and releasing new versions when fixes are committed.

We have no intention of performing proactive maintenance such as dependency upgrades, nor of adding new features, tests, platform support, or documentation. However, we welcome PRs to the maintenance branch (release-2020.06) from contributors wishing to see these enhancements to this version of the software.

Hotfixes

We will post backwards-compatible hotfixes for this release to the branch release-2020.06. New hotfixes will also trigger a new release tag which complies with semantic versioning, i.e. the first hotfix release would be tagged v2020.06.1, the second would be tagged v2020.06.2, etc.

We will not add new features, nor remove existing features from the branch release-2020.06 unless it is absolutely necessary for the integrity of the software.

Next release

We hope to release 2-3 times per year, approximately aligned with the North American academic calendar. We hope to release next around late September 2020, e.g. v2020.09.

Looking forward

The next release of garage will focus primarily on two goals: meta- and multi-task RL algorithms (and associated toolkit support) and stable, well-defined component APIs for fundamental RL abstractions such as Policy, QFunction, ValueFunction, Sampler, ReplayBuffer, Optimizer, etc.

Complete documentation

We are working feverishly to document garage and its APIs, to give the toolkit a full user manual, how-tos, tutorials, per-algorithm documentation and baseline curves, and a reference guide motivating the design and usage of all APIs.

Stable and well-defined component APIs

The toolkit has gotten mature-enough that most components have a fully-described formal API or an informal API which all components of that type implement, and large-enough that we have faith that our existing components c...


2020.05rc1

19 May 18:00
55d41b0
Pre-release

Pre-release of v2020.05

2020.04rc1

29 Apr 16:20
8d8051d
Pre-release

This is the second release candidate for the forthcoming v2020.04 release. It contains several API changes and improvements over the v2019.10 series, including more PyTorch algorithms and support for meta- and multi-task RL.

We encourage users to install release candidates if they'd like cutting-edge features without the day-to-day instability of installing from tip. Please see the release notes for v2019.10 for more info on what to expect in the v2020.04 release.

Note: due to COVID-19, the 2020.02 release has been delayed to April, and will be numbered v2020.04 to reflect this new reality.

2020.02.0rc1

09 Dec 21:31
4df2b74
Pre-release

This is the first release candidate for the forthcoming v2020.02 release. It contains several API changes and improvements over the v2019.10 series.

We encourage users to install release candidates if they'd like cutting-edge features without the day-to-day instability of installing from tip. Please see the release notes for v2019.10 for more info on what to expect in the v2020.02 release.

2019.10.1

09 Dec 21:27
26919a6

This is a maintenance release for 2019.10.

Added

  • Integration tests which cover all example scripts (#1078, #1090)
  • Deterministic mode support for PyTorch (#1068)
  • Install script support for macOS 10.15.1 (#1051)
  • PyTorch modules now support either functions or modules for specifying their non-linearities (#1038)

Fixed

  • Errors in the documentation on implementing new algorithms (#1074)
  • Broken example for DDPG+HER in TensorFlow (#1070)
  • Error in the documentation for using garage with conda (#1066)
  • Broken pickling of environment wrappers (#1061)
  • garage.torch was not included in the PyPI distribution (#1037)
  • A few broken examples for garage.tf (#1032)

2019.10.0

05 Nov 21:24
389dde7

The Reinforcement Learning Working Group is proud to announce the 2019.10 release of garage.

As always, we are actively seeking new contributors. If you use garage, please consider submitting a PR with your algorithm or improvements to the framework.

Summary

Please see the CHANGELOG for detailed information on the changes in this release.

This release contains an immense number of improvements and new features for garage.

It includes:

  • PyTorch support, including DDPG and VPG (94% test coverage)
  • Flexible new TensorFlow Model API and complete re-write of the TensorFlow neural network library using it (93% test coverage)
  • Better APIs for defining, running, and resuming experiments
  • New logging API with dowel, which allows a single log() call to stream logs of virtually any object to the screen, disk, CSV files, TensorBoard, and more.
  • New algorithms including (D)DQN and TD3 in TensorFlow, and DDPG and VPG in PyTorch
  • Distribution via PyPI -- you can now pip install garage!

Read below for more information on what's new in this release. See Looking forward for more information on what to expect in the next release.

Why we skipped 2019.06

After 2019.02 we made some large, fundamental changes in garage APIs. Around June these APIs were defined, but the library was in limbo, with some components using new APIs and others using old APIs. Rather than release a half-baked version, we decided our time was better spent getting the toolkit in shape for the next release.

We intend to return to our regularly-scheduled release cadence for 2020.02.

PyTorch Support

We added the garage.torch tree and primitives which allow you to define and train on-policy and off-policy algorithms in PyTorch.

Though the tree is small, the algorithms in this tree achieve state-of-the-art performance, have 94% test coverage, and use idiomatic PyTorch constructs with garage APIs. Expect to see many more algorithms and primitives in PyTorch in future releases.

garage.tf.Model API and TensorFlow primitives re-write

The garage.tf.layers library quickly became a maintenance burden, and was hindering progress in TensorFlow.

To escape from under this unmaintainable custom library, we embarked on a complete re-write of the TensorFlow primitives around a new API called garage.tf.Model. This new API allows you to use idiomatic TensorFlow APIs to define reusable components for RL algorithms such as Policies and Q-functions.

Defining a new primitive in garage is easier than ever, and most components you want (e.g. MLPs, CNNs, RNNs) already exist as re-usable and composable Model classes.

Runner API and improvements to experiment snapshotting and resuming

We defined a new Runner API, which unifies how all algorithms, samplers, and environments interact to create an experiment. LocalRunner handles many of the important minutiae of running a successful experiment, including logging, snapshotting, and consistent definitions of batch size and other hyperparameters.

LocalRunner also makes it very easy to resume an experiment from an arbitrary iteration on disk, either using the Python API or from the command line using the garage command (e.g. garage resume path/to/experiment).

See the examples for how to run an algorithm using LocalRunner.

Log anything to anywhere with dowel

We replaced the garage.misc.logger package with a new flexible logger, which is implemented in a new package called dowel.

dowel has all of the features of the old logger, but with a simpler, well-defined API. It supports logging any object to any number of outputs, as long as a handler exists for that object and output. For instance, this allows us to log the TensorFlow graph to TensorBoard using a line like logger.log(tf.get_default_graph()), and a few lines below to log a message to the console like logger.log('Starting training...').

Dowel knows how to log key-value pairs, TensorFlow graphs, strings, and even histograms. Defining new logger outputs and input handlers is easy. Currently dowel supports output to the console, text files, CSVs, and TensorBoard. Add your own today!
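
The idea of a single log() call that routes each object to whichever outputs can handle it can be sketched in a few lines. The classes below (MiniLogger, ConsoleOutput, KeyValueOutput) are hypothetical and written for this illustration; they are not dowel's classes, and raising an error for unhandled types is an assumption of this sketch rather than a statement about dowel's behavior.

```python
class ConsoleOutput:
    """Hypothetical output that accepts plain strings."""
    types_accepted = (str,)

    def __init__(self):
        self.lines = []

    def record(self, data):
        self.lines.append(data)
        print(data)


class KeyValueOutput:
    """Hypothetical output that accepts dicts of key-value pairs."""
    types_accepted = (dict,)

    def __init__(self):
        self.rows = []

    def record(self, data):
        self.rows.append(dict(data))  # copy so later mutation is not logged


class MiniLogger:
    """Route each logged object to every output that accepts its type."""

    def __init__(self):
        self._outputs = []

    def add_output(self, output):
        self._outputs.append(output)

    def log(self, data):
        handled = False
        for output in self._outputs:
            if isinstance(data, output.types_accepted):
                output.record(data)
                handled = True
        if not handled:
            raise TypeError(f'No output accepts {type(data).__name__}')


logger = MiniLogger()
console, table = ConsoleOutput(), KeyValueOutput()
logger.add_output(console)
logger.add_output(table)
logger.log('Starting training...')
logger.log({'epoch': 0, 'average_return': 1.5})
```

Adding a new destination in this sketch means writing one more class with a types_accepted tuple and a record method; the call sites that use log() stay unchanged.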

pip install garage

We delivered many improvements to make garage installable using only pip. You no longer need to run a setup script to install system dependencies, unless you'd like support for MuJoCo. We now automatically release new versions to pip.

This also means using garage with the environment manager of your choice is easy. We test virtualenv, pipenv, and conda in our CI pipeline so garage can always successfully install in your environment.

Extensive maintainability and documentation improvements

This release includes extensive maintainability and documentation improvements. Most of these are behind-the-scenes, but make an immense difference in the reliability and usability of the toolkit.

Highlights:

  • Unit test coverage increased from ~30% to ~80%
  • Overall test coverage increased from ~50% to ~85%
  • Overall coverage for garage.tf and garage.torch (which is where algorithm-performance critical code lives) is ~94%
  • TensorFlow and PyTorch algorithms are benchmarked before every commit to master
  • Every primitive is pickleable/snapshottable and this is tested in the CI
  • Docstrings added to all major APIs, including type information
  • API documentation is automatically generated and posted to https://garage.readthedocs.io
  • Large amounts of old and/or unused code deleted, especially from garage.misc

Who should use this release, and how

Users who want to base a project on a semi-stable version of this software, and who are not interested in bleeding-edge features, should use the release branch and tags.

Platform support

This release has been tested extensively on Ubuntu 16.04 and 18.04. We have also used it successfully on macOS 10.13, 10.14, and 10.15.

Maintenance Plan

We plan on supporting this branch until at least June 2020. Our support will come mostly in the form of attempting to reproduce and fix critical user-reported bugs, conducting quality control on user-contributed PRs to the release branch, and releasing new versions when fixes are committed.

We have no intention of performing proactive maintenance such as dependency upgrades, nor of adding new features, tests, platform support, or documentation. However, we welcome PRs to the maintenance branch (release-2019.10) from contributors wishing to see these enhancements to this version of the software.

Hotfixes

We will post backwards-compatible hotfixes for this release to the branch release-2019.10. New hotfixes will also trigger a new release tag which complies with semantic versioning, i.e. the first hotfix release would be tagged v2019.10.1, the second would be tagged v2019.10.2, etc.

We will not add new features, nor remove existing features from the branch release-2019.10 unless it is absolutely necessary for the integrity of the software.

Next release

We hope to release 2-3 times per year, approximately aligned with the North American academic calendar. We hope to release next around early February 2020, e.g. v2020.02.

Looking forward

The next release of garage will focus primarily on two goals: meta- and multi-task RL algorithms (and associated toolkit support) and stable, well-defined component APIs for fundamental RL abstractions such as Policy, QFunction, ValueFunction, Sampler, ReplayBuffer, Optimizer, etc.

Meta- and Multi-Task RL

We are adding a full suite of meta-RL and multi-task RL algorithms to the toolkit, and associated toolkit support where necessary. We would like garage to be the gold standard library for meta- and multi-task RL implementations.

As always, all new meta- and multi-task RL algorithms will be thoroughly tested and verified to meet-or-exceed the best state-of-the-art implementation we can find.

Stable and well-defined component APIs

The toolkit has gotten mature-enough that most components have a fully-described formal API or an informal API which all components of that type implement, and large-enough that we have faith that our existing components cover most current RL use cases.

Now we will turn to formalizing the major component APIs and ensuring that the components in garage all conform to these APIs. This will allow us to simplify lots of logic throughout the toolkit, and will make it easier to mix components defined outside garage with those defined inside garage.

Idiomatic TensorFlow model and tensorflow_probability

While the implementation of the primitives using garage.tf.Model is complete, their external API still uses the old style from rllab which defines a new feedforward graph for every...


2019.02.2

05 Nov 03:02
53cad5e

This is a maintenance release for 2019.02.

This is the final maintenance release for this version, as described in our maintenance plan.

Users should expect no further bug fixes for 2019.02, and should plan on moving their projects onto 2019.10 ASAP. Maintainers will accept PRs for the 2019.02 branch which fully conform to the contributor's guide, but will not proactively backport new fixes into the release branch.

This release fixes several small bugs:

  • Improper implementation of entropy regularization in TensorFlow PPO/TRPO (#579)
  • Advantage normalization was broken for recurrent policies (#626)
  • Bug in examples/sim_policy.py (#691)
  • FiniteDifferenceHvp was not pickleable (#745)

2019.02.1

05 Nov 02:54
a970a2a

This is a maintenance release for v2019.02.

This release fixes a bug (#622) in GaussianMLPRegressor which causes many on-policy algorithms to run slower with each iteration, eventually virtually-stopping the training process.

Projects based on v2019.02 are encouraged to upgrade ASAP.

2019.02

02 Mar 01:41

The Reinforcement Learning Working Group is proud to announce the 2019.02 release of garage.

We are actively seeking new contributors. If you use garage, please consider submitting a PR with your algorithm or improvements to the framework.

Summary

Please see the CHANGELOG for detailed information on the changes in this release.

Splitting garage into packages

Most changes in this release are focused on moving garage towards a modular future. We are moving the framework from a single monolithic repository to a family of independent Python packages, where each package serves a well-defined single purpose.

This will help garage have the widest impact by:

  • Allowing users to pick-and-choose which parts of the software fit well for their project, making using garage not an all-or-nothing decision
  • Making the framework more stable, because smaller codebases are easier to test and maintain
  • Making it easier to introduce new frameworks (e.g. PyTorch) and features, by forcing API separation between different parts of the software
  • Separating parts of the software at different maturity levels into different packages, making it easier for users to know which parts are stable and well-tested, and which parts are experimental and quickly-changing

In service of that goal, in this release we moved 4 packages to independent repositories with their own packages on PyPI (e.g. you can pip install <package>).

  • akro: Spaces types for reinforcement learning (from garage.spaces)
  • viskit: Hyperparameter-tuning dashboard for reinforcement learning experiments (from garage.viskit)
  • metaworlds: Environments for benchmarking meta-learning and multi-task learning (from garage.envs.mujoco and garage.envs.box2d)
  • gym-sawyer: Simulations and ROS bindings for the Sawyer robot, based on the openai/gym interface (from garage.envs.mujoco.sawyer and garage.envs.ros)

Deleting redundant or unused code

We've also started aggressively deleting unused code, or code where a better implementation already exists in the community. The largest example of this is MuJoCo and Box2D environments, many of which we removed because they have well-tested equivalents in openai/gym. Expect to find many other smaller examples in this and future releases.

Deleting Theano

We completed feature-parity between the Theano and TensorFlow trees, and deleted the Theano tree because we have not found any future interest in maintaining it. We made sure to port over all algorithms available in Theano to TensorFlow before making this change.

Preparing garage for PyTorch and other frameworks

We have started a full rewrite of the experiment definition, experiment deployment, snapshotting, and logging functionality in garage. This will allow new algorithm libraries or research projects to easily use garage tooling (e.g. logging, snapshotting, environment wrappers), irrespective of what numerical framework they use.

conda is now optional

While we still use conda in the CI environment for garage, we've moved all Python dependency information into a canonical setup.py file. While we are not releasing garage on PyPI yet, this means you can use any Python environment manager you'd like (e.g. pipenv, virtualenv, etc.) for your garage projects. In the future, we will add CI checks to make sure that the environment installs successfully in the most popular Python environment managers.

Primitives for pixel-based policies

We added CNN and wrapper primitives useful for pixel-based algorithms. Our implementation of DQN is forthcoming, since we are still benchmarking to make sure we can guarantee state-of-the-art performance.

Updated Docker support

We completely rewrote the garage Dockerfiles, added docker-compose examples for using them in your projects, and added a Makefile to help you easily execute your experiments using Docker (for both CPU and GPU machines). We use these Dockerfiles to run our own CI environment, so you can be sure that they are always updated.

Who should use this release, and how

Users who want to base a project on a semi-stable version of this software, and who are not interested in bleeding-edge features, should use the release branch and tags.

As always, we recommend existing rllab users migrate their code to a garage release ASAP.

Platform support

This release has been tested extensively on Ubuntu 16.04 and 18.04. We have also used it successfully on macOS 10.12, 10.13, and 10.14.

Maintenance Plan

We plan on supporting this branch until at least October 2019. Our support will come mostly in the form of attempting to reproduce and fix critical user-reported bugs, conducting quality control on user-contributed PRs to the release branch, and releasing new versions when fixes are committed.

We have no intention of performing proactive maintenance such as dependency upgrades, nor of adding new features, tests, platform support, or documentation. However, we welcome PRs to the maintenance branch (release-2019.02) from contributors wishing to see these enhancements to this version of the software.

Hotfixes

We will post backwards-compatible hotfixes for this release to the branch release-2019.02. New hotfixes will also trigger a new release tag which complies with semantic versioning, i.e. the first hotfix release would be tagged v2019.02.1, the second would be tagged v2019.02.2, etc.

We will not add new features, nor remove existing features from the branch release-2019.02 unless it is absolutely necessary for the integrity of the software.

Next release

We hope to release 2-3 times per year, approximately aligned with the North American academic calendar. We hope to release next around early June 2019, e.g. v2019.06.

See Looking forward for more information on what to expect in the next release.

Looking forward

The next release of garage will focus primarily on two related goals: PyTorch support and completely-revamped component APIs. These are linked because gracefully supporting more than one framework requires well-defined interfaces for the sampler, logger, snapshotter, RL agent, and other components.

For TensorFlow algorithms development, we are focusing on adding a full suite of pixel-oriented RL algorithms to the TensorFlow tree, and on adding meta-RL algorithms and associated new interfaces. We will also finish removing the custom layers library from the TensorFlow tree, and replacing it with code based on vanilla TensorFlow and a new abstraction called Model (inspired by the torch.nn.Module interface). We will also finish removing the custom garage.tf.distributions library and replacing it with fully-differentiable components from tensorflow-probability.

For PyTorch algorithms development, we hope to add garage support to a fork of rlkit, to prove the usefulness of our tooling for different algorithm libraries.

You can expect to see several more packages split from garage (e.g. the TensorFlow algorithm suite and experiment runner/sampler/logger), along with many API changes which make it easier to use those components independently from the garage codebase.

Contributors to this release

2018.10.1

01 Mar 02:11

This is a maintenance release for v2018.10. It contains several bug fixes on top of the v2018.10.0 release, but no new features or API changes.

We encourage projects based on v2018.10.0 to rebase onto v2018.10.1 without fear, so that they can enjoy better stability.