
Release v2.5.0


New Algorithm

Cal-QL has been added to d3rlpy in v2.5.0! Please check the reproduction script here. To support faithful reproduction, SparseRewardTransitionPicker has also been added and is used in the reproduction script.
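
As a rough usage sketch (not the reproduction script itself): the dataset name, the failure_reward argument, and forwarding transition_picker through get_d4rl are assumptions here, so please refer to the reproduction script for the exact setup.

```python
import d3rlpy

# SparseRewardTransitionPicker is new in v2.5.0; the failure_reward argument
# is an assumption -- check the reproduction script for the exact usage.
picker = d3rlpy.dataset.SparseRewardTransitionPicker(failure_reward=0.0)

# Load a sparse-reward D4RL dataset (the dataset name is a placeholder, and
# forwarding transition_picker through get_d4rl is also an assumption).
dataset, env = d3rlpy.datasets.get_d4rl(
    "antmaze-medium-diverse-v2",
    transition_picker=picker,
)

# Create a Cal-QL learner and pretrain it offline.
calql = d3rlpy.algos.CalQLConfig().create(device="cuda:0")
calql.fit(dataset, n_steps=100000)
```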

Custom Algorithm Example

One of the most frequent questions is "How can I implement a custom algorithm on top of d3rlpy?". A new example script has been added to answer this question. Based on this example, you can build your own algorithm while reusing the whole training pipeline provided by d3rlpy. Please check the script here.

Enhancement

  • Exporting Decision Transformer models as TorchScript and ONNX has been implemented. You can use this feature via the save_policy method in the same way as with Q-learning algorithms (see the export sketch after this list).
  • Tuple observation support has been added to PyTorch/ONNX export.
  • The return-to-go calculation has been modified for Q-learning algorithms, and the calculation is now skipped when return-to-go is not needed.
  • The n_updates option has been added to the fit_online method to control the update-to-data (UTD) ratio (see the online-training sketch after this list).
  • The write_at_termination option has been added to ReplayBuffer.
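
To illustrate the export item above, here is a hedged sketch of exporting a Decision Transformer policy. The toy dataset, short training run, and file names are placeholders; save_policy is assumed to pick the output format from the file extension, as it does for Q-learning algorithms.

```python
import d3rlpy

# Toy dataset just for illustration.
dataset, env = d3rlpy.datasets.get_cartpole()

# Train a discrete-action Decision Transformer very briefly.
dt = d3rlpy.algos.DiscreteDecisionTransformerConfig().create(device="cpu")
dt.fit(dataset, n_steps=1000, n_steps_per_epoch=1000)

# Export the policy; the format follows the file extension.
dt.save_policy("dt_policy.pt")    # TorchScript
dt.save_policy("dt_policy.onnx")  # ONNX
```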
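
And a hedged sketch of the new online-training options. The argument names come from the bullet points above, but how they are passed (and the exact semantics of write_at_termination) are assumptions, so check the documentation for details.

```python
import gymnasium
import d3rlpy

env = gymnasium.make("CartPole-v1")

# write_at_termination (new in v2.5.0) is assumed to defer writing experience
# to the buffer until the episode terminates.
buffer = d3rlpy.dataset.ReplayBuffer(
    d3rlpy.dataset.FIFOBuffer(limit=100000),
    env=env,
    write_at_termination=True,
)

dqn = d3rlpy.algos.DQNConfig().create(device="cpu")

# n_updates (new in v2.5.0) controls the update-to-data (UTD) ratio, i.e. how
# many gradient steps are taken per collected environment step.
dqn.fit_online(
    env,
    buffer,
    explorer=d3rlpy.algos.ConstantEpsilonGreedy(0.1),
    n_steps=10000,
    n_updates=4,
)
```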

Bugfix

  • Action scaling has been fixed for D4RL datasets.
  • Default replay buffer creation in the fit_online method has been fixed.