Breaking changes
Dependency
- Python 3.9 or later
- PyTorch v2.5.0 or later
OptimizerFactory
The import path of `OptimizerFactory` has been changed from `d3rlpy.models.OptimizerFactory` to `d3rlpy.optimizers.OptimizerFactory`:
```python
# before
optim = d3rlpy.models.AdamFactory()

# after
optim = d3rlpy.optimizers.AdamFactory()
```
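For context, here is a minimal sketch of where an optimizer factory is typically plugged in; the `actor_optim_factory`/`critic_optim_factory` argument names below are illustrative assumptions, so check the config of the algorithm you use for the exact names:

```python
import d3rlpy

# build an optimizer factory with the new import path
optim = d3rlpy.optimizers.AdamFactory()

# pass it to an algorithm config (argument names assumed; see the SACConfig docs)
sac = d3rlpy.algos.SACConfig(
    actor_optim_factory=optim,
    critic_optim_factory=optim,
).create(device="cpu")
```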
2-3x speedup with CudaGraph and torch.compile
With this release, d3rlpy supports CudaGraph and torch.compile to dramatically speed up training. You can turn on this new feature simply by setting the `compile_graph` option:
```python
import d3rlpy

# enable CudaGraph and torch.compile
sac = d3rlpy.algos.SACConfig(compile_graph=True).create(device="cuda:0")
```
Here are benchmark results with an NVIDIA RTX 4070:
| | v2.6.2 | v2.7.0 |
|---|---|---|
| Soft Actor-Critic | 7.4 msec | 3.0 msec |
| Conservative Q-Learning | 12.5 msec | 3.8 msec |
| Decision Transformer | 8.9 msec | 3.4 msec |
Note that this feature can only be enabled when using a CUDA device.
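As a rough end-to-end sketch (the dataset helper and `fit` arguments below are illustrative and not part of this release), the option fits into a normal training setup like this:

```python
import d3rlpy

# load a small offline RL dataset (helper assumed; any MDPDataset works)
dataset, env = d3rlpy.datasets.get_pendulum()

# compile_graph requires a CUDA device
sac = d3rlpy.algos.SACConfig(compile_graph=True).create(device="cuda:0")

# train as usual; the compiled graph is used under the hood
sac.fit(dataset, n_steps=100000, n_steps_per_epoch=1000)
```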
Enhanced optimizer
Learning rate scheduler
This release adds `LRSchedulerFactory`, which attaches a learning rate scheduler to an individual optimizer.
```python
import d3rlpy

optim = d3rlpy.optimizers.AdamFactory(
    lr_scheduler=d3rlpy.optimizers.CosineAnnealingLRFactory(T_max=1000000),
)
```
See an example here and docs here.
Gradient clipping
A `clip_grad_norm` option has also been added to clip gradients by their global norm.
```python
import d3rlpy

optim = d3rlpy.optimizers.AdamFactory(clip_grad_norm=0.1)
```
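Both options can be combined on a single factory and handed to an algorithm config. This is a sketch only; the `actor_optim_factory`/`critic_optim_factory` argument names are assumptions for illustration, so check the config docs of the algorithm you use:

```python
import d3rlpy

# one factory with both a cosine LR schedule and global-norm gradient clipping
optim = d3rlpy.optimizers.AdamFactory(
    lr_scheduler=d3rlpy.optimizers.CosineAnnealingLRFactory(T_max=1000000),
    clip_grad_norm=0.1,
)

# argument names assumed for illustration; see CQLConfig docs
cql = d3rlpy.algos.CQLConfig(
    actor_optim_factory=optim,
    critic_optim_factory=optim,
).create(device="cuda:0")
```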
SimBa encoder
This release adds the SimBa architecture, which allows us to scale models effectively. See the paper here and the docs here.
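A minimal sketch of wiring the new encoder into an algorithm, assuming the factory is exposed as `d3rlpy.models.SimBaEncoderFactory` and that the config accepts per-network encoder factories (check the docs for the exact class name and constructor options):

```python
import d3rlpy

# SimBa encoder factory (class name and defaults assumed; see the docs)
encoder = d3rlpy.models.SimBaEncoderFactory()

sac = d3rlpy.algos.SACConfig(
    actor_encoder_factory=encoder,
    critic_encoder_factory=encoder,
).create(device="cuda:0")
```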
Enhancement
- Gradients are now being tracked by loggers (thanks, @hasan-yaman)
Development
- Replaced black, isort, and pylint with Ruff. `scripts/format` has been removed; `scripts/lint` now formats code styles as well.