A comprehensive collection of deep reinforcement learning (DRL) algorithm implementations, including DQN, A3C, PPO, and more, designed for research, experimentation, and educational purposes. This repository aims to provide accessible, well-structured code and resources for DRL enthusiasts, researchers, and developers working on classic control tasks.
- Multiple DRL Algorithms: Implementations of popular algorithms such as DQN, A3C, PPO, SAC, and others.
- Environment Compatibility: Supports a range of classic control environments like CartPole, LunarLander, MountainCar, etc.
- Modular Codebase: Each algorithm is implemented as a standalone module, making it easy to understand, modify, and extend (see the interface sketch after this list).
- Logging & Visualization: Track training progress and visualize results with integrated logging.
- Flexible Training Parameters: Easily adjust hyperparameters for each algorithm to optimize performance or experiment with custom settings.
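As a rough sketch of what that modular structure could look like, here is a hypothetical common interface an algorithm module might expose. The `Agent` class and its method names are illustrative, not the repository's actual API:

```python
from abc import ABC, abstractmethod

class Agent(ABC):
    """Hypothetical interface each algorithm module (DQN, PPO, ...) might implement."""

    @abstractmethod
    def act(self, obs):
        """Select an action for the given observation."""

    @abstractmethod
    def update(self, batch):
        """Perform one learning step on a batch of transitions; return a dict of metrics."""

    def save(self, path):
        """Persist model weights (optional override)."""
```

A shared interface like this is what makes it possible to swap algorithms with a single command-line flag, as shown in the Usage section below.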
- Getting Started
- Algorithms Implemented
- Installation
- Usage
- Examples
- Results & Visualization
- Roadmap
- Contributing
- License
- Contact
Deep reinforcement learning combines deep learning with reinforcement learning, allowing agents to learn how to make sequences of decisions from raw sensory input. This repository offers code for several well-known DRL algorithms. The goal is to provide accessible implementations that are easy to integrate, experiment with, and use as a reference for research and educational projects.
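For readers new to the field, the core idea is the agent-environment interaction loop. The snippet below runs a random policy in CartPole using the Gym 0.26+ API noted in the installation section; a trained agent would simply replace the random action:

```python
import gym  # tested with Gym 0.26+, per the installation notes below

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
episode_return = 0.0
for _ in range(500):
    action = env.action_space.sample()  # a trained policy would choose the action here
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    if terminated or truncated:  # episode ended: start a fresh one
        obs, info = env.reset()
env.close()
```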
The following algorithms are currently implemented:
- DQN (Deep Q-Network)
- DDPG (Deep Deterministic Policy Gradient)
- A3C (Asynchronous Advantage Actor-Critic)
- PPO (Proximal Policy Optimization)
- SAC (Soft Actor-Critic)
- TD3 (Twin Delayed DDPG)
Each algorithm is implemented as a separate module that can be run and modified independently.
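As a taste of what these modules contain, here is a minimal sketch of the core DQN update: regression of Q-values toward the Bellman target. The network sizes, fake minibatch, and variable names are illustrative, not the repository's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative shapes: 4-dim observations (CartPole), 2 discrete actions.
obs_dim, n_actions, gamma = 4, 2, 0.99

def make_net():
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))

q_net, target_net = make_net(), make_net()
target_net.load_state_dict(q_net.state_dict())  # target network starts as a copy
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# A fake minibatch standing in for samples from a replay buffer.
states      = torch.randn(32, obs_dim)
actions     = torch.randint(0, n_actions, (32,))
rewards     = torch.randn(32)
next_states = torch.randn(32, obs_dim)
dones       = torch.zeros(32)

# Q(s, a) for the actions actually taken.
q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
# Bellman target: r + gamma * max_a' Q_target(s', a'), zeroed at terminal states.
with torch.no_grad():
    target = rewards + gamma * (1.0 - dones) * target_net(next_states).max(1).values
loss = F.smooth_l1_loss(q_sa, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```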
Clone this repository:

```bash
git clone https://github.com/yourusername/deep-reinforcement-learning-algorithms.git
cd deep-reinforcement-learning-algorithms
```

Install the required dependencies:

```bash
pip install -r requirements.txt
```
Note: This codebase requires Python 3.8+ and has been tested with PyTorch 1.8+ and Gym 0.26+.
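For reference, a plausible `requirements.txt` consistent with those version notes might look like the following; the exact pins in the repository may differ, and `gym[box2d]` is an assumption since LunarLander requires Box2D:

```text
torch>=1.8
gym[box2d]>=0.26
numpy
tensorboard
```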
To start training an agent, select the desired algorithm and environment:

```bash
python train.py --algorithm DQN --env LunarLander-v2
```

You can configure hyperparameters and other settings via command-line arguments. For example:

```bash
python train.py --algorithm PPO --env CartPole-v1 --learning_rate 0.0003 --batch_size 64
```
For more details on command-line arguments, refer to the CLI Options section.
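For orientation, flags like those above could be wired up with `argparse` roughly as follows. This is a sketch only; the real `train.py` may define different options and defaults:

```python
import argparse

# Illustrative CLI definition; flag names mirror the examples above.
parser = argparse.ArgumentParser(description="Train a DRL agent")
parser.add_argument("--algorithm", required=True,
                    choices=["DQN", "DDPG", "A3C", "PPO", "SAC", "TD3"])
parser.add_argument("--env", default="CartPole-v1")
parser.add_argument("--learning_rate", type=float, default=3e-4)
parser.add_argument("--batch_size", type=int, default=64)
args = parser.parse_args()
```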
Each training run logs data for rewards, loss, and other metrics. These can be visualized using tools like TensorBoard.
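As an illustration, such metrics can be written with PyTorch's built-in `SummaryWriter`; the metric names and `log_dir` below are hypothetical, not necessarily what this codebase uses:

```python
from torch.utils.tensorboard import SummaryWriter

# Hypothetical logging call sites inside a training loop.
writer = SummaryWriter(log_dir="runs/ppo_cartpole")
for step in range(3):
    episode_return, loss = 100.0 + step, 0.5 / (step + 1)  # placeholder values
    writer.add_scalar("reward/episode_return", episode_return, step)
    writer.add_scalar("loss/policy", loss, step)
writer.close()
```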
- TensorBoard: inspect training curves with:
  ```bash
  tensorboard --logdir runs
  ```
- Video Rendering: Episodes are recorded (if enabled) and saved in the `videos/` folder, so you can watch the agent's performance.
- Sample Output: sample performance graph of PPO on CartPole-v1.
Here are some example commands to get started with different algorithms:
- Train DQN on LunarLander-v2:
  ```bash
  python train.py --algorithm DQN --env LunarLander-v2
  ```
- Train PPO on MountainCar-v0:
  ```bash
  python train.py --algorithm PPO --env MountainCar-v0
  ```
- Train A3C on CartPole-v1:
  ```bash
  python train.py --algorithm A3C --env CartPole-v1
  ```
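To watch an agent and produce the recordings mentioned in the Results & Visualization section, Gym's `RecordVideo` wrapper can be used directly. This standalone sketch assumes Gym 0.26+ and substitutes a random policy for a trained one:

```python
import gym
from gym.wrappers import RecordVideo

# render_mode="rgb_array" is required for frame capture in Gym 0.26+.
env = RecordVideo(gym.make("CartPole-v1", render_mode="rgb_array"),
                  video_folder="videos")
obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # substitute a trained agent's action here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
env.close()  # flushes the recorded episode to videos/
```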
We plan to add more algorithms, features, and environments over time. Planned additions include:
- Rainbow DQN implementation
- Support for Atari environments
- Advanced logging options
- Pre-trained model checkpoints
Contributions are welcome! If you'd like to contribute, please follow these steps:
- Fork the project.
- Create a new branch (`git checkout -b feature/YourFeature`).
- Make your changes.
- Commit your changes (`git commit -m 'Add YourFeature'`).
- Push to the branch (`git push origin feature/YourFeature`).
- Open a pull request.
For major changes, please open an issue first to discuss what you’d like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
If you have any questions, feel free to reach out:
- Author: Your Name
- Email: [email protected]