Structuring your data science project around a shared standard makes it easier for your teammates to maintain and modify.
This repository provides a template that incorporates best practices to create a maintainable and reproducible data science project.
- Hydra: Manage configuration files
- pdoc: Automatically create API documentation for your project
- pre-commit plugins: Automate code review and formatting
- Poetry: Manage dependencies
Install Cookiecutter:
pip install cookiecutter
Create a project based on the template:
cookiecutter https://github.com/khuyentran1401/data-science-template
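If you would rather script the two steps above (for example, in a setup script), the following is a minimal sketch that wraps the same CLI call from Python. The helper names `build_command` and `create_project` are hypothetical and not part of Cookiecutter or this template; the sketch assumes the `cookiecutter` executable is on your `PATH` after installation.

```python
import shutil
import subprocess

# Template URL taken from the command above.
TEMPLATE_URL = "https://github.com/khuyentran1401/data-science-template"


def build_command(template_url: str = TEMPLATE_URL) -> list[str]:
    """Return the argument list for the cookiecutter CLI call (hypothetical helper)."""
    return ["cookiecutter", template_url]


def create_project(template_url: str = TEMPLATE_URL) -> int:
    """Run cookiecutter and return its exit code (hypothetical helper)."""
    # Fail early with a clear message if the CLI has not been installed yet.
    if shutil.which("cookiecutter") is None:
        raise RuntimeError(
            "cookiecutter not found on PATH; run `pip install cookiecutter` first"
        )
    # cookiecutter prompts for the template's variables in the terminal,
    # so stdin and stdout are left attached to the current process.
    return subprocess.run(build_command(template_url)).returncode
```

Calling `create_project()` from a terminal session behaves like typing the command by hand: Cookiecutter still asks its questions interactively, and the function returns the exit code so a calling script can react to failures.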
Want to learn more about building production-ready data science projects? Check out my upcoming book:
Production Ready Data Science: From Prototyping to Production with Python
The book will cover:
- Best practices for structuring data science projects
- Tools and techniques for reproducible research
- Deploying and monitoring machine learning models
- And much more!
Sign up now to receive the first 3 chapters for free! You'll also be notified when the full book becomes available.