A sophisticated real-time hand tracking system that combines depth estimation, gesture recognition, and interactive zone detection for creating engaging human-computer interaction experiences.
- Real-time Hand Tracking: Utilizes MediaPipe for accurate hand landmark detection
- Depth Estimation: Uses the DepthAnything model for monocular depth estimation
- Interactive Zones: Create custom interaction areas with visual feedback
- MQTT Integration: Real-time communication for distributed systems
- Visual Feedback: Dynamic UI with progress tracking and status indicators
- Sound Effects: Audio feedback for enhanced user experience
- Performance Monitoring: Real-time FPS counter and system statistics
The system is built with a modular architecture for performance and maintainability; a sketch of how the modules might fit together follows the layout below:
```
HandTrack3D/
├── src/
│   ├── __init__.py
│   ├── main.py
│   ├── config.py
│   ├── hand_tracker.py
│   ├── depth_estimator.py
│   ├── interaction_system.py
│   └── utils/
│       ├── __init__.py
│       ├── visualization.py
│       └── mqtt_handler.py
├── assets/
│   ├── sounds/
│   │   ├── ring_1.mp3
│   │   └── ring_2.mp3
│   └── images/
│       ├── thumbs_up.png
│       └── thumbs_down.png
├── tests/
│   └── __init__.py
├── docs/
│   └── API.md
├── requirements.txt
├── LICENSE
└── README.md
```
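The module names suggest a capture → track → estimate depth → interact pipeline. Below is an illustrative sketch of how `main.py` might wire these pieces together; the class and method names are inferred from the file layout and are assumptions, not the project's actual API:

```python
# Illustrative wiring only: HandTracker, DepthEstimator, InteractionSystem
# and their methods are assumptions inferred from the file names.
import cv2
from hand_tracker import HandTracker
from depth_estimator import DepthEstimator
from interaction_system import InteractionSystem

tracker = HandTracker()          # MediaPipe hand landmarks
estimator = DepthEstimator()     # DepthAnything relative depth
zones = InteractionSystem()      # zone logic, feedback, MQTT events

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    hands = tracker.process(frame)
    depth_map = estimator.estimate(frame)
    frame = zones.update(frame, hands, depth_map)
    cv2.imshow("HandTrack3D", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```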
```bash
# Clone the repository
git clone https://github.com/armanruet/HandTrack3D.git
cd HandTrack3D

# Install dependencies
pip install -r requirements.txt

# Run the application
python src/main.py
```
- Python 3.8+
- OpenCV
- MediaPipe
- PyTorch
- Paho-MQTT
- Pygame
- NumPy
- Pillow (PIL)
1. Launch the application
   - The system will automatically access your camera
   - A fullscreen window will open showing the camera feed
2. Create interaction zones
   - Click and drag to draw up to 3 interaction zones
   - Each zone is automatically labeled (A, B, C)
3. Interact with zones
   - Move your hand within the zones
   - Watch for visual and audio feedback
   - Monitor progress through the status bar
4. Controls (see the event-handling sketch after this list)
   - Press 'q' to quit
   - Press 'f' to toggle fullscreen
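Zone drawing and key handling of this kind are typically implemented with OpenCV callbacks. The sketch below is illustrative; the actual logic lives in `src/main.py` and `src/interaction_system.py`, and all names here are assumptions:

```python
# Illustrative OpenCV event handling; not the project's actual code.
import cv2

zones = []          # up to 3 zones, labeled A, B, C by index
drag_start = None

def on_mouse(event, x, y, flags, param):
    """Click-and-drag to define a rectangular interaction zone."""
    global drag_start
    if event == cv2.EVENT_LBUTTONDOWN:
        drag_start = (x, y)
    elif event == cv2.EVENT_LBUTTONUP and drag_start and len(zones) < 3:
        zones.append((drag_start, (x, y)))
        drag_start = None

cv2.namedWindow("HandTrack3D")
cv2.setMouseCallback("HandTrack3D", on_mouse)

fullscreen = False
while True:
    # ... grab a frame, run tracking, draw zones, cv2.imshow(...) ...
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):      # quit
        break
    if key == ord("f"):      # toggle fullscreen
        fullscreen = not fullscreen
        cv2.setWindowProperty(
            "HandTrack3D",
            cv2.WND_PROP_FULLSCREEN,
            cv2.WINDOW_FULLSCREEN if fullscreen else cv2.WINDOW_NORMAL,
        )
```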
Key parameters can be adjusted in `config.py`:
```python
# UI Parameters
BOX_LINE_THICKNESS = 3
TARGET_BOX_COLOR = (0, 255, 0)
NON_TARGET_BOX_COLOR = (255, 0, 0)
TEXT_COLOR = (255, 255, 255)

# Depth Thresholds
DEPTH_THRESHOLD_NEAR = 0.20
DEPTH_THRESHOLD_FAR = 0.63
```
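As an illustration, the two thresholds can gate interaction on the hand's normalized depth value. The helper below is a minimal sketch, assuming depth values normalized to [0, 1]; `is_hand_in_range` is a hypothetical name, not the project's API:

```python
# Hypothetical helper: assumes depth values are normalized to [0, 1].
def is_hand_in_range(depth_value: float) -> bool:
    """Return True when the hand sits inside the interactive depth band."""
    return DEPTH_THRESHOLD_NEAR <= depth_value <= DEPTH_THRESHOLD_FAR
```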
The system uses the DepthAnything model for monocular depth estimation:

```python
depth_anything = DepthAnything.from_pretrained(f"LiheYoung/depth_anything_{encoder}14")
```
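Expanded into a fuller example, loading and running the model might look like the sketch below. It assumes the package layout of the official Depth-Anything repository (https://github.com/LiheYoung/Depth-Anything); the input file name is a placeholder, and the official pipeline applies its own resize/normalize transforms:

```python
import cv2
import numpy as np
import torch
from depth_anything.dpt import DepthAnything  # official repo layout (assumption)

encoder = "vits"  # one of "vits", "vitb", "vitl"
model = DepthAnything.from_pretrained(f"LiheYoung/depth_anything_{encoder}14").eval()

frame = cv2.imread("frame.png")  # placeholder; use your camera frame
image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
image = cv2.resize(image, (518, 518))  # sides must be multiples of 14 (ViT patches)
image = (image - (0.485, 0.456, 0.406)) / (0.229, 0.224, 0.225)  # ImageNet stats
tensor = torch.from_numpy(image).permute(2, 0, 1).unsqueeze(0).float()

with torch.no_grad():
    depth = model(tensor)  # relative (inverse) depth map
depth = (depth - depth.min()) / (depth.max() - depth.min())  # normalize to [0, 1]
```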
The system connects to an MQTT broker (via Paho-MQTT) for distributed communication:

```python
client.connect("your-broker-address", 1883, 60)
```
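In context, a minimal Paho-MQTT client looks like the sketch below (1.x-style callback API; the broker address and topic are placeholders):

```python
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    print(f"Connected to broker with result code {rc}")

client = mqtt.Client()           # paho-mqtt 2.x also requires a CallbackAPIVersion
client.on_connect = on_connect
client.connect("your-broker-address", 1883, 60)  # host, port, keepalive (s)
client.loop_start()              # process network traffic on a background thread

# Publish a hand-tracking event (topic and payload are illustrative)
client.publish("handtrack3d/zone", "A:entered")
```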
The system includes built-in performance monitoring:
- Real-time FPS counter
- Hand detection status
- Interaction timer
- Progress tracking
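A lightweight FPS counter of this kind can be derived from per-frame timestamps; the class below is a minimal, self-contained sketch (the name `FPSCounter` is illustrative, not the project's API):

```python
import time

class FPSCounter:
    """Exponentially smoothed frames-per-second estimate."""

    def __init__(self, smoothing: float = 0.9):
        self.smoothing = smoothing
        self.fps = 0.0
        self._last = time.perf_counter()

    def tick(self) -> float:
        """Call once per frame; returns the smoothed FPS."""
        now = time.perf_counter()
        dt, self._last = now - self._last, now
        if dt > 0:
            self.fps = self.smoothing * self.fps + (1 - self.smoothing) / dt
        return self.fps
```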
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
- MediaPipe team for their excellent hand tracking solution
- DepthAnything team for their depth estimation model
- All contributors and supporters of this project
For questions and support, please open an issue or contact the maintainers:
- Email: [email protected]
- LinkedIn: armanruet
Made with ❤️ by Arman