This repository implements real-time image captioning using the BLIP (Bootstrapping Language-Image Pre-training) model. The system captures live video from your webcam, generates a descriptive caption for each frame, and displays it in real time alongside performance metrics.
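As a minimal sketch of the core captioning step (not the repository's actual code, which adds threading and on-screen metrics), captioning a single webcam frame with the Hugging Face transformers API might look like this:

```python
import cv2
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Load the BLIP captioning model (uses the GPU if one is available)
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-large"
).to(device)

# Grab one frame from the default webcam
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("Could not read a frame from the webcam")

# OpenCV returns BGR; BLIP expects an RGB PIL image
image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
inputs = processor(images=image, return_tensors="pt").to(device)
caption_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(caption_ids[0], skip_special_tokens=True))
```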
- Real-Time Video Processing: Seamless webcam feed capture and display with overlaid captions
- State-of-the-Art Captioning: Powered by Salesforce's BLIP image captioning model (blip-image-captioning-large)
- Hardware Acceleration: CUDA support for GPU-accelerated inference
- Performance Monitoring: Live display of:
  - Frame processing speed (FPS)
  - GPU memory usage
  - Processing latency
- Optimized Architecture: Multi-threaded design for smooth video streaming and caption generation
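A simplified sketch of this producer/consumer pattern is shown below; the names (e.g. caption_worker, caption_fn) are illustrative and the threading in BLIP_CAM.py may be organized differently:

```python
import threading
import time

import cv2
import torch

latest_frame = None
latest_caption = "loading..."
frame_lock = threading.Lock()
running = True

def caption_worker(caption_fn):
    """Continuously caption the most recent frame, off the display thread."""
    global latest_caption
    while running:
        with frame_lock:
            frame = None if latest_frame is None else latest_frame.copy()
        if frame is not None:
            latest_caption = caption_fn(frame)  # the slow model call happens here
        time.sleep(0.1)

def run(caption_fn):
    global latest_frame, running
    threading.Thread(target=caption_worker, args=(caption_fn,), daemon=True).start()
    cap = cv2.VideoCapture(0)
    prev = time.time()
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        with frame_lock:
            latest_frame = frame.copy()
        # Per-frame metrics: display FPS and (if available) GPU memory in use
        now = time.time()
        fps = 1.0 / max(now - prev, 1e-6)
        prev = now
        gpu_mb = torch.cuda.memory_allocated() / 1e6 if torch.cuda.is_available() else 0.0
        cv2.putText(frame, f"{latest_caption} | {fps:.1f} FPS | {gpu_mb:.0f} MB",
                    (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        cv2.imshow("BLIP_CAM", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    running = False
    cap.release()
    cv2.destroyAllWindows()
```

Keeping the model call on a separate thread is what keeps the preview smooth: the display loop never blocks on inference and simply reuses the most recent caption.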
- Python 3.8+
- NVIDIA GPU (optional, for CUDA acceleration)
- Webcam
opencv-python>=4.5.0
torch>=1.9.0
transformers>=4.21.0
Pillow>=8.0.0
- Clone the repository:
git clone https://github.com/zawawiAI/BLIP_CAM.git
cd BLIP_CAM
- Install dependencies:
pip install -r requirements.txt
- Run the application:
python BLIP_CAM.py
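Since the GPU is optional, you can quickly confirm whether PyTorch can see a CUDA device before running the app:

```python
import torch

# True means CUDA-accelerated inference is possible; otherwise inference runs on the CPU
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```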
- Accessibility Tools: Real-time scene description for visually impaired users
- Content Analysis: Automated video content understanding and tagging
- Smart Conferencing: Enhanced video calls with automatic scene descriptions
- Educational Tools: Visual learning assistance and scene comprehension
- Security Systems: Intelligent surveillance with scene description capabilities
- Press Q to quit the application
- Press S to save the current frame with caption
- Press P to pause/resume caption generation
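For reference, this kind of shortcut handling is typically built around cv2.waitKey; the sketch below is illustrative and may not match the logic in BLIP_CAM.py exactly:

```python
import time
import cv2

def handle_key(key, frame, paused):
    """Illustrative shortcut handling (Q = quit, S = save frame, P = pause/resume)."""
    if key in (ord("q"), ord("Q")):        # quit the application
        return False, paused
    if key in (ord("s"), ord("S")):        # save the current frame (caption already drawn on it)
        cv2.imwrite(f"capture_{int(time.time())}.jpg", frame)
    elif key in (ord("p"), ord("P")):      # toggle caption generation
        paused = not paused
    return True, paused

# Usage inside the display loop:
#   keep_running, paused = handle_key(cv2.waitKey(1) & 0xFF, frame, paused)
```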
The application can be customized through the following parameters in config.py:
- Frame processing resolution
- Caption update frequency
- GPU memory allocation
- Model confidence threshold
- Display preferences
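A hypothetical config.py illustrating these options (the actual parameter names and defaults in the repository may differ):

```python
# config.py -- illustrative values only; check the repository for the real names and defaults
FRAME_WIDTH = 640              # frame processing resolution
FRAME_HEIGHT = 480
CAPTION_UPDATE_INTERVAL = 0.5  # seconds between caption refreshes
GPU_MEMORY_FRACTION = 0.8      # share of GPU memory the model may claim
CONFIDENCE_THRESHOLD = 0.5     # minimum confidence required to display a caption
SHOW_FPS = True                # display preferences
SHOW_GPU_STATS = True
FONT_SCALE = 0.7
```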
This project is licensed under the MIT License - see the LICENSE file for details.
- Salesforce for the BLIP model
- PyTorch team for the deep learning framework
- Hugging Face for the transformers library
For questions and support, please open an issue in the GitHub repository or reach out to the maintainers.
⭐ If you find this project useful, please consider giving it a star!