This Python application leverages the OpenRouter API and the Meta-Llama model to generate detailed captions for uploaded images.
- User-friendly Gradio interface for image uploads
- Option to generate short or long image descriptions
- Detailed, accurate, and concise image captions
- Powered by the advanced Meta-Llama 3.2 90B Vision Instruct model
- Robust error handling and logging for improved debugging
Before getting started, make sure you have:
- Python 3.7 or higher installed
- An OpenRouter API key
- The following Python libraries:
gradio
,requests
-
Clone this repository:
git clone https://github.com/PierrunoYT/llama-image-captioner.git cd llama-image-captioner
-
Install the required packages:
pip install -r requirements.txt
-
Create a
requirements.txt
file with the following content:gradio requests
-
Set up your environment variables:
- On Windows:
setx OPENROUTER_API_KEY "your_api_key_here" setx YOUR_SITE_URL "https://your-actual-site-url.com" setx YOUR_APP_NAME "Llama Image Captioner"
- On Unix-based systems:
export OPENROUTER_API_KEY="your_api_key_here" export YOUR_SITE_URL="https://your-actual-site-url.com" export YOUR_APP_NAME="Llama Image Captioner"
Note: After setting these environment variables, restart your command prompt or terminal for the changes to take effect.
- On Windows:
To run the application:
python ImageCaption.py
This will launch the Gradio interface. Follow these steps:
- Upload an image using the provided interface.
- Choose the caption length: "Short" for a brief description or "Long" for a detailed analysis.
- Click the submit button to generate the image caption.
- The app will display the generated description based on your chosen length.
- The user uploads an image through the Gradio interface.
- The image is converted to a base64-encoded string.
- A request is sent to the OpenRouter API, which uses the Meta-Llama model.
- The API returns a detailed description of the image.
- The description is displayed to the user through the Gradio interface.
Contributions to this project are welcome! Please follow these steps:
- Fork the repository
- Create a new branch (
git checkout -b feature/AmazingFeature
) - Make your changes
- Commit your changes (
git commit -m 'Add some AmazingFeature'
) - Push to the branch (
git push origin feature/AmazingFeature
) - Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Copyright (c) 2024 PierrunoYT
- OpenRouter for providing the API
- Meta for the Llama model
- Gradio for the user-friendly interface building tools