This repository implements an image captioner for large datasets. Its goal is to streamline the creation of supervised datasets for data augmentation in image-captioning deep learning architectures. The foundation is the MiniGPT-4 framework combined with the pre-trained Vicuna model with 13 billion parameters.
You need a GPU-enabled machine with at least 23 GB of GPU memory.
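A quick way to confirm the available memory (assuming an NVIDIA GPU with the `nvidia-smi` tool installed):

```bash
# Print each GPU's name and total memory in MiB; 23 GB is roughly 23552 MiB
nvidia-smi --query-gpu=name,memory.total --format=csv
```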
```bash
# Clone this repository and the upstream MiniGPT-4 code
git clone https://github.com/neemiasbsilva/MiniGPT-4-image-caption-implementation.git
cd MiniGPT-4-image-caption-implementation
git clone https://github.com/Vision-CAIR/MiniGPT-4.git

# Create and activate the conda environment shipped with MiniGPT-4
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigptv
conda install pandas

# Merge the MiniGPT-4 files into this repository's root
cd ..
mv MiniGPT-4/* .
```
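Before continuing, a minimal sanity check that the environment sees your GPU (assuming `environment.yml` installs PyTorch, as in the upstream MiniGPT-4 setup):

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```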
In the shell file (`run.sh`) you have to specify:

- `data_path`: the path where your image dataset is located.
- `beam_search`: hyperparameter in the range 0 to 10.
- `temperature`: hyperparameter between 0.1 and 1.0.
- `save_path`: the location where your caption dataset will be saved.

A sketch of such a script is shown below.
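This is a minimal sketch of what `run.sh` could look like; the Python entry point (`generate_captions.py`) and flag names are illustrative assumptions, so check the actual script in the repository:

```bash
#!/bin/bash
# Hypothetical run.sh: variable names match the parameters described above
data_path="/path/to/images"        # where your image dataset is located
beam_search=5                      # beam width, range 0 to 10
temperature=0.7                    # sampling temperature, 0.1 to 1.0
save_path="/path/to/captions"      # where the caption dataset is saved

python generate_captions.py \
    --data-path "$data_path" \
    --beam-search "$beam_search" \
    --temperature "$temperature" \
    --save-path "$save_path"
```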
- Download the Vicuna 13B weights.
- Set the LLM path in `minigpt4/configs/models/minigpt4_vicuna0.yaml` at line 15:

  ```yaml
  llama_model: "vicuna"
  ```

- Download the MiniGPT-4 checkpoint model.
- Set the checkpoint path in `eval_configs/minigpt4_eval.yaml` at line 8:

  ```yaml
  ckpt: pretrained_minigpt4.pth
  ```

These two config edits can also be applied from the command line, as sketched below.
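A sketch of both edits using `sed`; the weight paths are placeholders, so substitute wherever you stored the downloads:

```bash
# Point line 15 of the model config at the Vicuna 13B weights (placeholder path)
sed -i 's|llama_model:.*|llama_model: "/path/to/vicuna-13b"|' \
    minigpt4/configs/models/minigpt4_vicuna0.yaml

# Point line 8 of the eval config at the MiniGPT-4 checkpoint (placeholder path)
sed -i 's|ckpt:.*|ckpt: "/path/to/pretrained_minigpt4.pth"|' \
    eval_configs/minigpt4_eval.yaml
```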
Finally, run the captioner:

```bash
sh run.sh
```