Convolutional and Recurrent Neural Networks combined to automatically generate image captions

sdonatti/nd891-project-image-captioning


Udacity Computer Vision Nanodegree

Image Captioning

Combine convolutional and recurrent neural networks to build an automatic image captioning application.

Requirements

  1. Download and install Anaconda Python
  2. Create and activate a Conda environment

Set-up

Clone the project repository

git clone http://github.com/sdonatti/nd891-project-image-captioning

Install required Python packages

cd nd891-project-image-captioning
conda install --file requirements.txt -c pytorch

Install COCO API

git clone http://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py install
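
To confirm that the COCO API installed into the active Conda environment, a quick import check can help. The sketch below is not part of the project; the helper name `is_installed` is hypothetical, and it only assumes that the COCO API installs under the package name `pycocotools`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if module_name can be located by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # pycocotools is the package name the COCO Python API installs under.
    if is_installed("pycocotools"):
        print("COCO API found.")
    else:
        print("COCO API not found; re-run the install in cocoapi/PythonAPI.")
```

Run it with the same Python interpreter (the same Conda environment) that will launch the notebooks, otherwise the result may not reflect what the notebooks see.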

Download the COCO dataset (http://cocodataset.org/#download)

  • Under Annotations, download:
    • 2014 Train/Val annotations [241MB] (extract captions_train2014.json and captions_val2014.json into cocoapi/annotations/)
    • 2014 Testing Image info [1MB] (extract image_info_test2014.json into cocoapi/annotations/)
  • Under Images, download:
    • 2014 Train images [83K/13GB] (extract the train2014 folder into cocoapi/images/)
    • 2014 Val images [41K/6GB] (extract the val2014 folder into cocoapi/images/)
    • 2014 Test images [41K/6GB] (extract the test2014 folder into cocoapi/images/)
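
Because the downloads are large and extracted by hand, it may be worth checking the folder layout before launching the notebooks. The following is a minimal sketch, not part of the project: the names `EXPECTED_PATHS` and `missing_coco_paths` are hypothetical, and the paths are simply the ones listed in the download steps above.

```python
from pathlib import Path

# Paths from the download steps above, relative to the cocoapi folder.
EXPECTED_PATHS = [
    "annotations/captions_train2014.json",
    "annotations/captions_val2014.json",
    "annotations/image_info_test2014.json",
    "images/train2014",
    "images/val2014",
    "images/test2014",
]


def missing_coco_paths(cocoapi_root):
    """Return the expected files and folders that are absent under cocoapi_root."""
    root = Path(cocoapi_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    # Assumption: the script is run from the folder that contains cocoapi/.
    missing = missing_coco_paths("cocoapi")
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("COCO dataset layout looks complete.")
```

The check only tests that each path exists; it does not verify that the archives were fully extracted or that the image counts match.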

Launch the project Jupyter Notebooks

cd ../../
jupyter notebook

License

This project is licensed under the MIT License.
