This is a Deep Learning model used to generate description of an image. A CNN(InceptionV3 model) is used as an encoder and LSTM as a decoder where BLEU is used as an evaluation metric. The dataset used is MS-COCO. This project was a part of the course "Introduction to Deep Learning" provided by National Research University Higher School of Economics on Coursera.