Project for the Deep Learning & Applied AI course, La Sapienza Università di Roma
This study focuses on generative adversarial networks (GANs), first presented in 2014, as a method for generating new images. The first chapter presents the mathematics behind the technique, and the second describes more concretely how GANs work, covering both the benefits they have brought and the problems they suffer from. We then review several of the GAN variants proposed so far (DCGAN, LSGAN, WGAN, CDCGAN) and the changes from one to the next. Finally, starting from a chosen use case, LEGO minifigures, we examine how these models behave when generating new samples and report the results of the experiments carried out.
LEGO face DATASET | LEGO minifigure DATASET |
---|---|
Deep Convolutional Generative Adversarial Network
Alec Radford, Luke Metz, Soumith Chintala
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.
[Paper] [Notebook-face] [Notebook-minifigure]
LEGO face DCGAN | Loss plot |
---|---|
LEGO minifigure DCGAN | Loss plot |
---|---|
Least Squares Generative Adversarial Networks
Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, Stephen Paul Smolley
Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs) which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN yields minimizing the Pearson χ² divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs perform more stable during the learning process. We evaluate LSGANs on five scene datasets and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
[Paper] [Notebook-face] [Notebook-minifigure]
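To make the least squares objective from the abstract concrete, here is a minimal pure-Python sketch. It is not taken from the notebooks. It assumes the 0/1 target coding (`a = 0` for fake, `b = 1` for real, `c = 1` as the generator's target) and treats discriminator outputs as plain lists of floats. The function names are hypothetical. The two gradient helpers illustrate the vanishing-gradient argument: for a fake sample that the discriminator already scores as very "real", the sigmoid cross entropy gradient shrinks toward zero, while the least squares gradient grows with the distance from the target.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def lsgan_d_loss(real_scores, fake_scores, a=0.0, b=1.0):
    """Discriminator loss: push real scores toward b and fake scores toward a."""
    real_term = sum((s - b) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum((s - a) ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_g_loss(fake_scores, c=1.0):
    """Generator loss: push the scores of generated samples toward c."""
    return 0.5 * sum((s - c) ** 2 for s in fake_scores) / len(fake_scores)


def sigmoid_ce_g_grad(s):
    """d/ds of -log(sigmoid(s)), the non-saturating generator loss on a logit s."""
    return -(1.0 - sigmoid(s))


def lsgan_g_grad(s, c=1.0):
    """d/ds of 0.5 * (s - c)^2, the least squares generator loss on a score s."""
    return s - c


if __name__ == "__main__":
    # A fake sample scored far on the "real" side of the decision boundary.
    s = 10.0
    print("sigmoid cross entropy gradient:", sigmoid_ce_g_grad(s))
    print("least squares gradient:        ", lsgan_g_grad(s))
```

In a real training loop the scores would come from the discriminator network and the gradients from automatic differentiation; the closed forms above are only meant to show why the penalty keeps growing for samples that lie far from the target value.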
LEGO face LSGAN | Loss plot |
---|---|
LEGO minifigure LSGAN | Loss plot |
---|---|
Wasserstein GAN
Martin Arjovsky, Soumith Chintala, Léon Bottou
We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches. Furthermore, we show that the corresponding optimization problem is sound, and provide extensive theoretical work highlighting the deep connections to other distances between distributions.
[Paper] [Notebook-face] [Notebook-minifigure]
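As a rough illustration of how WGAN training differs from a regular GAN, the sketch below writes the critic and generator losses on plain lists of critic scores, together with the weight clipping step used to keep the critic approximately Lipschitz. It is a simplified sketch, not the code of the notebooks: the function names are hypothetical, and the clipping threshold `0.01` is assumed here as a typical default.

```python
def mean(values):
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Loss minimized by the critic.

    Minimizing mean(fake) - mean(real) maximizes the score gap between
    real and generated samples, which estimates the Wasserstein distance.
    """
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """Loss minimized by the generator: raise the critic score of fakes."""
    return -mean(fake_scores)


def clip_weights(weights, c=0.01):
    """Clamp every critic weight to [-c, c] after each critic update."""
    return [max(-c, min(c, w)) for w in weights]


if __name__ == "__main__":
    real = [2.0, 4.0]
    fake = [1.0, 1.0]
    print("critic loss:   ", critic_loss(real, fake))
    print("generator loss:", generator_loss(fake))
    print("clipped:       ", clip_weights([0.5, -0.5, 0.005]))
```

Because the critic outputs an unbounded score rather than a probability, there is no sigmoid and no logarithm in either loss. The negative of the critic loss is the quantity usually plotted as the "meaningful learning curve" mentioned in the abstract.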
LEGO face WGAN | Loss plot |
---|---|
LEGO minifigure WGAN | Loss plot |
---|---|
Conditional Deep Convolutional Generative Adversarial Nets
Mehdi Mirza, Simon Osindero
Generative Adversarial Nets [8] were recently introduced as a novel way to train generative models. In this work we introduce the conditional version of generative adversarial nets, which can be constructed by simply feeding the data, y, we wish to condition on to both the generator and discriminator. We show that this model can generate MNIST digits conditioned on class labels. We also illustrate how this model could be used to learn a multi-modal model, and provide preliminary examples of an application to image tagging in which we demonstrate how this approach can generate descriptive tags which are not part of training labels.
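The conditioning step described in the abstract, feeding `y` to both the generator and the discriminator, can be sketched in a few lines. The example below is a minimal pure-Python illustration under stated assumptions: the label is one-hot encoded and simply concatenated to the noise vector and to the flattened image. The function names, the noise size of 100 and the 10 classes are hypothetical choices for the example, not values taken from the notebooks. In a convolutional variant the label is often injected differently (for example as extra input channels), which this sketch does not cover.

```python
import random


def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(z, label, num_classes):
    """Concatenate the noise vector z with the condition y."""
    return list(z) + one_hot(label, num_classes)


def discriminator_input(x_flat, label, num_classes):
    """Concatenate a flattened image with the same condition y."""
    return list(x_flat) + one_hot(label, num_classes)


if __name__ == "__main__":
    num_classes = 10
    z = [random.gauss(0.0, 1.0) for _ in range(100)]
    g_in = generator_input(z, label=3, num_classes=num_classes)
    print(len(g_in))              # noise size plus number of classes
    print(g_in[-num_classes:])    # the one-hot condition appended to z
```

Since the discriminator sees the label alongside the image, it can reject a realistic sample that does not match its condition, which is what pushes the generator to respect `y`.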
LEGO face CGAN | Loss plot |
---|---|