Autoencoders (U-Net) and DCGANs for CIFAR-10 colourization.

m4mbo/recolor_cifar10


Generative Approaches to CIFAR10 Colourization

This repository showcases two approaches to colourizing CIFAR10 images: a U-Net autoencoder and a conditional GAN. Over the last decade, automatic colourization has been studied thoroughly because of its many applications, such as colourizing grayscale images and restoring aged or degraded images. The problem is highly ill-posed because of the very large number of degrees of freedom in assigning colour information. In this approach, I attempted to fully generalize this procedure using a conditional Deep Convolutional Generative Adversarial Network (DCGAN). The network is trained on the publicly available CIFAR10 dataset.

Architecture

The GAN uses a U-Net-like fully convolutional architecture (Ronneberger et al., 2015), with concatenation of mirrored encoder and decoder layers, for the generator. The generator loss is L1-regularized, which forces the generator to produce results that are close to the ground truth at the pixel level. In theory, this preserves the structure of the original images and prevents the generator from assigning arbitrary colours to pixels just to fool the discriminator. The generator takes the grayscale image as input, while the discriminator takes either the original image or the generated image, together with the condition, which here is the grayscale image.
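As a rough illustration of the L1-regularized generator objective described above, here is a minimal framework-free sketch. It is not taken from this repository's code: the function names, the flat-list image representation, and the default `l1_weight=100.0` are all assumptions made for the example, and a real training loop would compute the same quantities on tensors in a deep learning framework.

```python
import math


def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for a single discriminator probability."""
    # Clamp to avoid log(0) when the discriminator is fully confident.
    p = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def l1_distance(generated, ground_truth):
    """Mean absolute difference between two equally sized flat pixel lists."""
    if len(generated) != len(ground_truth):
        raise ValueError("images must have the same number of values")
    total = sum(abs(g - t) for g, t in zip(generated, ground_truth))
    return total / len(generated)


def generator_loss(d_prob_fake, generated, ground_truth, l1_weight=100.0):
    """Adversarial term plus a weighted pixel-level L1 penalty.

    d_prob_fake: discriminator's probability that the generated image
                 (paired with its grayscale condition) is real.
    l1_weight:   hypothetical weighting factor, assumed for illustration.
    """
    # The generator wants the discriminator to output "real" (target 1).
    adversarial = bce(d_prob_fake, 1.0)
    # The L1 term pulls the output toward the ground-truth colours.
    return adversarial + l1_weight * l1_distance(generated, ground_truth)


if __name__ == "__main__":
    truth = [0.0, 1.0, 0.5, 0.25]
    close = [0.0, 0.9, 0.5, 0.25]
    far = [1.0, 0.0, 0.0, 1.0]
    # With the same discriminator score, the output closer to the
    # ground truth receives the lower generator loss.
    print(generator_loss(0.5, close, truth))
    print(generator_loss(0.5, far, truth))
```

With the L1 term included, two outputs that fool the discriminator equally well are no longer equivalent: the one nearer the ground truth at the pixel level is preferred, which is the structure-preserving effect described above.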

U-Net Architecture. Gray arrows represent the skip connections from encoder to the mirroring decoder layer.

DCGAN

Autoencoder

Summary

The GAN-generated images showed a clear visual improvement over those generated by the U-Net alone. They were more vibrant, whereas the results from the autoencoder suffered from a light hue, a phenomenon referred to as the "sepia effect". In some cases, the GAN was able to nearly replicate the ground truth. It even colourized reasonably well one of the images where the ground truth was "close" to grayscale. Nonetheless, both models were trained for only 20 epochs, and the loss appeared to be still decreasing in the last epoch, so they probably need more training to converge.

Credits

  • m4mbo - Code
  • LMH summer program on 'AI and ML: Advanced Applications' - Theory