
soumik12345/Pix2Pix


Pix2Pix


TensorFlow 2.0 implementation of the paper Image-to-Image Translation with Conditional Adversarial Networks by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou and Alexei A. Efros.

Architecture

Generator

  • The generator is a U-Net-like model with skip connections between the encoder and the decoder.
  • Each encoder block is Convolution -> BatchNormalization -> Activation (LeakyReLU)
  • Each decoder block is Conv2DTranspose -> BatchNormalization -> Dropout (optional) -> Activation (ReLU)

Generator Architecture
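The encoder and decoder blocks described above can be sketched in TensorFlow 2.x as below. This is a minimal sketch, not the repository's exact code; the 4x4 kernels, stride 2, and 0.5 dropout rate follow the common pix2pix setup and are assumptions here.

```python
import tensorflow as tf


def encoder_block(filters, apply_batchnorm=True):
    """Downsampling block: Convolution -> BatchNormalization -> LeakyReLU."""
    block = tf.keras.Sequential()
    block.add(tf.keras.layers.Conv2D(
        filters, 4, strides=2, padding='same', use_bias=False))
    if apply_batchnorm:
        block.add(tf.keras.layers.BatchNormalization())
    block.add(tf.keras.layers.LeakyReLU())
    return block


def decoder_block(filters, apply_dropout=False):
    """Upsampling block: Conv2DTranspose -> BatchNormalization
    -> Dropout (optional) -> ReLU."""
    block = tf.keras.Sequential()
    block.add(tf.keras.layers.Conv2DTranspose(
        filters, 4, strides=2, padding='same', use_bias=False))
    block.add(tf.keras.layers.BatchNormalization())
    if apply_dropout:
        block.add(tf.keras.layers.Dropout(0.5))
    block.add(tf.keras.layers.ReLU())
    return block
```

In the full U-Net generator, each encoder output is concatenated with the matching decoder input to form the skip connections.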

Discriminator

  • PatchGAN discriminator
  • Each discriminator block is Convolution -> BatchNormalization -> Activation (LeakyReLU)

Discriminator Architecture
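A PatchGAN discriminator classifies each local patch of the (input, target) pair as real or fake rather than producing a single scalar. A minimal sketch under the usual pix2pix assumptions (256x256 inputs, 4x4 kernels, three stride-2 blocks, 30x30 output patch map); the layer sizes here are illustrative, not taken from this repository:

```python
import tensorflow as tf


def discriminator_block(filters):
    """Convolution -> BatchNormalization -> LeakyReLU."""
    block = tf.keras.Sequential()
    block.add(tf.keras.layers.Conv2D(
        filters, 4, strides=2, padding='same', use_bias=False))
    block.add(tf.keras.layers.BatchNormalization())
    block.add(tf.keras.layers.LeakyReLU())
    return block


def patchgan_discriminator(image_size=256):
    # The discriminator is conditional: it sees the input image and the
    # (real or generated) target image concatenated along the channel axis.
    inp = tf.keras.layers.Input(shape=[image_size, image_size, 3])
    tar = tf.keras.layers.Input(shape=[image_size, image_size, 3])
    x = tf.keras.layers.Concatenate()([inp, tar])
    for filters in (64, 128, 256):  # three stride-2 downsampling blocks
        x = discriminator_block(filters)(x)
    x = tf.keras.layers.ZeroPadding2D()(x)
    x = tf.keras.layers.Conv2D(512, 4, strides=1, use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.LeakyReLU()(x)
    x = tf.keras.layers.ZeroPadding2D()(x)
    # Each unit of this map judges one receptive-field patch as real/fake.
    patch_map = tf.keras.layers.Conv2D(1, 4, strides=1)(x)
    return tf.keras.Model(inputs=[inp, tar], outputs=patch_map)
```

For a 256x256 input this yields a 30x30x1 map of per-patch logits.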

Loss Functions

Generator Loss

Generator Loss Equation

The loss function boils down to

Loss = GAN_Loss + Lambda * L1_Loss

where GAN_Loss is a sigmoid cross-entropy loss and Lambda = 100 (the value chosen by the authors).
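The combined generator loss above can be sketched as follows; argument names are illustrative, and `disc_generated_output` is assumed to be the discriminator's raw logits on generated images:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
LAMBDA = 100  # weight on the L1 term, as chosen by the authors


def generator_loss(disc_generated_output, generated_images, target_images):
    # GAN term: the generator wants the discriminator to label its
    # outputs as real, i.e. an array of ones.
    gan_loss = bce(tf.ones_like(disc_generated_output), disc_generated_output)
    # L1 term: mean absolute pixel difference from the ground truth.
    l1_loss = tf.reduce_mean(tf.abs(target_images - generated_images))
    return gan_loss + LAMBDA * l1_loss
```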

Discriminator Loss

The discriminator loss can be written as

Loss = disc_loss(real_images, array of ones) + disc_loss(generated_images, array of zeros)

where disc_loss is a sigmoid cross-entropy loss.
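The two terms above can be sketched directly; the arguments are assumed to be the discriminator's raw logits on real and generated images:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)


def discriminator_loss(disc_real_output, disc_generated_output):
    # Real images should be classified as real (an array of ones)...
    real_loss = bce(tf.ones_like(disc_real_output), disc_real_output)
    # ...and generated images as fake (an array of zeros).
    generated_loss = bce(tf.zeros_like(disc_generated_output),
                         disc_generated_output)
    return real_loss + generated_loss
```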

Experiments with Standard Architecture

Resource Credits: Trained on Nvidia Quadro M4000 provided by Paperspace Gradient.

Dataset: Facades

Result:

Experiment 1 Result

Resource Credits: Trained on Nvidia Quadro P5000 provided by Paperspace Gradient.

Dataset: Maps

Result:

Experiment 2 Result

Resource Credits: Trained on Nvidia Tesla V100 provided by DeepWrex Technologies.

Dataset: Cityscapes

Result:

Experiment 3 Result

Experiments with Mish Activation Function

Resource Credits: Trained on Nvidia Quadro P5000 provided by Paperspace Gradient.

Dataset: Facades

Generator Architecture:

  • The generator is a U-Net-like model with skip connections between the encoder and the decoder.
  • Each encoder block is Convolution -> BatchNormalization -> Activation (Mish)
  • Each decoder block is Conv2DTranspose -> BatchNormalization -> Dropout (optional) -> Activation (Mish)

Discriminator:

  • PatchGAN discriminator
  • Each discriminator block is Convolution -> BatchNormalization -> Activation (Mish)
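Mish (Misra, 2019) is the smooth activation x * tanh(softplus(x)). TensorFlow has no built-in `mish` layer activation, so a drop-in sketch (the function name is ours) could look like:

```python
import tensorflow as tf


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * tf.math.tanh(tf.math.softplus(x))


# Usage sketch: swap it in wherever LeakyReLU/ReLU appeared, e.g.
# block.add(tf.keras.layers.Activation(mish))
```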

Result:

Experiment 1 Mish Result

Resource Credits: Trained on Nvidia Tesla P100 provided by Google Colab.

Dataset: Facades

Generator Architecture:

  • The generator is a U-Net-like model with skip connections between the encoder and the decoder.
  • Each encoder block is Convolution -> BatchNormalization -> Activation (Mish)
  • Each decoder block is Conv2DTranspose -> BatchNormalization -> Dropout (optional) -> Activation (Mish)

Discriminator:

  • PatchGAN discriminator
  • Each discriminator block is Convolution -> BatchNormalization -> Activation (ReLU)

Result:

Experiment 2 Mish Result

Resource Credits: Trained on Nvidia Quadro P5000 provided by Paperspace Gradient.

Dataset: Facades

Generator Architecture:

  • The generator is a U-Net-like model with skip connections between the encoder and the decoder.
  • Each encoder block is Convolution -> BatchNormalization -> Activation (Mish)
  • Decoder blocks are Conv2DTranspose -> BatchNormalization -> Dropout (optional) -> Activation (Mish), except the first three blocks, which are Conv2DTranspose -> BatchNormalization -> Dropout (optional) -> Activation (ReLU)

Discriminator:

  • PatchGAN discriminator
  • Each discriminator block is Convolution -> BatchNormalization -> Activation (ReLU)

Result:

Experiment 3 Mish Result

References

All the sources consulted while building this codebase are listed below: