Image Enhancement - Project 1 for the course "CS7180 Advanced Perception"
SRCNN v2 model with parallel input convolution layer and concatenation
We combined ideas from two papers to obtain better results for single-image super-resolution, and added our own changes to the network structure. To keep the design fairly simple, we took the SRCNN network as our baseline and improved on that existing model. We use the VGG perceptual loss instead of the standalone MSE loss used in the SRCNN paper. We also apply several input convolutions with different filter sizes in parallel and concatenate their outputs, so that the network captures features at different scales. We trained the model on 32 × 32 patches from 5 images and validated it using 50 original 2x bicubic-interpolated images from the DIV2K2017 dataset. We compared the 3 variations of SRCNN built from the parallel input convolutions and the perceptual loss, and plotted the PSNR graph for each. The results show that using the perceptual loss alone increases the training PSNR from 40.986 to 41.583 dB, and that adding the parallel input convolution layers increases it further from 41.583 to 45.027 dB.
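For readers unfamiliar with the metric, the sketch below shows how PSNR is conventionally computed and what the reported gains mean in terms of mean squared error. It is an illustration only and is not taken from the notebook: the function names `mse`, `psnr`, and `mse_ratio_for_gain` are hypothetical, and the peak value of 1.0 assumes pixel intensities normalized to [0, 1], which the project may or may not use.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two images given as nested lists of rows."""
    flat_ref = [p for row in reference for p in row]
    flat_est = [p for row in estimate for p in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("images must contain the same number of pixels")
    return sum((r - e) ** 2 for r, e in zip(flat_ref, flat_est)) / len(flat_ref)


def psnr(reference, estimate, max_value=1.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE).

    max_value=1.0 is an assumption (pixels scaled to [0, 1]);
    use 255 for 8-bit images.
    """
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(psnr_before, psnr_after):
    """Factor by which the MSE shrinks for a given PSNR change (same peak value)."""
    return 10 ** (-(psnr_after - psnr_before) / 10)


if __name__ == "__main__":
    # PSNR values reported above: perceptual loss, then parallel input convolutions.
    print(mse_ratio_for_gain(40.986, 41.583))
    print(mse_ratio_for_gain(41.583, 45.027))
```

Because PSNR is logarithmic, a fixed gain in dB corresponds to a fixed multiplicative reduction in MSE, which is why the second step (about 3.4 dB) represents a much larger error reduction than the first (about 0.6 dB).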
- Anirudh Muthuswamy
- Gugan Kathiresan
- Google Colab Ubuntu - Tesla T4 GPU
- macOS for file management
In your command line:
cd /path/to/your/project
pip install -r requirements.txt
- All experiments and the code demo were run inside the ipynb file named "Master-AnirudhGugan-cs7180.ipynb".
- The notebook is written in such a way that each section corresponds to a component of the project, and the user can just Run-all to understand the demo.
- While this is an ipynb file, each operation was defined as a user-defined function and called only in relevant locations.
- Certain aspects like Datasets and Model definition have been written in a class structure for easy inheritance and command line interfacing.
- The notebook was written to minimize code duplication and to follow software engineering principles.
- At the same time, the ipynb format was preferred because it displays outputs and visualizations inline, which is useful from a research perspective.
- Download our version of the DIV2K dataset from https://www.kaggle.com/datasets/anirudhmuthuswamy/div2k-hr-and-lr
- Unzip the files and place them in the working directory.
- Install all requirements from the "requirements.txt" file.
- The command line instructions to replicate the results of the ipynb file are available in the main.py file. Run the commands as listed.