
Training BatchNorm and Only BatchNorm: Affine parameter effects in MLPs

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contact
  6. Acknowledgments

About The Project

Part of a project in the Deep Learning and Applied AI course at Sapienza University of Rome, spring 2022. The starting point was the paper "Training BatchNorm and Only BatchNorm", which investigated the effect of freezing everything except the batch normalization layers in residual neural networks. This project runs comparable experiments with MLPs of varying dimensions on MNIST (the implementation also works for CIFAR-10). A shallow LeNet-style CNN is also included, mainly to observe the effect on a shallow non-residual CNN, but also to see how hyperparameter tuning affects BatchNorm-only performance.

The findings suggest that training only the BatchNorm parameters does give better performance than training the same number of randomly selected parameters, at least beyond a certain parameter count, and the advantage generally grows as the number of parameters increases from there.
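
For concreteness, the BatchNorm-only setup can be sketched in Keras as follows: build an MLP with a BatchNormalization layer after each Dense layer, then freeze every layer except the BatchNormalization ones so that only the affine parameters gamma and beta remain trainable. This is a minimal sketch, not the notebooks' exact code; the layer sizes and the build_mlp name are assumptions.

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    def build_mlp(input_shape=(28, 28), hidden_units=(256, 256), num_classes=10):
        # Plain MLP: Dense -> BatchNorm -> ReLU blocks, then a softmax head.
        model = keras.Sequential([layers.Flatten(input_shape=input_shape)])
        for units in hidden_units:
            model.add(layers.Dense(units))
            model.add(layers.BatchNormalization())
            model.add(layers.Activation("relu"))
        model.add(layers.Dense(num_classes, activation="softmax"))
        return model

    model = build_mlp()

    # Freeze everything except BatchNorm, so only gamma and beta are trained.
    for layer in model.layers:
        if not isinstance(layer, layers.BatchNormalization):
            layer.trainable = False

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    print("Trainable parameters:", sum(int(tf.size(w)) for w in model.trainable_weights))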

Getting Started

The notebooks are standard TensorFlow/Keras Jupyter notebooks. For more background on what they investigate, I recommend reading the paper on training BatchNorm and only BatchNorm (link in the acknowledgments).

Prerequisites

The notebooks run with Jupyter and TensorFlow/Keras. They also work fine in Google Colab.

  • Jupyter notebook
  • Tensorflow
    pip install tensorflow
  • Keras Tuner if you want to run the tuning sections. If this dependency becomes an issue, feel free to comment the tuning out.
    pip install keras-tuner --upgrade

(back to top)

Usage

There are two notebooks, one for each architecture (LeNet CNN and MLPs). Each notebook has a tuning section at the bottom, commented out by default. For this project the MLP notebook is the interesting one.

The dataset (MNIST or CIFAR-10) is selected by setting a variable at the top of the notebook, roughly as in the sketch below.
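
As an illustration (the actual variable name and preprocessing in the notebooks may differ), the switch could look like this:

    from tensorflow import keras

    DATASET = "mnist"  # set to "cifar10" to run on CIFAR-10

    if DATASET == "mnist":
        (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    else:
        (x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

    # Scale pixel values to [0, 1].
    x_train, x_test = x_train / 255.0, x_test / 255.0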

(back to top)

Roadmap

Self-evaluation of further work that could be done on this project.

  • Optimize the random parameter freezing/unfreezing. At the moment only Keras for R supports freezing individual weights; if that lands in Python it could easily speed up the runtime for the larger nets. A possible gradient-masking workaround is sketched after this list.
  • More rigorous MLP architecture design. As it stands, the layer dimensions and contents are fairly simple and were picked somewhat arbitrarily to get initial results.
  • Testing and tuning of more hyperparameters, including activation function placement (before or after batch normalization) and batch size.
  • Further experiments with other datasets, also extending beyond computer vision.
  • Experiment with other architectures.
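
Regarding the first roadmap item, one possible workaround in Python is to emulate per-weight freezing by masking gradients in a custom train_step, so that only a fixed random subset of weights is ever updated. This is an illustrative sketch assuming the TensorFlow 2.x Keras train_step API (compiled_loss/compiled_metrics), not the repository's current implementation; the class name and the trainable_fraction parameter are made up for the example.

    import tensorflow as tf
    from tensorflow import keras

    class RandomSubsetModel(keras.Model):
        """Trains only a fixed random subset of the wrapped model's weights."""

        def __init__(self, base_model, trainable_fraction=0.01, seed=0):
            super().__init__()
            # base_model must already be built, so its weights exist.
            self.base = base_model
            rng = tf.random.Generator.from_seed(seed)
            # One fixed binary mask per trainable weight tensor.
            self.masks = [
                tf.cast(rng.uniform(w.shape) < trainable_fraction, w.dtype)
                for w in base_model.trainable_weights
            ]

        def call(self, inputs, training=False):
            return self.base(inputs, training=training)

        def train_step(self, data):
            x, y = data
            with tf.GradientTape() as tape:
                y_pred = self(x, training=True)
                loss = self.compiled_loss(y, y_pred)
            grads = tape.gradient(loss, self.base.trainable_weights)
            # Zero out the gradients of the "frozen" weights before applying them.
            masked = [g * m for g, m in zip(grads, self.masks)]
            self.optimizer.apply_gradients(zip(masked, self.base.trainable_weights))
            self.compiled_metrics.update_state(y, y_pred)
            return {m.name: m.result() for m in self.metrics}

Usage would follow the normal pattern: wrap a built model, then compile and fit the wrapper as usual. The masking happens inside the training step, so the whole update stays in the compiled graph.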

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contact

Your Name - [email protected]

Project Link: https://github.com/marcusntnu/mlp_lenet_bathnorm

(back to top)

Acknowledgments

  • Frankle, Schwab, and Morcos, "Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs", arXiv:2003.00152 (https://arxiv.org/abs/2003.00152)

(back to top)
