In this notebook, images of everyday objects such as vehicles, animals, and birds are classified using convolutional neural networks. It implements the ResNet9 and ResNet18 architectures on the CIFAR-10 dataset.
Dataset Link: Dataset
CIFAR-10 consists of 60,000 32x32 color images across 10 classes, each class representing a common everyday object, with 6,000 images per class.
In the notebook, the dataset is imported using torchvision from PyTorch.
PyTorch
NumPy
Matplotlib
PIL (Python Imaging Library, for working with images)
- The dataset has 10 classes, each with the same number of images (6,000 per class), so there is no class imbalance.
- The model architectures are defined in separate classes and composed using PyTorch's Sequential API.
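As a sketch of this style, a ResNet9-like model can be composed from small Sequential blocks. The channel sizes and block layout below are assumptions based on a common ResNet9 design, not necessarily the notebook's exact architecture:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, pool=False):
    """Conv -> BatchNorm -> ReLU, with optional 2x2 max-pooling."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
              nn.BatchNorm2d(out_ch),
              nn.ReLU(inplace=True)]
    if pool:
        layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

class ResNet9(nn.Module):
    """ResNet9-style model built from Sequential blocks (illustrative sizes)."""
    def __init__(self, in_channels=3, num_classes=10):
        super().__init__()
        self.conv1 = conv_block(in_channels, 64)
        self.conv2 = conv_block(64, 128, pool=True)
        self.res1 = nn.Sequential(conv_block(128, 128), conv_block(128, 128))
        self.conv3 = conv_block(128, 256, pool=True)
        self.conv4 = conv_block(256, 512, pool=True)
        self.res2 = nn.Sequential(conv_block(512, 512), conv_block(512, 512))
        self.classifier = nn.Sequential(nn.MaxPool2d(4), nn.Flatten(),
                                        nn.Linear(512, num_classes))

    def forward(self, x):
        x = self.conv2(self.conv1(x))
        x = self.res1(x) + x          # residual (skip) connection
        x = self.conv4(self.conv3(x))
        x = self.res2(x) + x
        return self.classifier(x)
```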
A GPU is used for training when one is available.
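A minimal sketch of device selection, with a small helper (the helper name is an assumption, not necessarily the notebook's):

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def to_device(data, device):
    """Move a tensor, or a list/tuple of tensors, to the chosen device."""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)
```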
Some methods used to speed up training:
Learning Rate Scheduling: the learning rate is changed after each batch of training. Here the "One Cycle" learning-rate policy is used, which increases the learning rate for roughly the first 30% of the schedule and then decreases it.
Weight Decay: regularizes the weights by adding an extra term to the loss function, preventing them from growing too large.
Gradient Clipping: restricts gradient values to a small range to avoid large, destabilizing parameter updates.
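The three techniques above can be combined in a single training loop. The sketch below uses PyTorch's OneCycleLR scheduler, the optimizer's weight_decay argument, and clip_grad_value_; all hyperparameter values are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fit(model, train_dl, epochs=8, max_lr=0.01,
        weight_decay=1e-4, grad_clip=0.1):
    """Training-loop sketch: one-cycle LR + weight decay + gradient clipping."""
    # Weight decay is passed straight to the optimizer.
    optimizer = torch.optim.Adam(model.parameters(), max_lr,
                                 weight_decay=weight_decay)
    # One-cycle policy: LR ramps up for ~30% of steps, then anneals down.
    sched = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr, epochs=epochs,
        steps_per_epoch=len(train_dl), pct_start=0.3)
    for _ in range(epochs):
        model.train()
        for images, labels in train_dl:
            loss = F.cross_entropy(model(images), labels)
            loss.backward()
            # Clip gradients to [-grad_clip, grad_clip] before the update.
            nn.utils.clip_grad_value_(model.parameters(), grad_clip)
            optimizer.step()
            optimizer.zero_grad()
            sched.step()  # the one-cycle LR is updated after every batch
```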
The slight abruptness in the curve is likely due to the scheduled changes in learning rate.
Cross-Entropy Loss, which combines log_softmax with the negative log-likelihood (NLL) loss, is the standard choice for classification problems, compared to plain NLL loss. Reference: Cross_Entropy_Loss
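This equivalence can be checked directly: in PyTorch, F.cross_entropy on raw logits matches applying log_softmax followed by F.nll_loss.

```python
import torch
import torch.nn.functional as F

# cross_entropy(logits, y) == nll_loss(log_softmax(logits), y)
logits = torch.tensor([[2.0, 0.5, -1.0]])  # raw (unnormalized) scores
target = torch.tensor([0])                 # correct class index

ce = F.cross_entropy(logits, target)
nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
assert torch.allclose(ce, nll)
```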
Adam - Converges faster than SGD.
The model reaches around 90% validation accuracy and 89% test accuracy.