This generic, terminal-based image classifier was the final project of the Deep Learning section of Udacity's Data Science Nanodegree. It can learn from any set of labeled images and then be used to label new images. The user can set the hyper-parameters from the command line and can choose the base model from the architectures available in the torchvision.models library.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
The necessary Python libraries are:
torch
torchvision
numpy
To install Torch and Torchvision using Anaconda with CUDA 9.2, the latest version at the time of writing:
conda install pytorch cuda92 -c pytorch
conda install torchvision -c soumith
Using pip (the wheel below targets Python 3.6 on 64-bit Windows with CUDA 9.2):
pip install http://download.pytorch.org/whl/cu92/torch-0.4.1-cp36-cp36m-win_amd64.whl
pip install torchvision
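After either install route, a quick check that the three required libraries are importable can save a failed training run later. The snippet below is a minimal sketch that uses only the standard library; the helper name `missing_packages` is hypothetical and is not part of this project.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The three libraries this README lists as requirements.
    missing = missing_packages(["torch", "torchvision", "numpy"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only confirms that the packages can be found; it does not check their versions or whether a GPU is usable.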
The terminal commands below install the necessary packages from Git Bash and train a convolutional network on Oxford University's flower data using the default values.
cd ~
conda install pytorch cuda92 -c pytorch
pip install torchvision
git clone "https://github.com/beefupinetree/dsnd-image-classifier.git"
cd dsnd-image-classifier
wget "https://s3.amazonaws.com/content.udacity-data.com/nd089/flower_data.tar.gz"
tar xzvf flower_data.tar.gz
rm flower_data.tar.gz
mkdir flowers
mv -t flowers test train valid
python train.py flowers
You can then use your newly trained model to classify any JPG picture, including the ones in the 'test' folder.
python predict.py "$(pwd)/flowers/test/59/image_05020.jpg" checkpoint --top_k 5 --gpu
You must have a collection of labeled images, split into 'test', 'train', and 'valid' folders. Each of those folders must contain one numbered sub-folder per category. For example, to classify cats and dogs, the folder structure must be as follows:
- cats_dogs
  - test
    - 1
    - 2
  - train
    - 1
    - 2
  - valid
    - 1
    - 2
The folders labeled '1' contain only pictures of cats, and the folders labeled '2' contain only pictures of dogs.
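Before starting a long training run, it can help to confirm that a dataset follows the layout described above. The following is a minimal sketch under that assumption; `check_layout` is a hypothetical helper, not a function provided by this project, and it only checks that each split folder exists and contains at least one class folder.

```python
from pathlib import Path

# The three split folders the README requires.
SPLITS = ("train", "valid", "test")


def check_layout(data_dir):
    """Return a list of problems found in the dataset folder layout."""
    root = Path(data_dir)
    problems = []
    for split in SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing split folder: {split}")
            continue
        # Each sub-folder of a split holds the images of one category.
        classes = [p for p in split_dir.iterdir() if p.is_dir()]
        if not classes:
            problems.append(f"no class folders in: {split}")
    return problems
```

Calling `check_layout("cats_dogs")` on a correctly organized folder returns an empty list.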
Next, we need a JSON file that maps each folder number to the appropriate label. In this example it would be:
{"1": "Cat", "2": "Dog"}
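For larger datasets, writing this file by hand is error-prone, so it may be easier to generate it. The sketch below uses only the standard `json` module; the function names are hypothetical, and the file name `cat_to_name.json` is simply the one used in the prediction example in this README.

```python
import json


def write_category_names(mapping, path):
    """Write a {folder number: label} mapping to `path` as JSON."""
    # JSON object keys are always strings, so 1 is stored as "1".
    as_strings = {str(key): label for key, label in mapping.items()}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(as_strings, handle, indent=2)


def read_category_names(path):
    """Load the mapping back as a dict with string keys."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    write_category_names({1: "Cat", 2: "Dog"}, "cat_to_name.json")
```

Because the keys are strings, they line up with the folder names '1' and '2' rather than with integers.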
To train the model on your data, run the following command in the shell:
python train.py <command> [options]
The command has one mandatory argument:
data_dir               Directory with the labeled images within separate folders for 'test, train, valid'  [string]
The options for training are:
-s, --save_dir         Directory where the trained model is saved      [string] [default: current directory]
-a, --arch             Architecture of convolutional neural network    [string] [default: vgg19]
-l, --learning_rate    Learning rate of the optimizer                  [float]  [default: 0.001]
--hidden_units         Number of neurons per hidden layer              [int]    [default: 1000]
-e, --epochs           Number of training epochs                       [int]    [default: 3]
--gpu                  Automatically selects gpu for training, if available
Examples:
- Training a model on the data in the flowers directory by using the GPU for 10 epochs.
python train.py flowers -e 10 --gpu
- Training a neural network on the GPU with a VGG13 architecture
python train.py flowers -a vgg13 --gpu
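To make the option table concrete, here is a minimal sketch of an `argparse` parser that accepts the same arguments and defaults as documented above. It is an illustration only: the actual parser in train.py may be written differently, and the function name `build_train_parser` is hypothetical.

```python
import argparse


def build_train_parser():
    """Build a parser mirroring the documented train.py options."""
    parser = argparse.ArgumentParser(prog="train.py")
    # Mandatory positional argument.
    parser.add_argument("data_dir", type=str,
                        help="directory with 'test', 'train' and 'valid' folders")
    # Optional arguments with the defaults listed in this README.
    parser.add_argument("-s", "--save_dir", type=str, default=".",
                        help="directory where the trained model is saved")
    parser.add_argument("-a", "--arch", type=str, default="vgg19",
                        help="architecture of the convolutional neural network")
    parser.add_argument("-l", "--learning_rate", type=float, default=0.001,
                        help="learning rate of the optimizer")
    parser.add_argument("--hidden_units", type=int, default=1000,
                        help="number of neurons per hidden layer")
    parser.add_argument("-e", "--epochs", type=int, default=3,
                        help="number of training epochs")
    # A flag: present means True, absent means False.
    parser.add_argument("--gpu", action="store_true",
                        help="use the GPU for training, if available")
    return parser


if __name__ == "__main__":
    args = build_train_parser().parse_args(["flowers", "-e", "10", "--gpu"])
    print(args)
```

Any option left off the command line falls back to its default, which is why `python train.py flowers` alone is enough to start training.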
The pretrained convolutional neural network architectures available in the Torchvision library are listed in the torchvision.models documentation. There are multiple versions of:
- VGG
- ResNet
- SqueezeNet
- DenseNet
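One way a script could turn an architecture name such as `vgg13` into a model constructor is to look the name up as an attribute of the models module. The sketch below shows that idea only; it is an assumption about how such a lookup might work, not a description of train.py, and `resolve_architecture` is a hypothetical helper.

```python
def resolve_architecture(models_module, arch):
    """Look up a model constructor such as `vgg13` by name on `models_module`."""
    # Names starting with an underscore are treated as private and rejected.
    constructor = None if arch.startswith("_") else getattr(models_module, arch, None)
    if not callable(constructor):
        raise ValueError(f"unknown architecture: {arch!r}")
    return constructor
```

With Torchvision installed, `resolve_architecture(torchvision.models, "vgg13")` would return the `vgg13` constructor, while a misspelled name raises a `ValueError` instead of failing later in training.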
We can then use our brand new model to identify whatever it was trained on, be it cats and dogs, articles of clothing, or types of flowers. To do that, run the following command in the shell:
python predict.py <command> [options]
The command has two mandatory arguments:
img_path               Path to the image                               [string]
checkpoint             Name of the model checkpoint to load            [string]
The options for prediction are:
-category_names        Mapping of categories to real names in JSON     [string]
--top_k                Top 'k' probable matches                        [int]    [default: 1]
--gpu                  Automatically selects gpu for predicting, if available
Examples:
- Using the saved model in 'checkpoint10' to predict whether 'animalpic' is a picture of a cat or a dog
python predict.py animalpic.jpg checkpoint10 --top_k 2 --gpu
- Same as above, but this outputs the names of the categories instead of their numbers, so we see 'Cat' and 'Dog' instead of '1' and '2'
python predict.py animalpic.jpg checkpoint10 -category_names cat_to_name.json --gpu
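The interaction between the top 'k' option and the category-name mapping can be illustrated without any deep learning library. The sketch below assumes the model produces one raw score per class and that a softmax turns those scores into probabilities; both are assumptions for illustration, and `top_k_labels` is a hypothetical helper rather than code from predict.py.

```python
import math


def top_k_labels(scores, k=1, category_names=None):
    """Return the k most probable (label, probability) pairs.

    `scores` maps a class id such as "1" to a raw model score.
    `category_names` optionally maps class ids to readable names.
    """
    # Softmax, shifted by the highest score for numerical stability.
    highest = max(scores.values())
    exps = {cid: math.exp(score - highest) for cid, score in scores.items()}
    total = sum(exps.values())
    # Sort by probability, most probable first, and keep the first k.
    ranked = sorted(exps.items(), key=lambda item: item[1], reverse=True)[:k]
    names = category_names or {}
    # Fall back to the class id when no readable name is available.
    return [(names.get(cid, cid), value / total) for cid, value in ranked]


if __name__ == "__main__":
    scores = {"1": 2.0, "2": 0.0}
    print(top_k_labels(scores, k=2))
    print(top_k_labels(scores, k=2, category_names={"1": "Cat", "2": "Dog"}))
```

Without a mapping, the labels are the folder numbers; with one, the same ranking is reported using the names from the JSON file.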