Repository for learning visual embeddings for a fashion dataset. Read our paper for all the details and results of our models.
There are two pipelines to run: training and evaluation. Make sure you have Python set up with the correct packages installed before running, by following the setup section below. The first time the pipeline runs it may take a little longer, as the dataset needs to be downloaded and rescaled, but both are stored locally inside data, ready for future pipeline runs.
To run the pipeline in its most basic form use:
$ python pipeline.py
There are many different options for training, such as the model type, where to save the trained models, and various neural network training hyperparameters (if training a neural network model type). You can display all of the available command line arguments for the pipeline with:
$ python pipeline.py --help
To train a neural network it is advised to use a GPU (the default batch sizes are for a 24GB VRAM card, so adjust accordingly to your hardware). The networks currently available are: simple_net, big_net and facenet. To train facenet for 100 epochs use:
$ python pipeline.py --models_list facenet --epochs 100 --batch_size 64 --learning_rate 0.0001 --save_dir data/files_to_gitignore/models
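If you are on a smaller GPU, scale the batch size down to fit memory; for example (16 here is an illustrative value for a smaller card, not a tested one):
$ python pipeline.py --models_list facenet --epochs 100 --batch_size 16 --learning_rate 0.0001 --save_dir data/files_to_gitignore/models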
The model weights will be saved inside the save_dir, along with training stats. If a model doesn't finish training, or you would like to train it for longer afterwards, increase the epochs and rerun. Any previous model weights found with identical training hyperparameters will be loaded automatically, and training will continue where it left off. So make sure you delete any previous models if you want to retrain from scratch!
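For example, to continue the facenet run above to 200 epochs, rerun with the same hyperparameters and a larger --epochs (the saved checkpoint should be picked up automatically):
$ python pipeline.py --models_list facenet --epochs 200 --batch_size 64 --learning_rate 0.0001 --save_dir data/files_to_gitignore/models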
To run the evaluation pipeline to display the closest embeddings use:
$ python evaluate_visually.py
You can specify a different model with the --model command line argument, for example:
$ python evaluate_visually.py --model simple_net
This evaluates the simple_net model with its latest training checkpoint.
By default all images from the eval set are shown. To show only cases where the closest embeddings contain a correct similar image, use:
$ python evaluate_visually.py --show_case pass
or only cases where the closest embeddings don't contain a similar image:
$ python evaluate_visually.py --show_case fail
By default only one image with its closest embeddings is shown; to increase this, use:
$ python evaluate_visually.py --num_disp 10
To save all eval examples to an image file at data/files_to_gitignore/eval_figs use:
$ python evaluate_visually.py --num_disp 100 --save_fig
By default the 5 closest embeddings are shown; to change this, use:
$ python evaluate_visually.py --num_neighbours 7
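These evaluation flags can be combined. For example, assuming they compose as described above, the following saves figures for up to 20 failure cases from simple_net with 7 neighbours each:
$ python evaluate_visually.py --model simple_net --show_case fail --num_disp 20 --save_fig --num_neighbours 7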
Images of all embeddings from the simple and simple_net models can be found in evaluation/eval_figs.
If you don't have the model's weights stored locally, then download them from the cloud using:
$ python evaluate_visually.py --model simple_net --download_weights
This downloads the default simple neural network model. Use the --weights_url argument to point to a different model's storage location on the cloud. Models must come in a zipped format.
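For example (the URL below is a hypothetical placeholder, not a real storage location):
$ python evaluate_visually.py --model facenet --download_weights --weights_url https://example.com/facenet_weights.zip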
Use the --checkpoint argument with the epoch number at which the weights were saved if you wish to evaluate earlier checkpoints.
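For example, to evaluate the simple_net weights saved at epoch 50 (an illustrative epoch number):
$ python evaluate_visually.py --model simple_net --checkpoint 50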
To set up a Python environment with the required packages:
$ conda create -n snap_vision python=3.8
$ conda activate snap_vision
$ pip install -r requirements.txt
Then install PyTorch. For a CPU-only machine:
$ conda install pytorch torchvision torchaudio cpuonly -c pytorch
Or, for a CUDA-capable GPU:
$ conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
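To check that the install worked and whether CUDA is visible to PyTorch:
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"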
The pipeline will automatically download and unzip the dataset, as well as downscale the image resolution. If you just want to download the dataset in isolation, follow the instructions inside data to download and unzip it automatically using Python.
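As a sketch, a standalone download might look like the following (download_dataset.py is a hypothetical script name; check the actual instructions inside data):
$ python data/download_dataset.py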
The miro board contains a visual representation of the pipeline and ideas/ongoing work.