This is an implementation of the paper Spatio-Temporal Transformer Network for Video Restoration.
The code was developed on Python 3 with PyTorch 1.3.0 and the PIL (Pillow) library. Please visit the PyTorch installation guide for installing PyTorch. To install Pillow, simply type pip3 install pillow in a terminal.
The code was trained on the Deep Video Deblurring dataset, which can be accessed from this link. Unzip it into a desired folder. Alternatively, you can place your own frame-extracted videos under
dataset/qualitative_datasets/[video_file_name]/input
as input and
dataset/qualitative_datasets/[video_file_name]/GT
as ground truth. This dataset structure can be used for both training and testing. You can extract a video into frames using ffmpeg with the following command:
ffmpeg -i file.mpg -r 1/1 $foldername/%04d.jpg
where $foldername is the desired folder for the extracted frames.
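To prepare several of your own videos at once, the single-file ffmpeg command above can be wrapped in a loop that builds the expected folder layout. This is a minimal sketch; the raw_videos source folder and the .mpg extension are assumptions, so adjust them to your files:

```shell
# Assumed source folder holding the original video files.
mkdir -p raw_videos

for f in raw_videos/*.mpg; do
  [ -e "$f" ] || continue   # skip when the glob matches nothing
  name=$(basename "$f" .mpg)
  out="dataset/qualitative_datasets/$name/input"
  mkdir -p "$out"
  # Same extraction rate as the command above (-r 1/1).
  ffmpeg -i "$f" -r 1/1 "$out/%04d.jpg"
done
```

Ground-truth frames go into a matching GT folder next to each input folder.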
For training, call the main_spatio.py file with the corresponding options:
usage: main_spatio.py [-h] [--batchSize BATCHSIZE] [--nEpochs NEPOCHS]
[--lr LR] [--step STEP] [--cuda] [--resume RESUME]
[--start-epoch START_EPOCH] [--threads THREADS]
[--momentum MOMENTUM] [--weight-decay WEIGHT_DECAY]
[--pretrained PRETRAINED] [--gpus GPUS]
[--dataset DATASET] [--model MODEL]
optional arguments:
--batchSize BATCHSIZE Training batch size
--nEpochs NEPOCHS Number of epochs to train for
--lr LR Learning Rate. Default=0.1
--step STEP Sets the learning rate to the initial LR decayed
every n epochs, Default: n=10
--cuda Use cuda?
--resume RESUME Path to checkpoint (default: none)
--start-epoch START_EPOCH
Manual epoch number (useful on restarts)
--threads THREADS Number of threads for data loader to use, Default: 1
--momentum MOMENTUM Momentum, Default: 0.9
--weight-decay WEIGHT_DECAY, --wd WEIGHT_DECAY
Weight decay, Default: 1e-4
--pretrained PRETRAINED
path to pretrained model (default: none)
--gpus GPUS gpu ids (default: 0)
--dataset DATASET the folder where the dataset with the structure
specified above can be found
--model MODEL the model to be trained. Default: the spatio-temporal
transformer, set by "spatio". Other options are "dvd" (Deep Video
Deblurring) and "vdsr" (Very Deep Super Resolution)
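The interaction of --lr and --step above can be sketched as a standard step-decay schedule. This is a sketch under assumptions: the decay factor of 0.1 is not stated in this README, so check the training code for the exact value.

```python
def adjust_learning_rate(initial_lr: float, epoch: int, step: int) -> float:
    """Step decay: multiply the initial LR by a decay factor
    every `step` epochs. The 0.1 factor is an assumption."""
    return initial_lr * (0.1 ** (epoch // step))

# With the defaults --lr 0.1 --step 10, epochs 0-9 train at 0.1,
# epochs 10-19 at 0.01, epochs 20-29 at 0.001, and so on.
```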
Example usage:
python main_spatio.py --cuda --batchSize 32 --lr 0.1 --dataset /path/to/training/data --model vdsr
Testing is done with the eval_loop.py file. It takes both input and ground-truth images, processes the input image with the selected network, and computes the PSNR between the model output and the ground-truth image, as well as between the ground-truth image and the input image.
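The PSNR comparison described above can be sketched as follows. This is a standalone NumPy version for 8-bit images, not the exact routine in eval_loop.py:

```python
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two images, in dB."""
    diff = reference.astype(np.float64) - distorted.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# A restoration model improves the input frame when
# psnr(gt, model_output) > psnr(gt, input_frame).
```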
usage: eval_loop.py [-h] [--cuda] [--model MODEL] [--dataset DATASET]
[--gpus GPUS]
optional arguments:
-h, --help show this help message and exit
--cuda use cuda?
--model MODEL model path
--dataset DATASET dataset name
--gpus GPUS gpu ids (default: 0)
Example usage:
python eval_loop.py --cuda --model /path/to/model/file --dataset /path/to/test/data