This python project implements an activity recognition system trained on the UCF101 data set.
Some highlights include:
*A full pipeline from Dense Trajectory raw features to Fisher Vectors *Raw features are piped to the Fisher Vector generator, eliminating the need to store the large raw features *State-of-the-art performance for large action/event video datasets *Rudimentary baseline experiment implemented in baseline.py
Tunable parameters: *Number of modes used to construct Gaussian mixture models *Option to reduce dimensions of the 5 IDTF descriptors using PCA *Choice of classification model
The project depends on: *Improved Dense Trajectories: http://lear.inrialpes.fr/people/wang/improved_trajectories *Yael library: to compute the Fisher Vectors and GMMs. http://yael.gforge.inria.fr *scikit-learn: for classification models. http://scikit-learn.org/stable/ *UCF101 dataset: http://crcv.ucf.edu/data/UCF101.php
execute_script.sh illustrates the list of commands required to execute the pipeline.
Description of files:
-
computeIDTF.py *Resize the videos using ffmpeg.py file to 320x240 *Execute the IDTF binary, which is hard-coded in the file. *The IDTF binary can be downloaded at http://lear.inrialpes.fr/people/wang/improved_trajectories *saves temporary resized files in ./tmp directory
-
ffmpeg.py Python wrapper for basic ffmpeg operations. Used to resize the videos.
-
gmm.py *Computes the gmms.
Calls the computeIDTF module to generate the .feature video files that will be used to create GMMs These .feature files will be stored in the ./GMM_IDTFs directory. Each .feature file is on the order of 200MB.
Computes GMM for each of the 5 IDTF descriptor types (trajs, hogs, hofs, mbhxs, mbhys). Saves each of the GMMs in the specified file using python temporaryFile
-
IDT_feature.py *Handles a improved trajectory feature point The .feature files contain IDT features for each video. A single IDTF contains a few different pieces of information as described in http://lear.inrialpes.fr/people/wang/improved_trajectories
Most importantly, each IDTF contains each of the following descriptors.
- trajectory
- hog
- hof
- mbhx
- mbhy
-
computeFVs.py *extract IDTFs and compute the Fisher Vectors (FVs) for each of the videos in the input list (vid_in). The training and testing file splits are included in text files The Fisher Vectors are output in the ./UCF101_Fishers directory.
-
ThreadPool.py Python module I used to parallelize the computeFVs script.
-
computeFVstream Recieves stdin stream of IDTFs generated by the IDTF binary. Converts an IDTF from a video into a fisher vector. Prereq: GMM must have already been computed.
-
computeFV.py *Encodes a fisher vector. Requires IDTFs as input
-
compute_UC101_class_index.py Builds a dictionary of UCF101 class names to class index.
-
classify_library.py Custom library of methods useful when optimizing and working with the classifiers.
-
classify_experiment.py Script used to experiment with different settings for hyperparameters for SVM classifiers.
-
train_classify.py Python script to optimize the choice of hyperparameters for SVM classifiers. Outputs the results in the GridSearch_ouput file.
-
baseline.py Implements the baseline experiment. The baseline result involves extracting the first frame from each video and using this single image as the content to perform classification. The first frame for every video is resized to 240 x 320. PCA is performed on each image in order to reduce the dimensions. A multiclass linear SVM is used to classify the videos.