https://www.kaggle.com/c/state-farm-distracted-driver-detection
The predicted submission scored a log loss of 0.22855 on the private leaderboard and 0.23893 on the public leaderboard.
The following packages, specified in the requirements.txt file, need to be installed: h5py, keras, numpy, opencv-python, pandas, scikit-learn and tensorflow-gpu.
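For context on the metric, here is a minimal sketch of multi-class log loss in plain Python. The function name `multiclass_log_loss`, the clipping constant `eps`, and the row renormalisation are illustrative assumptions, not code from this repository or the exact Kaggle scorer.

```python
import math

def multiclass_log_loss(y_true, y_pred, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices, one per image.
    y_pred: list of per-class probability lists, one per image.
    eps:    assumed clipping constant that keeps log() finite.
    """
    total = 0.0
    for label, probs in zip(y_true, y_pred):
        # Clip each probability away from 0 and 1, then renormalise the row.
        clipped = [min(max(p, eps), 1 - eps) for p in probs]
        row_sum = sum(clipped)
        total += math.log(clipped[label] / row_sum)
    return -total / len(y_true)

# A uniform guess over the competition's 10 classes scores ln(10).
uniform = [[0.1] * 10, [0.1] * 10]
print(multiclass_log_loss([0, 3], uniform))
```

Lower is better: a model that always predicts a uniform distribution over the 10 classes sits near 2.30, so scores around 0.23 mean the true class usually receives high probability.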
- Create a folder named weights inside the project directory
- Create folders named cache and input one level above the project directory
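The folder setup above can be sketched in Python as follows. The helper name `create_project_folders` is hypothetical, and "one level above" is assumed to mean the parent of the project directory.

```python
from pathlib import Path

def create_project_folders(project_dir):
    """Create weights/ inside the project and cache/, input/ next to it."""
    project_dir = Path(project_dir)
    parent = project_dir.parent  # assumption: "one level above" is the parent folder
    folders = [
        project_dir / "weights",
        parent / "cache",
        parent / "input",
    ]
    for folder in folders:
        # exist_ok=True makes the call safe to repeat
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling `create_project_folders(".")` from the project root would create the three folders if they are missing and leave existing ones untouched.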
Download the competition data files from the Kaggle competition data page and place them in the data folder.
Download the pre-trained GloVe word vectors into the project directory.
- The repository contains 5 separate models, all hosted in the '__main__.py' file
- It is advised to run the code in pieces
- Each model has its own code file and can produce a submission independently
- The final submissions use the 'convolution_vgg16.py' file, which uses pretrained vgg_16 weights
- For reference, it took me about 0.5 days in total on my HP Spectre i5 to run each of the first 4 simple models
- The vgg_16 model was run on AWS SageMaker. I ended up using an ml.p3.2xlarge instance with GPU support
- The 5 models improve on one another step by step
- The 'convolution_quick' model uses 2 convolution layers and runs for 2 epochs, demonstrating what a simple model can achieve
- The 'convolution_cross_validation' model uses 10-fold cross-validation and a smaller image size
- The 'convolution_drivers' v1 and v2 models split the images for cross-validation by driver and use image transformations
- The final model improves on the previous approach by using a much more complex vgg_16 architecture, pretrained weights and larger image sizes
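Splitting by driver keeps every image of a given person on one side of the train/validation boundary, so validation scores reflect performance on unseen drivers. Below is a minimal sketch of that idea in plain Python; the function name `driver_folds` and the round-robin assignment are illustrative assumptions, not the logic used in the 'convolution_drivers' files.

```python
def driver_folds(drivers, n_folds):
    """Yield (train_idx, valid_idx) pairs with no driver on both sides.

    drivers: list with one driver id per image (e.g. the `subject`
             column of the competition's driver_imgs_list.csv).
    n_folds: number of folds; assumed not to exceed the number of drivers.
    """
    unique = sorted(set(drivers))
    # Assign each driver to a fold round-robin (an assumed, simple policy).
    fold_of = {driver: i % n_folds for i, driver in enumerate(unique)}
    for fold in range(n_folds):
        train = [i for i, d in enumerate(drivers) if fold_of[d] != fold]
        valid = [i for i, d in enumerate(drivers) if fold_of[d] == fold]
        yield train, valid

# Example with made-up image rows for four driver ids.
drivers = ["p002", "p002", "p012", "p014", "p014", "p015"]
for train_idx, valid_idx in driver_folds(drivers, n_folds=2):
    print(train_idx, valid_idx)
```

scikit-learn's `GroupKFold` offers the same guarantee and may be a better fit when fold sizes need to be balanced by image count rather than by driver count.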