Low-Latency-Keyword-Spotting-System

NLP Project for CS6120 at Northeastern University

Download the dataset from TensorFlow.

(You will need to log in to Google to be able to download it.)

After you have made this structure, begin by preprocessing the dataset.

Transfer Learning Using Inception-ResNet-v2
- To run this model, first run preprocess_inception.py to convert the .wav files to .png spectrogram images.
- Then run inception-resnetv2.py
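
As a rough illustration of the .wav-to-spectrogram step, the sketch below turns a waveform into an 8-bit log-magnitude spectrogram array using only NumPy. It is not the code in preprocess_inception.py: the function name `log_spectrogram_image`, the frame length, hop size, and FFT size are all assumptions chosen for a 16 kHz clip, and a synthetic tone stands in for a real recording.

```python
import numpy as np

def log_spectrogram_image(signal, frame_len=400, hop=160, n_fft=512):
    """Return a uint8 array (frequency bins x frames) of the log-magnitude spectrogram.

    All parameter defaults are illustrative assumptions, not values from the repo.
    """
    # Slice the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)

    # Magnitude spectrum of each frame, followed by log compression.
    mag = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    log_mag = np.log(mag + 1e-10)

    # Rescale to the 0..255 range so the array can be stored as a grayscale image.
    lo, hi = log_mag.min(), log_mag.max()
    scaled = (log_mag - lo) / (hi - lo + 1e-12)

    # Transpose so rows are frequency bins, then flip so low frequencies are at the bottom.
    return np.flipud((255 * scaled).T).astype(np.uint8)

# Synthetic stand-in for a one-second, 16 kHz recording (a 440 Hz tone).
sr = 16000
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 440 * t)
image = log_spectrogram_image(clip)
```

Writing `image` to a .png file would need an imaging library, which is left out here to keep the sketch dependency-light.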

Baseline CNN using MFCC Features
- To run this model, first run audio_feature_extraction.py to convert the .wav files to MFCC feature arrays, saved as NumPy arrays.
- Then run mfccModel.py
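
MFCCs are conventionally obtained by applying a discrete cosine transform (DCT-II) to log-mel filterbank energies and keeping the first few coefficients. The sketch below shows only that final step; it is not taken from audio_feature_extraction.py. The helper names `dct2_matrix` and `mfcc_from_log_mel`, the choice of 13 coefficients, and the random matrix standing in for real log-mel energies are assumptions.

```python
import numpy as np

def dct2_matrix(n_out, n_in):
    """First n_out rows of the orthonormal DCT-II basis for inputs of length n_in."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis *= np.sqrt(2.0 / n_in)
    basis[0] *= np.sqrt(0.5)  # extra scaling on the constant row for orthonormality
    return basis

def mfcc_from_log_mel(log_mel, n_mfcc=13):
    """Map log-mel energies of shape (frames, mel bands) to MFCCs of shape (frames, n_mfcc)."""
    return log_mel @ dct2_matrix(n_mfcc, log_mel.shape[1]).T

# Random stand-in for log-mel energies: 98 frames x 40 mel bands (assumed sizes).
rng = np.random.default_rng(0)
log_mel = rng.normal(size=(98, 40))
mfcc = mfcc_from_log_mel(log_mel)
```

Keeping only the low-order coefficients gives a compact, decorrelated summary of the spectral envelope, which is why the MFCC array is much smaller than the filterbank array it comes from.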

Depthwise Separable CNN using MFCC Features
- To run this model, first run audio_feature_extraction.py to convert the .wav files to MFCC feature arrays, saved as NumPy arrays.
- Then run mfccModel_dscnn.py
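
To give a sense of why a depthwise separable CNN suits a low-latency system, the sketch below compares weight counts for one standard convolution layer and its depthwise separable counterpart. The layer sizes (3x3 kernel, 64 input and 64 output channels) are hypothetical and are not read from mfccModel_dscnn.py; biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer sizes, for illustration only.
k, c_in, c_out = 3, 64, 64
std = standard_conv_params(k, c_in, c_out)
ds = depthwise_separable_params(k, c_in, c_out)
ratio = ds / std  # equals 1 / c_out + 1 / k**2
```

For these sizes the separable layer uses roughly an eighth of the weights, and the multiply-accumulate count per output position shrinks by the same factor.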

Baseline CNN using Log-Mel Filterbank Features
- To run this model, first run audio_feature_extraction_logmel.py to convert the .wav files to log-mel filterbank feature arrays, saved as NumPy arrays.
- Then run logmelModel.py
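
The following is a minimal NumPy sketch of log-mel filterbank extraction, included to make the feature concrete. It is not the code in audio_feature_extraction_logmel.py: the function names, the 40 mel bands, the 25 ms / 10 ms framing at 16 kHz, and the synthetic input tone are assumptions.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters with shape (n_mels, n_fft // 2 + 1)."""
    # Filter edges are evenly spaced on the mel scale between 0 Hz and Nyquist.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    freqs = np.linspace(0, sr / 2, n_fft // 2 + 1)
    fb = np.zeros((n_mels, len(freqs)))
    for m in range(n_mels):
        left, center, right = hz_points[m:m + 3]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_features(signal, sr=16000, frame_len=400, hop=160, n_fft=512, n_mels=40):
    """Return log-mel energies with shape (frames, n_mels)."""
    # Frame the signal and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum per frame, pooled into mel bands, then log-compressed.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    return np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)

# Synthetic stand-in for a one-second, 16 kHz recording.
sr = 16000
clip = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
features = log_mel_features(clip, sr=sr)
```

An array like `features` can be written to disk with `np.save` and read back with `np.load`.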

Depthwise Separable CNN using Log-Mel Filterbank Features
- To run this model, first run audio_feature_extraction_logmel.py to convert the .wav files to log-mel filterbank feature arrays, saved as NumPy arrays.
- Then run logmelModel_dscnn.py
