Repo for the implementation of a keyword-based ASR system
- create a virtual environment and install the requirements from requirements.txt
  NOTE: NeMo toolkit is not supported on Windows, so WSL or a UNIX-based OS is required; see the docs or GitHub
- minimal, necessary data is already in the repo, but to reproduce the training process and/or test other keywords you need to download the full datasets from this link (if the link doesn't work, please contact me via email: [email protected]) and put them in the Data directory (check the Data README for the structure)
- modify the config file to match your setup (all paths with the suffix DATA_DIR should be changed to match your setup)
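
Before running the notebooks, it can help to confirm that every DATA_DIR path has actually been updated. The following is a minimal sketch, assuming the config can be loaded into a plain Python dict; the key names used in the usage note (`TRAIN_DATA_DIR`, `TEST_DATA_DIR`) and the helper `find_missing_data_dirs` are hypothetical and not taken from this repo.

```python
from pathlib import Path


def find_missing_data_dirs(config: dict) -> list[str]:
    """Return the names of config entries ending in DATA_DIR whose
    value does not point to an existing directory.

    `config` is assumed to be a flat dict of setting name -> value;
    adapt the loading step to the repo's real config format.
    """
    return [
        name
        for name, value in config.items()
        # Only entries with the DATA_DIR suffix are treated as data paths.
        if name.endswith("DATA_DIR") and not Path(str(value)).is_dir()
    ]
```

For example, calling `find_missing_data_dirs({"TRAIN_DATA_DIR": "Data/train", "TEST_DATA_DIR": "Data/test"})` returns the names of the entries that still point to non-existent directories, so an empty list suggests the paths match your setup.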
- Data - directory with data
- Utils - directory containing utility scripts and config files
- Models - directory containing trained models' weights
- notebooks in root directory
  - main
    - demo.ipynb - notebook demonstrating the dual-model keyword-based speaker recognition system
    - speaker-recognition.ipynb - notebook with speaker recognition model training and evaluation
    - keyword-recognition.ipynb - notebook with keyword spotting model demonstration and evaluation
  - supplementary
    - get-data.ipynb - check audio files metadata
    - visualize-spectrograms.ipynb - visualize spectrograms of audio files
    - play-sound.ipynb - sanity-check audio files