Implementation of GMM-HMM for Speech Recognition Using the hmmlearn Python Package

The idea is to build a model that can recognize single words from short speech segments. The model is a GMM-HMM (a hidden Markov model with Gaussian mixture emissions).
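A common way to do single-word recognition with GMM-HMMs is to train one model per word and pick the word whose model gives the highest log-likelihood for the input features. The sketch below assumes that scheme; it is not necessarily how this repository's scripts are organized. `ToyModel`, `recognize` and the word labels are hypothetical names used only for illustration. `ToyModel` stands in for a fitted hmmlearn `GMMHMM`, whose `score` method returns a log-likelihood.

```python
class ToyModel:
    """Hypothetical stand-in for a fitted per-word model.

    Any object with a score(features) method that returns a
    "higher is better" number works here, e.g. hmmlearn's GMMHMM.
    """

    def __init__(self, mean):
        self.mean = mean

    def score(self, features):
        # Negative squared distance of every value to the mean.
        return -sum((x - self.mean) ** 2 for frame in features for x in frame)


def recognize(features, models):
    """Return the label of the model that scores the features highest."""
    if not models:
        raise ValueError("at least one model is required")
    return max(models, key=lambda word: models[word].score(features))


# Hypothetical labels and toy features (two frames, two values each).
models = {"apple": ToyModel(0.0), "banana": ToyModel(5.0)}
features = [[4.8, 5.1], [5.2, 4.9]]
print(recognize(features, models))  # banana
```

With real models, `features` would be the frame-by-frame feature matrix of one recording, and `models` would map each word to its trained model.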

This Medium article explains what and how.

Part of the code is from https://github.com/jayaram1125/Single-Word-Speech-Recognition-using-GMM-HMM-. I've refactored the code and added some more features:

  • added MFCC delta and delta-delta features to increase the accuracy of the model
  • added a script for recording audio to test your model(s)
  • trained a model on the data from the original repository, and also on a larger set of data from the Speech Commands Dataset
  • as an experiment, aligned the Speech Commands Dataset to try to gain higher accuracy
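To make the delta and delta-delta features from the list above concrete, here is a minimal sketch of the usual regression-based delta, d_t = Σ n·(c_{t+n} − c_{t−n}) / (2·Σ n²), with the first and last frames repeated at the edges. It is written with plain lists so it runs without dependencies; the function names and the window size `n=2` are assumptions, and the repository may compute its deltas with a library instead.

```python
def delta(frames, n=2):
    """Regression-based delta of a list of feature frames (list of lists)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    if not frames:
        return []
    denom = 2 * sum(i * i for i in range(1, n + 1))
    # Repeat the first and last frames so edge frames have neighbours.
    padded = [frames[0]] * n + list(frames) + [frames[-1]] * n
    out = []
    for t in range(len(frames)):
        centre = t + n
        row = []
        for k in range(len(frames[0])):
            acc = sum(
                i * (padded[centre + i][k] - padded[centre - i][k])
                for i in range(1, n + 1)
            )
            row.append(acc / denom)
        out.append(row)
    return out


def add_deltas(frames, n=2):
    """Append delta and delta-delta values to every frame."""
    d1 = delta(frames, n)
    d2 = delta(d1, n)  # delta of the delta
    return [f + a + b for f, a, b in zip(frames, d1, d2)]


# A linear ramp has slope 1 away from the (padded) edges.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
print(delta(ramp)[2])           # [1.0]
print(len(add_deltas(ramp)[0]))  # 3
```

With 13 MFCC coefficients per frame, `add_deltas` would give 39 values per frame.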

Accuracy information for my trained models is in the models/accuracies directory. The original models are not included because they are too big; only the example fruit names model is in the models [directory](https://github.com/RRisto/single_word_asr_gmm_hmm/tree/master/models). To use the models, see the example in predict_google.py. You can record your own voice with record_test_audio.py.

The scripts were tested on Windows 10 with Python 3.7.

Training Google Speech Commands Dataset model (original)

Another script uses data from the Google Speech Commands Dataset but keeps only a few categories for quicker training (it has no unknown-word or noise category).
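As a rough illustration of training on only a few categories, the sketch below collects the `.wav` files for a chosen set of words. It assumes the usual Speech Commands layout of one folder per word; `collect_wavs`, the dataset path and the word list are hypothetical and are not taken from this repository's scripts.

```python
from pathlib import Path


def collect_wavs(dataset_dir, words):
    """Map each requested word to the sorted .wav files in its folder.

    Words without a folder are skipped, and folders that were not
    requested (for example the background-noise folder) are ignored.
    """
    dataset_dir = Path(dataset_dir)
    files = {}
    for word in words:
        word_dir = dataset_dir / word
        if not word_dir.is_dir():
            continue
        files[word] = sorted(word_dir.glob("*.wav"))
    return files


# Hypothetical usage: path and categories are placeholders.
# subset = collect_wavs("data/speech_commands", ["yes", "no", "up", "down"])
```

Keeping the category list short reduces both the number of models to train and the amount of audio to process.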

Training very small fruit names dataset

This is the original data. It is good for debugging but not very useful for real-life speech recognition.

Aligning

This is just an experiment I made. The original alignment was very good, but aligning might still improve model performance.

If you want to align the data and use it for training:

Run docker

There is also a Docker image. To use it:

  • build image (run build_docker.bat)

  • run container (run run_docker.bat)

  • if you want to use Jupyter Notebook:

      - go inside the Docker container: docker exec -it single_word_gmmhmm_run /bin/bash
      - start the Jupyter Notebook server: jupyter notebook --ip=0.0.0.0 --port=8888 --allow-root
      - open http://127.0.0.1:7006/ in your browser
      - copy the notebook token shown in the terminal and paste it into the browser; you should then be inside Jupyter Notebook