Tensorflow Speech Recognition

Speech recognition using Google's TensorFlow deep learning framework and sequence-to-sequence neural networks.

Replaces caffe-speech-recognition; see there for some background.

Update: Mozilla released DeepSpeech

They achieve good error rates. Free Speech is in good hands; go there if you are an end user. For now this project is only maintained for educational purposes.

Ultimate goal

Create a decent standalone speech recognizer for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (100GB here and 21GB here on openslr.org, synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.); we just need a simple yet powerful model. It's only a question of time...

Sample spectrogram: Karen uttering 'zero' at 160 words per minute.
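For readers new to spectrograms, here is a minimal sketch of how one can be computed from a raw waveform with a short-time Fourier transform. It is not taken from this repository: the function name `spectrogram` and the `frame_len` and `hop` defaults are assumptions chosen for illustration, and the input is a synthetic tone rather than a real recording.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Power spectrogram via a Hann-windowed short-time Fourier transform.

    Returns an array of shape (n_frames, frame_len // 2 + 1).
    frame_len and hop are illustrative defaults, not values from this project.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Keep the squared magnitude of the positive-frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Synthetic stand-in for a recording: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
```

Each row of `spec` is one time frame and each column one frequency bin, so plotting it with time on one axis and frequency on the other gives an image like the one above.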

Getting started

Toy examples: ./number_classifier_tflearn.py ./speaker_classifier_tflearn.py

Some less trivial architectures: ./densenet_layer.py

Later: ./train.sh ./record.py

Partners + collaborators wanted

We are in the process of tackling this project seriously. If you want to join the party, just drop us an email at [email protected].

Update: Nervana demonstrated that it is possible for 'independents' to build speech recognizers that are state of the art.
Update: Mozilla is working on DeepSpeech and just achieved 0% error rate ... on the training set ;) Free Speech is in good hands.

Fun tasks for newcomers

Watch the video: https://www.youtube.com/watch?v=u9FPqkuoEJ8
Understand and correct the corresponding code: lstm-tflearn.py
Data augmentation: create on-the-fly modulation of the data: increase the speech frequency, add background noise, alter the pitch, etc.
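As a starting point for the data augmentation task, the following is a minimal sketch of on-the-fly augmentation using only NumPy. It is not part of this repository: the function names, the speed range (0.9 to 1.1) and the noise range (10 to 30 dB SNR) are all assumptions chosen for illustration. Note that the naive resampling used here changes tempo and pitch together; altering pitch independently would need a more elaborate method.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Mix white Gaussian noise into `signal` at the given signal-to-noise ratio (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens the clip.

    Played back at the original sample rate, this raises tempo and pitch together.
    """
    n_out = int(round(len(signal) / rate))
    old_idx = np.arange(len(signal))
    new_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_idx, old_idx, signal)

def augment(signal, rng):
    """Apply a random speed change followed by random background noise."""
    rate = rng.uniform(0.9, 1.1)    # hypothetical range
    snr_db = rng.uniform(10, 30)    # hypothetical range
    return add_noise(change_speed(signal, rate), snr_db, rng)

def augmented_batches(signals, batch_size, seed=0):
    """Endless generator that yields freshly augmented batches on each call."""
    rng = np.random.default_rng(seed)
    while True:
        picks = rng.integers(0, len(signals), size=batch_size)
        yield [augment(signals[i], rng) for i in picks]

# Synthetic stand-in for a recording: one second of a 440 Hz tone at 16 kHz.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
batch = next(augmented_batches([tone], batch_size=4))
```

Because the generator draws new random parameters for every sample, the model never sees exactly the same waveform twice, and no augmented copies need to be stored on disk.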

Extensions

Extensions to the current TensorFlow that are probably needed:

WarpCTC on the GPU (see issue)
Incremental collaborative snapshots ('P2P learning')!
Modular graphs/models + persistence

Even though this project is far from finished, we hope it gives you some starting points.

Looking for a tensorflow collaboration / consultant / deep learning contractor? Reach out to [email protected]

