Two implementations of speaker-diarization task with SNN, SincNet and Wav2Vec.

MatiasDiBernardo/Speaker-Diarization-with-SNN

Speaker-Diarization-with-SNN

Final project for the Seminario en Aplicaciones de Redes Neuronales en la recuperación de información musical (Seminar on Applications of Neural Networks in Music Information Retrieval). The objective is to apply a Siamese Neural Network (SNN) architecture to the speaker diarization task. We use the LibriSpeech dataset for training and validation.
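To illustrate the idea behind a Siamese network, the sketch below shows a contrastive loss, one common training objective for such models: embeddings of two segments from the same speaker are pulled together, and embeddings from different speakers are pushed at least a margin apart. This README does not state which loss the project uses, so the function names, the margin value, and the choice of loss itself are assumptions made for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_speaker, margin=1.0):
    """Contrastive loss for one pair of embeddings (illustrative only).

    Same-speaker pairs are penalized by their squared distance.
    Different-speaker pairs are penalized only while they are closer
    than `margin` (a hypothetical hyperparameter).
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_speaker:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

In a real training loop the two embeddings would come from the same network applied to two audio segments, and the loss would be averaged over a batch of pairs.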

First Implementation

The first implementation is in Keras and uses the SincNet architecture to reduce the dimensionality of the convolutional front end and work directly on raw audio. With this approach we obtained a low training error, but the model did not generalize well and the validation error was high.
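As a rough illustration of why SincNet needs few parameters, the sketch below builds one band-pass filter from only two values, a low and a high cutoff frequency, using the difference of two windowed sinc low-pass filters. It is written in plain Python rather than Keras, and the function names, kernel size, and sample rate are assumptions; the project's actual layer may differ.

```python
import math


def sinc(x):
    """Unnormalized sinc: sin(x) / x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def sinc_bandpass(low_hz, high_hz, kernel_size=101, sample_rate=16000):
    """Return the taps of a Hamming-windowed sinc band-pass filter.

    In a SincNet-style layer, low_hz and high_hz would be the only
    learnable parameters of each filter. kernel_size and sample_rate
    are illustrative defaults, not values taken from this project.
    """
    f1 = low_hz / sample_rate   # normalized low cutoff (cycles/sample)
    f2 = high_hz / sample_rate  # normalized high cutoff (cycles/sample)
    half = (kernel_size - 1) // 2
    taps = []
    for i in range(kernel_size):
        n = i - half  # time index centered on zero
        # Ideal band-pass = wide low-pass minus narrow low-pass.
        ideal = (2 * f2 * sinc(2 * math.pi * f2 * n)
                 - 2 * f1 * sinc(2 * math.pi * f1 * n))
        # Hamming window smooths the truncation of the ideal filter.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (kernel_size - 1))
        taps.append(ideal * window)
    return taps
```

A standard convolutional layer of the same width would learn all 101 taps per filter, whereas this parameterization learns two numbers per filter.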

Second Implementation

The second implementation is in PyTorch and uses the Wav2Vec model to extract acoustic features from raw audio, then performs the analysis on the resulting low-dimensional vectors.
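One simple way to compare two audio segments once frame-level features are available is to average the frames of each segment into a single vector and measure the cosine similarity between the two. The sketch below shows that comparison in plain Python. It does not call Wav2Vec: the frame features are expected as plain lists of numbers, and the helper names and the threshold are hypothetical, not taken from this project, which feeds the features to a Siamese network instead.

```python
import math


def mean_pool(frames):
    """Average a list of frame vectors into one segment-level vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_speaker(frames_a, frames_b, threshold=0.8):
    """Guess whether two segments share a speaker (threshold is illustrative)."""
    score = cosine_similarity(mean_pool(frames_a), mean_pool(frames_b))
    return score >= threshold
```

A learned Siamese head replaces the fixed cosine threshold with a distance in an embedding space trained on labeled speaker pairs.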
