There are four models in this project: Deep Clustering Model, Hybrid Deep Clustering Model, U-net Model and UH-net Model. Models are trained on DSD100 dataset. The project is based on PyTorch.

MortadhaMannai/VOCAL-TRACK-EXTRACTION-USING-NEURAL-NETWORKS

Vocal Track Extraction

Author: Mortadha Manai

Report link: https://github.com/MortadhaMannai/VOCAL-TRACK-EXTRACTION-USING-NEURAL-NETWORKS/blob/main/Report.pdf

Paper links:

1- Zenodo: https://zenodo.org/record/8274725

2- OpenAIRE: https://explore.openaire.eu/search/publication?pid=10.5281%2Fzenodo.8267702&fbclid=IwAR13OfUARkpyVk1jzk2fFoqaxVeNz2xbDwNySsu8vCV0FxwslG0eI8hqx90

Introduction

This project implements four models for extracting the vocal track from music: the Deep Clustering Model, the Hybrid Deep Clustering Model, the U-net Model and the UH-net Model. All models are trained on the DSD100 dataset and implemented in PyTorch.

Scripts

  • Data preprocessing:

    • Build_Dataset.ipynb: generate the dataset from DSD100
    • config.py: define project-level parameters
    • data_loader.py: define the PyTorch data loader
    • mel_dealer.py: convert music files to mel spectrograms and back
  • Model definition:

    • unet_model.py: define the U-net Model and the UH-net Model
    • cluster_model.py: define the Deep Clustering Model
    • hybrid_model.py: define the Hybrid Deep Clustering Model
  • Model training:

    • utils.py: define loss functions
    • unet_train.py: training functions for the U-net / UH-net models
    • hd_train.py: training functions for the Hybrid Deep Clustering Model
    • dc_train.py: training functions for the Deep Clustering Model
    • train_dc.ipynb, train_hybrid.ipynb and train_unet.ipynb: notebooks that train the models
  • Model evaluation:

    • evaluation.py: define evaluation functions
    • music_decoder.py: reconstruct audio files from model outputs
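
The list above does not say which loss functions utils.py defines. As an illustration only, the sketch below computes the affinity loss commonly used in the deep clustering literature, ||VVᵀ − YYᵀ||² (squared Frobenius norm), where each row of V is the embedding of one time-frequency bin and each row of Y is that bin's one-hot source label. The function names are hypothetical and are not taken from this repository. The code uses plain Python lists to stay dependency-free, whereas the project itself would operate on PyTorch tensors.

```python
def matmul_t(a, b):
    """Return a^T b for row-major matrices a (N x P) and b (N x Q)."""
    p, q = len(a[0]), len(b[0])
    out = [[0.0] * q for _ in range(p)]
    for row_a, row_b in zip(a, b):
        for i in range(p):
            for j in range(q):
                out[i][j] += row_a[i] * row_b[j]
    return out


def frob_sq(m):
    """Squared Frobenius norm of a matrix given as a list of rows."""
    return sum(x * x for row in m for x in row)


def deep_clustering_loss(embeddings, labels):
    """Affinity loss ||V V^T - Y Y^T||_F^2 (hypothetical helper).

    Expanded as ||V^T V||^2 - 2 ||V^T Y||^2 + ||Y^T Y||^2 so that the
    N x N affinity matrices are never built explicitly.
    """
    vtv = frob_sq(matmul_t(embeddings, embeddings))
    vty = frob_sq(matmul_t(embeddings, labels))
    yty = frob_sq(matmul_t(labels, labels))
    return vtv - 2.0 * vty + yty


# Two bins that belong to different sources but share the same embedding.
V = [[1.0, 0.0], [1.0, 0.0]]
Y = [[1.0, 0.0], [0.0, 1.0]]
loss = deep_clustering_loss(V, Y)
```

The loss is zero when bins from the same source have identical embeddings and bins from different sources have orthogonal ones, which is what lets a clustering step separate the sources afterwards.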

Current Sample Outputs

Audios

Original Music (Vocal Track)
==> Hybrid Deep Clustering Model
==> U-net Model
==> UH-net Model

Masks

  • Masked Power Spectrograms:

  • Generated Masks:
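
A masked power spectrogram is obtained by multiplying the mixture's power spectrogram element-wise by a mask of the same shape. The sketch below shows that step, under the assumption that mask values lie in [0, 1]. The README does not state whether the models produce soft or binary masks, so both variants are shown. The function names and the threshold are hypothetical, and plain lists stand in for the PyTorch tensors the project would use.

```python
def apply_mask(mixture_power, mask):
    """Element-wise product of a power spectrogram and a same-shaped mask."""
    if len(mixture_power) != len(mask) or any(
        len(p_row) != len(m_row) for p_row, m_row in zip(mixture_power, mask)
    ):
        raise ValueError("spectrogram and mask must have the same shape")
    return [
        [p * m for p, m in zip(p_row, m_row)]
        for p_row, m_row in zip(mixture_power, mask)
    ]


def binarize(mask, threshold=0.5):
    """Turn a soft mask into a binary one (the threshold is an assumption)."""
    return [[1.0 if m >= threshold else 0.0 for m in row] for row in mask]


# Rows are frequency bins, columns are time frames.
mixture = [[4.0, 2.0], [1.0, 8.0]]
soft_mask = [[1.0, 0.0], [0.5, 0.25]]

vocal_soft = apply_mask(mixture, soft_mask)
vocal_hard = apply_mask(mixture, binarize(soft_mask))
```

A soft mask keeps a fraction of the energy in every bin, while a binary mask assigns each bin entirely to either the vocals or the accompaniment.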
