A project that uses Convolutional Neural Networks and feature extraction techniques on audio data to build a music recommendation system and a genre classifier, with the goal of providing a personalized and seamless listening experience.
Loading Audio Files
Utilize the librosa library to load audio files.
import librosa
y, sr = librosa.load('path_to_audio_file.wav')
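y is the audio time series as a NumPy array, and sr is the sampling rate. By default, librosa.load resamples to 22,050 Hz and mixes down to mono; pass sr=None to keep the file's native sampling rate.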
Fourier Transform (Feature Extraction)
Apply the short-time Fourier transform (STFT) to convert the time-domain audio signal into a time-frequency representation.
D = librosa.stft(y)
D is the complex-valued short-time Fourier transform of the audio signal, with one row per frequency bin and one column per time frame.
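To make the transform more concrete, here is a minimal NumPy sketch of what a short-time Fourier transform computes: the signal is cut into overlapping frames, each frame is multiplied by a Hann window, and a real FFT is taken per frame. It is an illustration under simplifying assumptions, not librosa's implementation: `stft_sketch` is a hypothetical helper, it does not center or pad the frames the way `librosa.stft` does by default, and the synthetic 440 Hz tone stands in for a real recording. The frame length of 2048 and hop of 512 samples mirror librosa's defaults.

```python
import numpy as np

def stft_sketch(y, n_fft=2048, hop_length=512):
    """Naive STFT (hypothetical helper): frame, window, then FFT each frame."""
    window = np.hanning(n_fft)  # symmetric Hann window, close to librosa's default
    n_frames = 1 + (len(y) - n_fft) // hop_length
    frames = np.stack([
        y[i * hop_length : i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # Transpose so rows are frequency bins and columns are time frames.
    return np.fft.rfft(frames, axis=1).T

sr = 22050                          # assumed sampling rate
t = np.arange(sr) / sr              # one second of samples
y = np.sin(2 * np.pi * 440.0 * t)   # synthetic 440 Hz tone (stand-in for real audio)

D = stft_sketch(y)
magnitude = np.abs(D)

# The strongest frequency bin should sit close to 440 Hz.
peak_bin = int(magnitude.mean(axis=1).argmax())
peak_hz = peak_bin * sr / 2048
```

Taking `np.abs(D)` gives the magnitude spectrogram, a 2D time-frequency image of the kind that convolutional layers can operate on.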
Mel-frequency Cepstral Coefficients (MFCC) Extraction
Extract MFCC features which are critical for audio analysis and classification.
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
mfccs is a matrix of shape (n_mfcc, number of frames) containing the MFCC features.
Feature Extraction
Essential for converting raw audio data into a format that is suitable as input to a neural network.
Helps capture the relevant patterns and characteristics of the audio signal.
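One practical detail when feeding extracted features to a CNN is that clips of different lengths produce feature matrices with different numbers of frames, while the network expects a fixed input shape. The sketch below shows one possible way to handle this by zero-padding or truncating along the time axis and adding a channel axis. The helper name `to_cnn_input`, the target of 130 frames, and the random arrays standing in for real MFCC matrices are all assumptions made for illustration.

```python
import numpy as np

def to_cnn_input(features, n_frames=130):
    """Pad with zeros or truncate along the time axis, then add a channel axis.

    Hypothetical helper: `features` has shape (n_coefficients, time_frames).
    """
    time_frames = features.shape[1]
    if time_frames < n_frames:
        # Zero-pad only at the end of the time axis.
        features = np.pad(features, ((0, 0), (0, n_frames - time_frames)))
    else:
        features = features[:, :n_frames]
    # Conv2D layers expect (height, width, channels); use a single channel.
    return features[..., np.newaxis]

rng = np.random.default_rng(0)
short_clip = rng.normal(size=(13, 100))  # stand-in for MFCCs of a short clip
long_clip = rng.normal(size=(13, 200))   # stand-in for MFCCs of a long clip

x_short = to_cnn_input(short_clip)
x_long = to_cnn_input(long_clip)
```

Both results have shape `(13, 130, 1)`, so they can be stacked into one batch and the same tuple can serve as the model's input shape.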
Convolutional Layers
Use multiple Conv2D layers to capture spatial features from the audio spectrograms.
Parameters include the number of filters, filter sizes, and strides.
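from tensorflow.keras.layers import Conv2D
# Assumes model is a tensorflow.keras Sequential instance and input_shape is (height, width, channels).
model.add(Conv2D(filters=32, kernel_size=(3, 3), strides=(1, 1), activation='relu', input_shape=input_shape))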
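The effect of the filter size and stride on the feature map can be worked out by hand. The sketch below applies the standard output-size formulas for Keras-style `'valid'` padding (the Conv2D default) and `'same'` padding. The helper `conv2d_output_size` and the 128 x 128 input size are assumptions chosen for illustration; the project's actual spectrogram dimensions are not stated here.

```python
import math

def conv2d_output_size(in_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial axis of a 2D convolution (hypothetical helper)."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (in_size - kernel_size) // stride + 1
    if padding == "same":
        # Input is padded so that only the stride shrinks the output.
        return math.ceil(in_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")

height, width = 128, 128  # assumed spectrogram size
filters = 32

# 3x3 kernel, stride 1, default 'valid' padding.
output_shape = (
    conv2d_output_size(height, 3, 1),
    conv2d_output_size(width, 3, 1),
    filters,
)

# A stride of 2 roughly halves each spatial dimension.
strided_valid = conv2d_output_size(height, 3, 2)
strided_same = conv2d_output_size(height, 3, 2, padding="same")
```

With these assumed sizes, the layer turns a 128 x 128 input into a 126 x 126 map per filter, and the number of filters becomes the channel dimension of the output.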
Activation Functions
Primarily use ReLU (Rectified Linear Unit) for non-linearity.
from tensorflow.keras.layers import Activation
model.add(Activation('relu'))
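For reference, ReLU itself is a very small operation: it passes positive values through unchanged and replaces negative values with zero. A minimal NumPy sketch, with a made-up input vector:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

activations = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])  # illustrative values
rectified = relu(activations)  # negatives become 0, positives are unchanged
```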
Pooling Layers
Incorporate MaxPooling2D layers to downsample the feature maps.
from tensorflow.keras.layers import MaxPooling2D
model.add(MaxPooling2D(pool_size=(2, 2)))
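To illustrate what 2 x 2 max pooling does to a single feature map, here is a small NumPy sketch. The helper `max_pool_2x2` is hypothetical and handles one 2D map only; it mimics the non-overlapping windows that result when the stride equals the pool size, dropping a trailing odd row or column.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 window (hypothetical helper)."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Crop any odd remainder, split each axis into (blocks, 2), reduce over the 2s.
    cropped = feature_map[: h2 * 2, : w2 * 2]
    return cropped.reshape(h2, 2, w2, 2).max(axis=(1, 3))

feature_map = np.arange(16).reshape(4, 4)  # toy 4x4 map with values 0..15
pooled = max_pool_2x2(feature_map)
```

Each spatial dimension is halved, so later layers process a quarter of the values while the strongest responses are kept.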
Dropout Layers
Use Dropout layers to prevent overfitting by randomly setting a fraction of input units to 0.
from tensorflow.keras.layers import Dropout
model.add(Dropout(rate=0.5))
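The behavior described above can be sketched in a few lines of NumPy. This is a simplified illustration of "inverted" dropout, where the units that survive are scaled by 1 / (1 - rate) during training so the expected sum stays the same, and nothing is dropped at inference time. The helper `dropout_sketch` is hypothetical and not the Keras implementation.

```python
import numpy as np

def dropout_sketch(x, rate, rng, training=True):
    """Inverted dropout (hypothetical helper): zero a fraction of units, rescale the rest."""
    if not training or rate == 0.0:
        return x  # dropout is inactive at inference time
    keep_mask = rng.random(x.shape) >= rate  # True for units that survive
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones(10_000)  # illustrative activations

train_out = dropout_sketch(x, rate=0.5, rng=rng)
eval_out = dropout_sketch(x, rate=0.5, rng=rng, training=False)
```

With a rate of 0.5, roughly half the values in `train_out` are 0 and the rest are 2, so the mean stays close to 1.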
Batch Normalization
Apply BatchNormalization to normalize the output of previous layers, accelerating training and improving performance.
from tensorflow.keras.layers import BatchNormalization
model.add(BatchNormalization())
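As a rough illustration of the normalization step, the sketch below standardizes each feature using the statistics of the current batch and then applies a learnable scale (gamma) and shift (beta). It is a simplified training-mode view on a 2D (batch, features) array; the helper `batch_norm_sketch` is hypothetical, and the moving averages a real layer keeps for inference are left out.

```python
import numpy as np

def batch_norm_sketch(x, gamma, beta, eps=1e-3):
    """Training-mode batch normalization over the batch axis (hypothetical helper)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, roughly unit variance
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # illustrative activations

gamma = np.ones(4)   # scale, initialized to 1
beta = np.zeros(4)   # shift, initialized to 0
normalized = batch_norm_sketch(x, gamma, beta)
```

After this step each feature has a mean near 0 and a standard deviation near 1 regardless of the scale of the incoming activations.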
Flatten Layer
Use Flatten to convert the multi-dimensional feature maps into a 1D vector before passing them to the fully connected layers.
from tensorflow.keras.layers import Flatten
model.add(Flatten())
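In terms of shapes, flattening keeps the batch dimension and collapses everything else into one axis. A minimal NumPy sketch, assuming a batch of 8 feature-map stacks of size 6 x 6 x 64 (sizes chosen only for illustration):

```python
import numpy as np

# Assumed output of the last pooling layer: (batch, height, width, channels).
feature_maps = np.zeros((8, 6, 6, 64))

# Keep the batch axis, merge height, width and channels into one vector per sample.
flat = feature_maps.reshape(feature_maps.shape[0], -1)
vector_length = flat.shape[1]  # 6 * 6 * 64
```

The resulting vector length is the number of inputs the first fully connected layer receives.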
Parameter Selection
Careful selection of parameters like the number of filters, kernel sizes, and dropout rates to balance performance and complexity.
Conv2D(filters=32, kernel_size=(3, 3), strides=(1, 1))
Dropout(rate=0.5)
Ensures that the model is both deep enough to capture complex patterns and regularized to prevent overfitting.
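One way to reason about the trade-off between capacity and complexity is to count trainable parameters. A Conv2D layer has one weight per kernel position and input channel, plus one bias, for every filter. The sketch below uses that standard formula; the helper `conv2d_param_count` and the listed configurations are assumptions for comparison, not the project's tuned values.

```python
def conv2d_param_count(in_channels, filters, kernel_size=(3, 3)):
    """Trainable parameters of a Conv2D layer with bias (hypothetical helper)."""
    kernel_h, kernel_w = kernel_size
    weights_per_filter = kernel_h * kernel_w * in_channels
    return (weights_per_filter + 1) * filters  # +1 for each filter's bias

# First layer on a single-channel spectrogram, with different choices.
base = conv2d_param_count(in_channels=1, filters=32, kernel_size=(3, 3))
more_filters = conv2d_param_count(in_channels=1, filters=64, kernel_size=(3, 3))
larger_kernel = conv2d_param_count(in_channels=1, filters=32, kernel_size=(5, 5))

# A second layer sees 32 input channels, so its cost grows much faster.
second_layer = conv2d_param_count(in_channels=32, filters=64, kernel_size=(3, 3))
```

Doubling the filters doubles the first layer's parameters (320 to 640), a 5 x 5 kernel costs 832, and a second 64-filter layer on top of 32 channels already needs 18,496, which is why deeper stacks usually call for regularization such as dropout.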
Made by:
Pranjal Vanjale
Siddhant Gupta
Vidhi Gupta
Yashovardhan Pandey
(Students of IIT Roorkee, B.Tech. First Year)
Mentored By: Shreshth Mehrotra
Ashwarya Rao Maratha