
Notes for non-experts in DSP

Tim Sharii edited this page Oct 10, 2020 · 11 revisions

This chapter is based on the most common issues and questions raised by NWaves users.

Some usage and performance tips are not obvious, so feel free to ask questions!

1) How to set up my filter and get it working?

2) I want to get MFCCs like in Kaldi/HTK

3) I want to get MFCCs like in librosa

4) What's the most efficient way of processing data online?

5) MfccExtractor produces strange results


How to set up my filter and get it working?

First of all, you need to define your filter parameters. Regardless of the filter type, at least one cutoff frequency must be specified. Since DSP deals with discrete-time signals and systems, you need to pass a normalized frequency (a number in the range [0, 0.5], where 0.5 corresponds to half the sampling rate). If your frequency is given in Hz, divide it by the sampling rate of your signal:

int samplingRate = signal.SamplingRate;

double freq = 1000;             // 1000 Hz, for example
double f = freq / samplingRate; // normalize frequency onto [0, 0.5] range

int order = 5;
var filter = new Butterworth.HighPassFilter(f, order); // frequency first, then order

// another example - FIR lowpass filter:
order = 25;
var lpf = DesignFilter.FirWinLp(order, f);

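Once the filter is created, apply it to your data. Before reusing the same filter object on another signal, reset its internal state with filter.Reset().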

// preferred way for offline filtering:
var filtered = filter.ApplyTo(signal);

// online filtering:
// foreach (sample)
//    filteredSample = filter.Process(sample)
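The normalization rule can be checked independently of NWaves. Below is a minimal sketch in Python (the language of the librosa example later on this page). The helper name `normalize_frequency` is an assumption made for illustration; it is not an NWaves function and only shows the arithmetic and the upper limit of 0.5:

```python
def normalize_frequency(freq_hz: float, sampling_rate: int) -> float:
    """Convert a frequency in Hz to a normalized frequency in [0, 0.5].

    Hypothetical helper for illustration; not part of NWaves.
    """
    if sampling_rate <= 0:
        raise ValueError("sampling rate must be positive")
    f = freq_hz / sampling_rate
    if not 0.0 <= f <= 0.5:
        # 0.5 corresponds to half the sampling rate (the Nyquist frequency)
        raise ValueError(f"{freq_hz} Hz is outside [0, {sampling_rate / 2}] Hz")
    return f


# 1000 Hz at a 16 kHz sampling rate
f = normalize_frequency(1000, 16000)
print(f)  # 0.0625
```

A cutoff above half the sampling rate cannot be represented in a discrete-time signal, so rejecting it early is usually easier to debug than a filter that silently misbehaves.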

Second-Order Sections (SOS)

https://colab.research.google.com/drive/1A_y7sTt3qJoQMyhSv-tOT_pv6-pk4a8d?usp=sharing

Single precision vs. double precision

While coding filtering operations I was thinking mostly about audio processing, so the default versions of functions operate on single-precision data, which is sufficient in most audio cases. However, NWaves was intended to be a universal DSP lib, so it also contains classes for filtering with double precision.

var tf = new Filters.Butterworth.BandPassFilter(4f / 250, 8f / 250, 5).Tf;

// it's ok: TransferFunction has always been with double precision, so use it here:
var filter = new Filters.Base64.IirFilter64(tf);

// offline filtering:

// now the filter carries out all its operations with double precision:
var filtered = signal.Samples.Select(s => filter.Process(s));

// filter.Process() accepts one sample of type 'double' and returns 'double'
// so we have IEnumerable<double> and we can LINQ-materialize it whenever we want



// online filtering:

// while (double_sample)
//     filteredSample = filter.Process(double_sample)
//     you can downcast filteredSample to float if you need
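To get a feeling for why precision can matter outside audio cases, here is a small sketch in Python that uses only the standard library. It is not NWaves code: `to_float32` and `running_sum` are hypothetical helpers, and the running sum stands in for the simplest recursive filter (an integrator), where every rounding error feeds back into the next output.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def running_sum(sample: float, count: int, single_precision: bool) -> float:
    """Accumulate `sample` `count` times, optionally rounding each step to float32."""
    acc = 0.0
    x = to_float32(sample) if single_precision else sample
    for _ in range(count):
        acc += x
        if single_precision:
            acc = to_float32(acc)  # emulate a float accumulator
    return acc


n = 100_000
exact = 10_000.0  # the mathematically exact value of 0.1 * n
err32 = abs(running_sum(0.1, n, single_precision=True) - exact)
err64 = abs(running_sum(0.1, n, single_precision=False) - exact)
print(f"float32 error: {err32:.6f}, float64 error: {err64:.2e}")
```

The single-precision result drifts by a clearly visible amount, while the double-precision error stays many orders of magnitude smaller. Filters with feedback can accumulate rounding errors in a similar way, so this is the kind of situation where the double-precision classes may be worth trying.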

I want to get MFCCs like in Kaldi/HTK

TODO

I want to get MFCCs like in librosa

There are a couple of important nuances in librosa:

  • htk = true or false

This parameter essentially defines the weights of the mel filterbank (HTK-style or Slaney-style).

  • centering

In NWaves, like in many other frameworks, frames are not centered the way they are in librosa (in fact, I don't quite understand its purpose...), so this parameter must be set to False.
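One part of the HTK vs. Slaney difference is the Hz-to-mel mapping itself. The sketch below writes out the two commonly used formulas in plain Python. The function names are assumptions made for illustration, and the code covers only the frequency scale, not how either library builds or normalizes the triangular filters:

```python
import math


def hz_to_mel_htk(freq_hz: float) -> float:
    """HTK-style mel scale: logarithmic over the whole range."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def hz_to_mel_slaney(freq_hz: float) -> float:
    """Slaney-style mel scale: linear below 1 kHz, logarithmic above."""
    f_sp = 200.0 / 3.0                 # Hz per mel in the linear region
    min_log_hz = 1000.0                # start of the logarithmic region
    min_log_mel = min_log_hz / f_sp    # 15 mels at 1 kHz
    logstep = math.log(6.4) / 27.0     # step size in the logarithmic region
    if freq_hz < min_log_hz:
        return freq_hz / f_sp
    return min_log_mel + math.log(freq_hz / min_log_hz) / logstep


for hz in (100.0, 1000.0, 8000.0):
    print(hz, round(hz_to_mel_htk(hz), 2), round(hz_to_mel_slaney(hz), 2))
```

Because the two scales place band edges at different frequencies, filterbanks built on them differ, and so do the resulting MFCCs. This is why the `htk` setting has to match on both sides of a comparison.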

Let's just consider an example. Say we have the following setup in librosa:

mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                             dct_type=2, norm='ortho', window='hamming',
                             htk=False, n_mels=40, fmin=100, fmax=8000,
                             n_fft=1024, hop_length=int(0.010*sr), center=False)

In NWaves it is equivalent to:

int sr = 22050;                  // sampling rate
int fftSize = 1024;
double lowFreq = 100;     // if not specified, will be 0
double highFreq = 8000; // if not specified, will be samplingRate / 2
int filterbankSize = 40;     // or 24 for htk=true (usually)

// if the 'htk' parameter in librosa is set to False:
var melBank1 = FilterBanks.MelBankSlaney(filterbankSize, fftSize, sr, lowFreq, highFreq);

// if the 'htk' parameter in librosa is set to True:
var melBands = FilterBanks.MelBands(filterbankSize, sr, lowFreq, highFreq);
var melBank2 = FilterBanks.Triangular(fftSize, sr, melBands, null, Scale.HerzToMel);

var opts = new MfccOptions
{
    SamplingRate = sr,
    FrameDuration = (double)fftSize / sr,
    HopDuration = 0.010,
    FeatureCount = 13,      // matches n_mfcc=13 in the librosa call
    Filterbank = melBank1,  // or melBank2
    NonLinearity = NonLinearityType.ToDecibel, // mandatory
    Window = WindowTypes.Hamming,     // in librosa 'hann' is by default
    LogFloor = 1e-10f,  // mandatory
    DctType = "2N",
    LifterSize = 0
};
var e = new MfccExtractor(opts);
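When comparing outputs, it helps to check the framing arithmetic first. The sketch below reproduces, in plain Python, the frame duration, hop length and frame count implied by the librosa call above with center=False. The function `frame_params` is a hypothetical helper. How NWaves rounds HopDuration to samples is not covered here, so treat the numbers as the librosa side of the comparison only:

```python
def frame_params(sr: int, n_fft: int, hop_seconds: float, n_samples: int):
    """Return (frame duration in s, hop length in samples, frame count) without centering."""
    frame_duration = n_fft / sr          # the value passed as FrameDuration
    hop_length = int(hop_seconds * sr)   # same expression as in the librosa call
    if n_samples < n_fft:
        n_frames = 0                     # not enough samples for a single frame
    else:
        n_frames = 1 + (n_samples - n_fft) // hop_length
    return frame_duration, hop_length, n_frames


# one second of audio at 22050 Hz
duration, hop, frames = frame_params(sr=22050, n_fft=1024, hop_seconds=0.010, n_samples=22050)
print(f"{duration * 1000:.2f} ms per frame, hop {hop} samples, {frames} frames")
```

Note that int(0.010 * 22050) truncates 220.5 to 220 samples, which is slightly less than 10 ms. If the number of frames differs between the two libraries, the hop rounding is a good first thing to check.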

And there are even more options - read here

What's the most efficient way of processing data online?

TODO

(FilterChain, Block convolvers)

MfccExtractor produces strange results

If you're getting a row of NaNs at random frames, check that the MfccExtractor instance is not shared by several threads (it's not thread-safe). If you need parallelization and better performance, call the extractor.ParallelComputeFrom() method. If there's a crucial need for your own parallelization scheme, create a separate extractor for each task.
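The "separate extractor for each task" pattern can be sketched in a language-neutral way. The Python example below uses a hypothetical `ToyExtractor` class as a stand-in for any stateful extractor. It is not the NWaves API and assumes only that the extractor keeps internal scratch memory between calls, which is what makes sharing one instance across threads unsafe:

```python
from concurrent.futures import ThreadPoolExecutor


class ToyExtractor:
    """Hypothetical stand-in for a stateful feature extractor (not NWaves)."""

    def __init__(self, frame_size: int):
        self._buffer = [0.0] * frame_size   # per-instance scratch memory

    def compute(self, frame):
        # intermediate results go through the internal buffer
        for i, sample in enumerate(frame):
            self._buffer[i] = sample * sample
        return sum(self._buffer)            # frame energy


def process_chunk(frames):
    extractor = ToyExtractor(frame_size=4)  # one extractor per task, never shared
    return [extractor.compute(frame) for frame in frames]


chunks = [
    [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 1.0]],
    [[2.0, 2.0, 2.0, 2.0]],
]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(process_chunk, chunks))
print(results)  # [[30.0, 1.0], [16.0]]
```

Each task owns its scratch buffer, so no two threads can overwrite each other's intermediate values. The cost is one extra extractor per task, which is usually small compared to the work done per chunk.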
