Skip to content

Notes for non~experts in DSP

Tim Sharii edited this page Sep 12, 2021 · 11 revisions

This chapter is based on the most common questions asked by NWaves users.

There are some not obvious usage and performance tips, so feel free to ask questions!

1) How to setup my filter and get it working?

2) I want to get MFCCs like in Kaldi/HTK

3) I want to get MFCCs like in librosa

4) MfccExtractor produces strange results

5) Tips for efficient online data processing


How to setup my filter and get it working?

First of all, you need to define your filter parameters. Regardless of the filter type, at least one cutoff frequency must be specified. Since DSP deals with discrete-time signals and systems, you need to pass normalized frequency (the number within [0, 0.5] range). If you have frequency value in Hz then you need to divide it by the sampling frequency of your signal:

int samplingRate = signal.SamplingRate;

double freq = 1000;             // 1000 Hz, for example
double f = freq / samplingRate; // normalize frequency onto [0, 0.5] range

int order = 5;
var filter = new Butterworth.HighPassFilter(order, f);

// another example - FIR lowpass filter:
order = 25;
var lpf = DesignFilter.FirWinLp(order, f);

Now the filter is ready to be applied to input data.

Offline processing means that filter processes entire signal / array and produces new signal as output.

Online processing means that filter processes data sample after sample (or chunk after chunk).

Filters have state, so call filter.Reset() if you need to start filtering new data "from scratch".

// preferred way for offline filtering:
var filtered = filter.ApplyTo(signal);


// online filtering:

foreach (var sample in onlineSamples)
    filteredSample = filter.Process(sample)

// or new chunk is available:
//
void ProcessData(float[] data)
{
    filter.Process(data, output);
    //filter.Process(data, data);    // or in-place
}

Read more about online filtering

Second-Order Sections (SOS)

With IIR filters it's often preferable to work with their SOS decompositions.

Read more here:

https://colab.research.google.com/drive/1A_y7sTt3qJoQMyhSv-tOT_pv6-pk4a8d?usp=sharing

Single precision vs. double precision

In most cases single precision is sufficient for filtering operations, so NWaves classes/functions operate on data with single precision. However, alternative classes for filtering with double precision are also available (we're talking here about filtering operation itself; filter design and analysis is always done with double precision). They can be found in Filters.Base64 namespace (+ OlsBlockConvolver64 and OlaBlockConvolver64):

var tf = new Filters.Butterworth.BandPassFilter(4f/250, 8f/ 250, 5).Tf;

// it's ok: TransferFunction has always been with double precision, so use it here:
var filter = new Filters.Base64.IirFilter64(tf);

// offline filtering:

// now the filter carries out all its operations with double precision:
var filtered = signal.Samples.Select(s => filter.Process(s));

// filter.Process() accepts one sample of type 'double' and returns 'double'
// so we have Enumerable<double> and we can LINQ-materialize it whenever we want



// online filtering:

// while (double_sample)
//     filteredSample = filter.Process(double_sample)
//     we can downcast filteredSample to float if we need

I want to get MFCC like in Kaldi/HTK

Use MfccExtractor class with MfccHtkOptions:

var samplingRate = 16000;//Hz
var featureCount = 13;
var lowFreq = 200;//Hz
var highFreq = 6800;//Hz
var frameDuration = 0.025;//seconds

var opts = new MfccHtkOptions(samplingRate, featureCount, frameDuration, lowFreq, highFreq)
{
    HopDuration = 0.010,//seconds
};

var extractor = new MfccExtractor(opts);
var mfccs = extractor.ParallelComputeFrom(signal);

Keep in mind that HTK does the following pre-processing: zero-mean and pre-emphasis. Turn these settings off in HTK config if possible, because HTK does this pre-processing per frame instead of entire signal (which is weird given that frames overlap).

Also, HTK DOESN't normalize signal! If it's normalized then multiply by 32768 before processing:

    signal.Amplify(32768);
    var mfccs = extractor.ComputeFrom(signal);

It's really a mess ((

There are even more MFCC options - read here

I want to get MFCC like in librosa

There are couple of important nuances in librosa:

  • htk = true or false

This parameter essentially defines the weights of mel-filterbank (HTK-style or Slaney-style).

  • centering

In NWaves, like in many other frameworks, frames are not centered the way they are centered by default in librosa (in fact, I don't quite understand this 'centering' thing in librosa), so this parameter must be set to False.

Let's consider an example. Say we have the following setup in librosa:

mfccs = librosa.feature.mfcc(y, sr, n_mfcc=13,
dct_type=2, norm='ortho', window='hamming',
htk=False, n_mels=40, fmin=100, fmax=8000,
n_fft=1024, hop_length=int(0.010*sr), center=False)

In NWaves it is equivalent to:

int sr = 22050;           // sampling rate
int fftSize = 1024;
double lowFreq = 100;     // if not specified, will be 0
double highFreq = 8000;   // if not specified, will be samplingRate / 2
int filterbankSize = 40;  // or 24 for htk=true (usually)

// if 'htk=False' in librosa:
var melBank = FilterBanks.MelBankSlaney(filterbankSize, fftSize, sr, lowFreq, highFreq);

// if 'htk' parameter in librosa will be set to True, replace the previous line with these lines:
// var melBands = FilterBanks.MelBands(filterbankSize, sr, lowFreq, highFreq);
// var melBankHtk = FilterBanks.Triangular(fftSize, sr, melBands, null, Scale.HerzToMel);

var opts = new MfccOptions
{
    SamplingRate = sr,
    FrameDuration = (double)fftSize / sr,
    HopDuration = 0.010,
    FeatureCount = 13,
    Filterbank = melBank,              // or melBankHtk if htk=True
    NonLinearity = NonLinearityType.ToDecibel, // mandatory
    Window = WindowTypes.Hamming,     // in librosa 'hann' is by default
    LogFloor = 1e-10f,                // mandatory
    DctType="2N",
    LifterSize = 0
};

var extractor = new MfccExtractor(opts);
var mfccs = extractor.ParallelComputeFrom(signal);

And there are even more options - read here

MfccExtractor produces strange results

If you're getting a row of NANs at random frames samples, check that MfccExtractor instance is not shared by several threads (it's not thread-safe). If you need parallelization and better performance, call extractor.ParallelComputeFrom() method. If there's a crucial need for your own parallelization scheme, create a separate extractor for each task.

Tips for efficient online data processing

FIR filters and block convolution

If filter order exceeds approx. 60-65, don't call filter.Process(sample). Switch to block convolvers instead:

var kernel = DesignFilter.FirWinLp(431, 0.12);
var filter = new FirFilter(kernel);

var blockConvolver = OlsBlockConvolver.FromFilter(filter, 2048);
// usually the second parameter is approx. 4N, where N is filter order

// or without creating filter object:
var blockConvolver = OlsBlockConvolver(kernel, 2048);
// processing loop:
// while new input sample is available
{
    var outputSample = blockConvolver.Process(sample);
}

// or:
// while new input buffer is available
{
    blockConvolver.Process(input, output);
    // blockConvolver.Process(input, input);  // or in-place
}

SOS filtering

If you do IIR filtering and you face numerical problems with IirFilter/ZiFilter then switch to second-order sections (SOS):

  1. obtain SOS via DesignFilter.TfToSos()
  2. construct FilterChain and use it as a filter
var tf = new Butterworth.BandPassFilter(0.1, 0.16, 7).Tf;

TransferFunction[] sos = DesignFilter.TfToSos(tf);

var sosFilter = new FilterChain(sos);

var y = sosFilter.ApplyTo(x);

// or process samples online:
//    ... outSample = sosFilter.Process(sample);

If the filter is still amplifying signal too much, try re-estimating gain:

// ========== setup filter ===========
 
var order = 10;
var filter = new Filters.Butterworth.BandPassFilter(4f / 250, 8f / 250, order);
var sos = DesignFilter.TfToSos(filter.Tf);
var filters = new FilterChain(sos);
 
// =========== filtering ==============
 
var gain = filters.EstimateGain();
var filteredSignal = filters.ApplyTo(signal, gain);
 
// or in a loop
 
//while (<inputSample is available>)
//{
//    var outputSample = filters.Process(inputSample, gain);
//    ...
//}

Spectral analysis

Reuse memory whenever possible

Clone this wiki locally