Operations

Tim Sharii edited this page Apr 19, 2019 · 13 revisions

High-level operations are based, one way or another, on transforms and filters.

Here's the list of available operations:

  • Convolution
  • Cross-correlation
  • Block convolution
  • Resampling
  • Time-stretching
  • Rectification
  • Envelope detection
  • Spectral subtraction
  • Deconvolution

Each operation can be carried out using the corresponding static method of the Operation class:

// convolution

var conv = Operation.Convolve(signal, kernel);
var xcorr = Operation.CrossCorrelate(signal1, signal2);

// block convolution

var filtered = Operation.BlockConvolve(signal, kernel, 4096, FilteringMethod.OverlapAdd);

// resampling

var resampled = Operation.Resample(signal, 22050);
var interpolated = Operation.Interpolate(signal, 3);
var decimated = Operation.Decimate(signal, 2);
var updown = Operation.ResampleUpDown(signal, 3, 2);

// time scale modification (TSM)

var stretch = Operation.TimeStretch(signal, 0.7, TsmAlgorithm.PhaseVocoderPhaseLocking);
var cool = Operation.TimeStretch(signal, 16, TsmAlgorithm.PaulStretch);

// envelope following

var envelope = Operation.Envelope(signal);

// rectification

var halfRect = Operation.HalfRectify(signal);
var fullRect = Operation.FullRectify(signal);

// spectral subtraction

var clean = Operation.SpectralSubtract(signal, noise);

All of the functions above are non-mutating: each one creates a new output signal for its input. They are fine for one-off calls. For repeated operations it is usually better to call the methods of the dedicated stateful class responsible for each particular operation. Besides, the methods accept some additional parameters that can be tweaked (they were omitted in the code above).

Convolver

Convolver class provides overloaded methods for carrying out fast convolution and cross-correlation via FFT. It reuses its own internal buffers every time its methods are called. Hence, this class should be used when a signal is processed sequentially, block after block.

var convolver = new Convolver(512); // FFT size

var kernel = new float[101];
// fill kernel

float[] output = new float[512];

convolver.Convolve(input1, kernel, output);
// do something with output

convolver.Convolve(input2, kernel, output);
// do something with output

convolver.Convolve(input3, kernel, output);
// do something with output


// cross-correlation has side-effect! it reverses second array

convolver.CrossCorrelate(input1, corr1, output);
// do something with output
// corr1 is now reversed

convolver.CrossCorrelate(input1, corr2, output);
// do something with output
// corr2 is now reversed

Remember that the theoretical length of the convolution/cross-correlation result is input.Length + kernel.Length - 1. So the length of the output array must be at least the smallest power of 2 that is not less than this value.
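The length rule above can be checked with a few lines of plain C#. This is only a minimal sketch that does not use NWaves: the class and method names (ConvolutionSizes, NextPowerOfTwo, DirectConvolve) are made up for illustration, and the convolution is the slow time-domain form rather than the FFT-based one used by Convolver.

```csharp
using System;

// Illustrative helpers, not part of NWaves.
static class ConvolutionSizes
{
    // Smallest power of 2 that is greater than or equal to n (for n >= 1).
    public static int NextPowerOfTwo(int n)
    {
        var p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Direct (time-domain) linear convolution.
    // The result has input.Length + kernel.Length - 1 samples.
    public static float[] DirectConvolve(float[] input, float[] kernel)
    {
        var output = new float[input.Length + kernel.Length - 1];

        for (var i = 0; i < input.Length; i++)
        {
            for (var j = 0; j < kernel.Length; j++)
            {
                output[i + j] += input[i] * kernel[j];
            }
        }

        return output;
    }
}
```

For example, a block of 400 samples and a kernel of 101 samples give 400 + 101 - 1 = 500 result samples, so NextPowerOfTwo(500) returns 512, which matches the FFT size and output length used in the snippet above.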

ComplexConvolver is a similar class that convolves signals of type ComplexDiscreteSignal.

BlockConvolver

Two well-known techniques of block convolution are implemented:

  • Overlap-Add (OlaBlockConvolver class)
  • Overlap-Save (OlsBlockConvolver class)

Both classes implement IFilter and IOnlineFilter interfaces. Hence, they can be used as filters in offline and online processing of signals.

var kernel = firFilter.Kernel;
var processor = new OlaBlockConvolver(kernel, 4096);

// equivalent line (use it instead of the constructor call above):
// var processor = OlaBlockConvolver.FromFilter(firFilter, 4096);

var filtered = processor.ApplyTo(signal);  // like any filter

Online processing:

// Overlap-Add / Overlap-Save

FirFilter filter = new FirFilter(kernel);

var blockConvolver = OlsBlockConvolver.FromFilter(filter, 16384);

// processing loop:
// while new input sample is available
{
     var outputSample = blockConvolver.Process(sample);
}

// or:
// while new input buffer is available
{
    blockConvolver.Process(input, output);
}

Note that the output will always be "late" by FftSize - KernelSize + 1 samples. The property (Ola|Ols)BlockConvolver.HopSize returns this value. So you might want to process the first HopSize - 1 samples without storing the result anywhere (the samples will just go into the delay line). For example, this is how the offline method ApplyTo() is implemented for block convolvers:

var firstCount = Math.Min(HopSize - 1, signal.Length);

int i = 0, j = 0;

for (; i < firstCount; i++)    // first HopSize - 1 samples are just placed in the delay line
{
    Process(signal[i]);
}

var filtered = new float[signal.Length + _kernel.Length - 1];

for (; i < signal.Length; i++, j++)    // process
{
    filtered[j] = Process(signal[i]);
}

var lastCount = firstCount + _kernel.Length - 1;

for (i = 0; i < lastCount; i++, j++)    // get last 'late' samples
{
    filtered[j] = Process(0.0f);
}

See also OnlineDemoForm code.
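The three-stage pattern of ApplyTo() (discard the first outputs, collect the rest, then flush with zeros) can be tried out without NWaves. The sketch below is an assumption-laden stand-in: DelayedFir is a hypothetical direct-form FIR filter whose output is held back by a fixed number of samples, and latency plays the role of the number of discarded samples (HopSize - 1 in the listing above). It assumes the signal is longer than the latency.

```csharp
using System;

// Hypothetical stand-in for a block convolver (not NWaves code):
// a direct-form FIR filter whose output is delayed by `latency` samples.
class DelayedFir
{
    private readonly float[] _kernel;
    private readonly float[] _inputs;    // circular buffer of recent inputs
    private readonly float[] _pending;   // circular buffer of delayed outputs
    private int _inPos, _outPos;

    public DelayedFir(float[] kernel, int latency)
    {
        _kernel = kernel;
        _inputs = new float[kernel.Length];
        _pending = new float[latency];
    }

    public float Process(float sample)
    {
        _inputs[_inPos] = sample;

        var y = 0f;
        for (var k = 0; k < _kernel.Length; k++)
        {
            y += _kernel[k] * _inputs[(_inPos - k + _inputs.Length) % _inputs.Length];
        }
        _inPos = (_inPos + 1) % _inputs.Length;

        if (_pending.Length == 0) return y;   // no latency at all

        var delayed = _pending[_outPos];      // value computed `latency` calls ago
        _pending[_outPos] = y;
        _outPos = (_outPos + 1) % _pending.Length;
        return delayed;
    }
}

static class DelayDemo
{
    // Same three stages as the ApplyTo() listing: skip, process, flush.
    public static float[] ApplyOffline(DelayedFir filter, float[] signal, int kernelLength, int latency)
    {
        var firstCount = Math.Min(latency, signal.Length);
        int i = 0, j = 0;

        for (; i < firstCount; i++)           // outputs are not valid yet
        {
            filter.Process(signal[i]);
        }

        var filtered = new float[signal.Length + kernelLength - 1];

        for (; i < signal.Length; i++, j++)   // valid (delayed) outputs
        {
            filtered[j] = filter.Process(signal[i]);
        }

        var lastCount = firstCount + kernelLength - 1;

        for (i = 0; i < lastCount; i++, j++)  // flush the remaining 'late' samples
        {
            filtered[j] = filter.Process(0.0f);
        }

        return filtered;
    }
}
```

With kernel { 1, 0.5 }, signal { 1, 2, 3, 4, 5 } and a latency of 3, ApplyOffline returns the 6 samples of the full linear convolution, with no leading zeros from the delay.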

Resampler

Resampler class provides methods for simple decimation, interpolation, up-down resampling (for small factors) and band-limited resampling:

// signal is sampled at 16kHz

var resampler = new Resampler();

var signal_22_5 = resampler.Resample(signal, 22050);    // band-limited resampling

var signal_8 = resampler.Decimate(signal, 2);           // downsample to 8 kHz
var signal_48 = resampler.Interpolate(signal, 3);       // upsample to 48 kHz
var signal_24 = resampler.ResampleUpDown(signal, 3, 2); // resample to 24 kHz

For simple decimation/interpolation/resampling the three latter methods will work faster. Band-limited resampling is universal and will work for any sampling rates.
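To judge whether the up-down factors are "small" for a given pair of sampling rates, reduce the ratio of the rates by their greatest common divisor. The helper below is a plain C# sketch; ResamplingFactors and FromRates are hypothetical names, not NWaves API.

```csharp
using System;

// Illustrative helper, not part of NWaves.
static class ResamplingFactors
{
    private static int Gcd(int a, int b)
    {
        while (b != 0)
        {
            (a, b) = (b, a % b);
        }
        return a;
    }

    // Smallest integer factors such that newRate = oldRate * Up / Down.
    public static (int Up, int Down) FromRates(int oldRate, int newRate)
    {
        var g = Gcd(oldRate, newRate);
        return (newRate / g, oldRate / g);
    }
}
```

For 16000 Hz to 24000 Hz this gives (3, 2), the factors passed to ResampleUpDown in the example above. For 16000 Hz to 22050 Hz it gives (441, 320), which is a case where band-limited resampling is the more practical choice.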

All methods use anti-aliasing low-pass filtering under the hood. By default, the lowpass filter is designed inside the routines (of order 101), but you can specify your own anti-aliasing filter as the 3rd parameter:

var lpFilter = DesignFilter.FirLp(301, 0.5f / 2);
var resampled = resampler.Decimate(signal, 2, lpFilter);

var fasterFilter = DesignFilter.FirLp(51, 0.5f / 3);
resampled = resampler.Interpolate(signal, 3, fasterFilter);

EnvelopeFollower

EnvelopeFollower class implements IOnlineFilter interface. It's used, for instance, in AutoWah audio effect.

The constructor has three parameters: 1) sampling rate; 2) attack time; 3) release time.

var envelope = new EnvelopeFollower(samplingRate, 0.01f, 0.05f);

// while new input sample is available
{
    var envelopeSample = envelope.Process(sample);
    //...
}

envelope.Reset();

In principle, envelope detection could also be achieved with simple low-pass filtering, but EnvelopeFollower usually gives better results.
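For readers unfamiliar with the idea, here is a minimal sketch of a generic attack/release envelope follower in plain C#. It is not the NWaves source and may differ from it in details: the class name SimpleEnvelopeFollower is made up, the times are assumed to be in seconds, and the exponential coefficient formula is the common textbook choice.

```csharp
using System;

// Generic attack/release envelope follower (illustrative, not NWaves code).
class SimpleEnvelopeFollower
{
    private readonly float _attackCoeff;
    private readonly float _releaseCoeff;
    private float _env;

    // attackTime and releaseTime are assumed to be given in seconds.
    public SimpleEnvelopeFollower(int samplingRate, float attackTime, float releaseTime)
    {
        _attackCoeff = (float)Math.Exp(-1.0 / (attackTime * samplingRate));
        _releaseCoeff = (float)Math.Exp(-1.0 / (releaseTime * samplingRate));
    }

    public float Process(float sample)
    {
        var level = Math.Abs(sample);

        // Rising level: follow quickly (attack). Falling level: decay slowly (release).
        var coeff = level > _env ? _attackCoeff : _releaseCoeff;

        _env = coeff * _env + (1 - coeff) * level;
        return _env;
    }

    public void Reset() => _env = 0;
}
```

The two separate time constants are what distinguish this from a single low-pass filter: the envelope can rise quickly on onsets and still decay smoothly afterwards.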

TSM (time scale modification)

Four well-known TSM algorithms are implemented. Each one is reflected in TsmAlgorithm enum:

  • Phase vocoder
  • Phase vocoder with identity phase locking
  • WSOLA (waveform similarity overlap-add)
  • Paul stretch algorithm

In general, the phase vocoder with phase locking (PVIPL) produces the best results, so it's used by default in time-stretching operations. Wsola is usually good for speech signals. PaulStretch is different: it produces interesting sounds for large stretch factors (10 and more).

Each algorithm is coded in separate class implementing IFilter interface.

var wsola = new Wsola(0.75, windowSize, hopSize, maxDelta);
wsola = new Wsola(0.75); // parameters will be estimated automatically

var pvipl = new PhaseVocoderPhaseLocking(0.75, hopSize, fftSize);
var pv = new PhaseVocoder(1.25, hopSize, fftSize);
var paulStretch = new PaulStretch(16, hopSize, fftSize);

var output1 = wsola.ApplyTo(signal);
var output2 = pv.ApplyTo(signal);
var output3 = pvipl.ApplyTo(signal);
var output4 = paulStretch.ApplyTo(signal);

Parameters fftSize and hopSize can be tweaked. The general recommendation is to set a relatively small hop length (corresponding to about 8-15 ms), while the FFT size should be at least 6-7 times longer (and a power of 2). For example, for signals sampled at 16 kHz the parameters fftSize=1024, hopSize=128 are OK (the computations will take longer, though). A bigger hop length will lead to faster processing and poorer results.
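The recommendation can be turned into a small calculation. The helper below is only a sketch in plain C#: TsmParameters, HopSizeFromMs and FftSizeFor are hypothetical names, and the default ratio of 6 is taken from the "6-7 times" rule of thumb above.

```csharp
using System;

// Illustrative helper, not part of NWaves.
static class TsmParameters
{
    // Hop length in samples for a hop duration given in milliseconds.
    public static int HopSizeFromMs(int samplingRate, double hopMs)
    {
        return (int)Math.Round(samplingRate * hopMs / 1000.0);
    }

    // Smallest power of 2 that is at least `ratio` times the hop length.
    public static int FftSizeFor(int hopSize, int ratio = 6)
    {
        var fftSize = 1;
        while (fftSize < hopSize * ratio) fftSize <<= 1;
        return fftSize;
    }
}
```

At 16 kHz a hop of 8 ms is 128 samples, and FftSizeFor(128) gives 1024, the pair of values mentioned above.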

SpectralSubtractor

SpectralSubtractor class performs spectral subtraction according to

[1979] M.Berouti, R.Schwartz, J.Makhoul "Enhancement of Speech Corrupted by Acoustic Noise".

The class implements IFilter and IOnlineFilter interfaces.

// some noise signal is already measured or prepared

var subtractor = new SpectralSubtractor(noise, fftSize: 1024, hopSize: 300);
var clean = subtractor.ApplyTo(noisySignal);

// online:
// while input sample is available
{
    var outputSample = subtractor.Process(inputSample);
    //...
}

// noise can be re-estimated:

subtractor.EstimateNoise(newNoise);
clean = subtractor.ApplyTo(noisySignal);

Modulator

Modulator class provides methods for the following kinds of modulation:

  • Amplitude modulation / demodulation
  • Frequency modulation / demodulation
  • Linear frequency modulation
  • Sinusoidal frequency modulation
  • Phase modulation

var modulator = new Modulator();

var ring = modulator.Ring(carrier, modulatorSignal);

var modAmp = modulator.Amplitude(carrier, 20/*Hz*/, 0.5f);
var demodAmp = modulator.DemodulateAmplitude(modAmp);

var modFreq = modulator.Frequency(baseband, carrierAmplitude: 1, carrierFrequency: 16000/*Hz*/, deviation: 0.1f);
var demodFreq = modulator.DemodulateFrequency(modFreq);

var modPhase = modulator.Phase(baseband, 1, 16000/*Hz*/, deviation: 0.1f);