Workaround for performance regression introduced by FFTW
Convolutions in DSP currently rely on FFTW.jl, and a recent change in FFTW.jl (JuliaMath/FFTW.jl#105) has introduced a large performance regression in `conv` whenever Julia is started with more than one thread. Since v1, FFTW.jl uses multi-threaded FFTW transforms by default whenever Julia has more than one thread. This new default causes small FFT problems to run much more slowly and use much more memory. Because the overlap-save method of `conv` in DSP breaks a convolution into many small convolutions, and therefore performs a large number of small FFTW transforms, this change can make convolutions slower by two orders of magnitude and use two orders of magnitude more memory.

While FFTW.jl does not provide an explicit way to set the number of threads used by an FFTW plan without changing a global variable, generating the plans with the planning flag set to `FFTW.PATIENT` (instead of the default `MEASURE`) allows the planner to consider changing the number of threads. Adding this flag to the plans generated by the overlap-save convolution method seems to resolve the performance regression on multi-threaded instances of Julia.

Fixes JuliaDSP#399
Also see JuliaMath/FFTW.jl#121
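
As an illustration of the workaround described above, here is a minimal sketch of creating real-FFT plans with the `FFTW.PATIENT` flag. It is not the code changed by this commit: the block size `nfft` and the buffer names `tdbuff` and `fdbuff` are hypothetical stand-ins for the per-block buffers an overlap-save routine might allocate.

```julia
using FFTW

# Hypothetical size of one overlap-save block (an assumption for this sketch).
nfft = 256

# Time-domain and frequency-domain work buffers for a real-input FFT.
tdbuff = zeros(Float64, nfft)
fdbuff = Vector{ComplexF64}(undef, nfft ÷ 2 + 1)

# Plan the forward and (unnormalized) backward transforms with PATIENT,
# which lets the planner also weigh how many threads to use.
# Planning with this flag may overwrite the buffers, so plan before filling them.
p  = plan_rfft(tdbuff; flags=FFTW.PATIENT)
ip = plan_brfft(fdbuff, nfft; flags=FFTW.PATIENT)

# Round trip: forward transform, backward transform, then normalize by nfft.
x = rand(nfft)
X = p * x
y = (ip * X) ./ nfft
```

Planning with `FFTW.PATIENT` takes longer than with cheaper flags, so it pays off mainly when, as in overlap-save, one plan is reused for many small transforms.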