Multiple FFTs at once #9

antoine-levitt · 2019-07-27T20:53:26Z

FFTW supports multiple FFTs, and it says in the manual (http://www.fftw.org/fftw3.pdf p. 32)

Plans obtained in this way can often be faster than calling FFTW multiple times for the individual transforms

We should benchmark to see if there's really an improvement on timing (this doesn't seem to be a BLAS3 situation where there should be a large impact, but performance is weird, so I don't know...)

This would also nicely let us use FFTW multithreading optimally automatically.

One tricky complication is that with locking, the number of vectors we apply the Hamiltonian on changes from iteration to iteration. Maybe have one plan for the full nband (which is always needed at the first iteration anyway), and then fall back to the 1-by-1 version if less are required. We should look at how other codes do it.

mfherbst · 2019-07-28T07:14:34Z

Good idea. I would expect that for a large number of bands, using a "half-band" version is also reasonable, since it probably will take quite some iterations to go from 1 locked to half locked and in between you need to use the 1-band version for almost the complete band. But I agree, we should look at the other codes.

antoine-levitt · 2019-07-28T08:54:07Z

So I did a bit of digging.

If I understand correctly QE does not do multiple FFTs at once (PW/src/vloc_psi.F90 calls fwfft, defined in FFTXlib/fft_interfaces.F90, without the optional argument howmany).

Abinit... has a m_fft.F90 file that is 4940 lines long (so about twice the size of DFTK right now), which as far as I understand is . That's just the high-level interface though; if you add the low-level wrapper around FFTW3, it adds 15k lines. But of course there's also the FFT routines from Goedecker, that are still used by default when running under MPI, which add another 10k lines. So the only way to know what is actually going on is to run it through a debugger.

antoine-levitt · 2019-07-28T09:10:44Z

Just while I'm at it, here are some fun LOC stats. Counting just the "core" part (as best as I could find out what's core and what's not) of what DFTK does right now, excluding parallelisation etc. as far as I could:

Abinit: sloccount 28_numeric_noabirule/ 29_kpoints/ 32_util/ 41_xc_lowlevel/ 42_parser/ 44_abit* 46_diago/ 53_ffts/ 55_abiutil/ 56* 57* 56* 61* 62* 66_nonlocal/ 66_wfs/ 67_common/ 69_wfdesc/ 94_scfcv/ 98_main/ 83_cut3d/ => 190k

QE: sloccount LAXlib/ FFTXlib/ UtilXlib/ KS_Solvers/ PW/src/ => 85k (excluding the copy of an '97 version of FFTW they have in there)

antoine-levitt · 2019-07-28T09:28:45Z

GPAW is comparatively pretty good: the core part is about 30k lines of python, supports three different basis sets, with about 10k of C code for performance-critical parts.

antoine-levitt · 2019-07-28T18:48:20Z

So, reading the FFTW manual more closely, there is a "wisdom" mechanism for reusing measurements to create new plans; they say it's pretty smart, and so this might negate the problem of the changing nband (ie do the FFTW_MEASURE with the full nband, and reuse the wisdom to do a smaller number of bands)

mfherbst · 2019-07-29T06:51:18Z

Ah that can be done quite easily in fact.

mfherbst · 2019-08-04T14:36:40Z

This issue is related to the TODO given in https://github.com/mfherbst/DFTK.jl/blob/master/src/core/PlaneWaveBasis.jl#L76

antoine-levitt · 2019-12-01T17:15:20Z

Closing in favour of #15

mfherbst added the feature New feature or request label Jul 29, 2019

mfherbst added this to the The first release milestone Jul 29, 2019

mfherbst added performance Performance regression or performance-related and removed feature New feature or request labels Jul 29, 2019

mfherbst added the feature New feature or request label Oct 28, 2019

mfherbst mentioned this issue Oct 28, 2019

Use real FFTs #36

Closed

antoine-levitt mentioned this issue Nov 30, 2019

Threading #15

Closed

antoine-levitt closed this as completed Dec 1, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multiple FFTs at once #9

Multiple FFTs at once #9

antoine-levitt commented Jul 27, 2019 •

edited

Loading

mfherbst commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

mfherbst commented Jul 29, 2019

mfherbst commented Aug 4, 2019

antoine-levitt commented Dec 1, 2019

Multiple FFTs at once #9

Multiple FFTs at once #9

Comments

antoine-levitt commented Jul 27, 2019 • edited Loading

mfherbst commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

antoine-levitt commented Jul 28, 2019

mfherbst commented Jul 29, 2019

mfherbst commented Aug 4, 2019

antoine-levitt commented Dec 1, 2019

antoine-levitt commented Jul 27, 2019 •

edited

Loading