On Thu, Nov 14, 2013 at 9:37 AM, David Cournapeau <[email protected]>wrote:
> > You can for example compare np.fft.fft(a) for 2**16 and 2**16+1 (and > 2**16-1 that while bad is not prime, so only 1 order of magnitude slower). > I actually did... Each step of a FFT basically splits a DFT of size N = P*Q into P DFTs of size Q followed by Q DFTs of size P. If prime length DFTs take time proportional to their length squared, the time complexity can be written as N times the sum of the prime factors of N, which is N log N (actually 2 N log N) for sizes a power of two. So for 2^16 you actually have 2^16 * 2 * 16 = 2.1e6 For 2^16-1 = 3*5*17*257 you get (2^16-1)*(3+5+17+257) = 18.5e6 (8x slower) For 2^16+1, which is prime, you get (2^16+1)^2 = 4295.0e6 (2045x slower) On my system I get: %timeit np.fft.fft(a) # a = np.ones(2**16) 100 loops, best of 3: 3.18 ms per loop %timeit np.fft.fft(b) # b = np.ones(2**16 - 1) 100 loops, best of 3: 13.6 ms per loop (4x slower) %timeit np.fft.fft(c) # c = np.ones(2**16 + 1) 1 loops, best of 3: 25.1 s per loop (8000x slower) There clearly are some constant factors missing in this analysis, although it gives reasonable order of magnitude predictions. Doing some more timings it seems that: * prime sized inputs perform at least 2x worse than these formulas predict. * sizes with factors different than 2, perform about 2x better than these formulas predict. Jaime
_______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
