On Thu, Nov 14, 2013 at 9:37 AM, David Cournapeau <[email protected]>wrote:

>
> You can for example compare np.fft.fft(a) for 2**16 and 2**16+1 (and
> 2**16-1 that while bad is not prime, so only 1 order of magnitude slower).
>

I actually did...

Each step of a FFT basically splits a DFT of size N = P*Q into P DFTs of
size Q followed by Q DFTs of size P. If prime length DFTs take time
proportional to their length squared, the time complexity can be written as
N times the sum of the prime factors of N, which is N log N (actually 2 N
log N) for sizes a power of two.

So for 2^16 you actually have 2^16 * 2 * 16 = 2.1e6
For 2^16-1 = 3*5*17*257 you get (2^16-1)*(3+5+17+257) = 18.5e6 (8x slower)
For 2^16+1, which is prime, you get (2^16+1)^2 = 4295.0e6 (2045x slower)

On my system I get:
%timeit np.fft.fft(a) # a = np.ones(2**16)
100 loops, best of 3: 3.18 ms per loop

%timeit np.fft.fft(b) # b = np.ones(2**16 - 1)
100 loops, best of 3: 13.6 ms per loop (4x slower)

%timeit np.fft.fft(c) # c = np.ones(2**16 + 1)
1 loops, best of 3: 25.1 s per loop (8000x slower)

There clearly are some constant factors missing in this analysis, although
it gives reasonable order of magnitude predictions. Doing some more timings
it seems that:
 * prime sized inputs perform at least 2x worse than these formulas predict.
 * sizes with factors different than 2, perform about 2x better than these
formulas predict.

Jaime
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to