On 2/3/07, Charles R Harris <[EMAIL PROTECTED]> wrote:



On 2/3/07, Stephen Simmons <[EMAIL PROTECTED]> wrote:
>
>  Hi,
>
> Does anyone know why there is an order of magnitude difference
> in the speed of numpy's array.sum() function depending on the axis
> of the matrix summed?
>
> To see this, import numpy and create a big array with two rows:
>    >>> import numpy
>    >>> a = numpy.ones([2,1000000], 'f4')
>
> Then using ipython's timeit function:
>                                                   Time (ms)
>    sum(a)                                           20
>    a.sum()                                           9
>    a.sum(axis=1)                                     9
>    a.sum(axis=0)                                   159
>    numpy.dot(numpy.ones(a.shape[0], a.dtype), a)    15
>
> This last one using a dot product is functionally equivalent
> to a.sum(axis=0), suggesting that the slowdown is due to how
> indexing is implemented in array.sum().
>

In this case it is expected. There are inner and outer loops; in the slow
case the inner loop, with its extra setup code, is called 1000000 times, while
in the fast case it is called only twice. On the other hand, note this:

In [10]: timeit a[0,:] + a[1,:]
100 loops, best of 3: 19.7 ms per loop


which has only one loop. Caching could also be a factor, but in this case the
cost is dominated by loop overhead.
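To make the loop-count argument concrete, here is a small sketch (not the actual numpy C internals, just an illustration of the structure described above): the axis=1 reduction sets up its inner loop once per row, the axis=0 reduction once per column, and the explicit row add uses a single vectorized pass.

```python
import numpy as np

a = np.ones((2, 1000000), dtype='f4')

# Fast case: reduce along axis 1. The contiguous inner loop is long
# (1000000 elements) and is set up only twice, once per row.
row_sums = a.sum(axis=1)

# Slow case: reduce along axis 0. The inner loop is only 2 elements
# long, so its setup overhead is paid 1000000 times, once per column.
col_sums = a.sum(axis=0)

# Single-loop alternative from the timing above: one vectorized add
# over the whole row, equivalent to a.sum(axis=0) for a 2-row array.
col_sums_add = a[0, :] + a[1, :]

assert np.allclose(col_sums, col_sums_add)
assert np.allclose(row_sums, np.full(2, 1000000, dtype='f8'))
```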


PS: I think this indicates that the code would run faster in this case if it
accumulated along the last axis, one row at a time for each leading index. I
suspect that the current implementation accumulates down the first axis, then
repeats for each of the trailing indices. This suggests that rearranging the
way the accumulation is done could be a big gain, especially if the longest
axis is chosen for the inner loop.
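A minimal sketch of the proposed rearrangement (the function name and structure are illustrative, not numpy's actual implementation): instead of reducing down axis 0 column by column, accumulate whole rows so every inner pass runs along the long contiguous axis.

```python
import numpy as np

def sum_axis0_by_rows(a):
    # Proposed ordering: walk the leading index and add each full row
    # into the accumulator, so each pass is one long contiguous sweep
    # along the last axis rather than a length-2 hop per column.
    acc = np.zeros(a.shape[1], dtype=a.dtype)
    for i in range(a.shape[0]):
        acc += a[i, :]
    return acc

a = np.ones((2, 1000000), dtype='f4')
assert np.allclose(sum_axis0_by_rows(a), a.sum(axis=0))
```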

Chuck



_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion