On 2/3/07, Charles R Harris <[EMAIL PROTECTED]> wrote:
On 2/3/07, Stephen Simmons <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Does anyone know why there is an order of magnitude difference
> in the speed of numpy's array.sum() function depending on the axis
> of the matrix summed?
>
> To see this, import numpy and create a big array with two rows:
>    >>> import numpy
>    >>> a = numpy.ones([2,1000000], 'f4')
>
> Then using ipython's timeit function:
>                                                     Time (ms)
>    sum(a)                                              20
>    a.sum()                                              9
>    a.sum(axis=1)                                        9
>    a.sum(axis=0)                                      159
>    numpy.dot(numpy.ones(a.shape[0], a.dtype), a)       15
>
> This last one using a dot product is functionally equivalent
> to a.sum(axis=0), suggesting that the slowdown is due to how
> indexing is implemented in array.sum().
>

In this case it is expected. There are inner and outer loops: in the slow case the inner loop, with its extra code, is called 1000000 times; in the fast case, twice. On the other hand, note this:

In [10]: timeit a[0,:] + a[1,:]
100 loops, best of 3: 19.7 ms per loop

This has only one loop. Caching could also be a problem, but in this case the cost is dominated by loop overhead.
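To make the loop counts above concrete, here is a minimal sketch in plain Python. It only models the outer/inner loop structure described in this message; the function name is hypothetical and this is not numpy's actual reduction code.

```python
def count_inner_loop_calls(shape, axis):
    """Model a reduction over `axis` as an outer loop that calls an
    inner loop once per output element (a simplified, hypothetical model).

    Returns (number of inner-loop calls, length of each inner loop).
    """
    calls = 1
    for i, n in enumerate(shape):
        if i != axis:
            calls *= n          # one inner-loop call per output element
    return calls, shape[axis]   # each call walks the reduced axis


# The array from the original post has shape (2, 1000000).
slow = count_inner_loop_calls((2, 1000000), axis=0)
fast = count_inner_loop_calls((2, 1000000), axis=1)
print("axis=0:", slow)  # many short inner loops
print("axis=1:", fast)  # few long inner loops
```

Under this model the total number of additions is the same in both cases; only the number of times the per-call overhead is paid differs.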
PS, I think this indicates that the code would run faster in this case if it accumulated along the last axis, one at a time for each leading index. I suspect that the current implementation accumulates down the first axis, then repeats for each of the last indices. This shows that rearranging the way the accumulation is done could be a big gain, especially if the largest axis is chosen.

Chuck
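A rough sketch of the rearrangement suggested in the PS, written at the Python level: accumulate whole rows along the last axis, once per leading index. The helper name sum_axis0_rowwise is made up for illustration, and this is not a claim about how numpy implements or should implement sum internally.

```python
import numpy as np


def sum_axis0_rowwise(a):
    """Sum a 2-D array over axis 0 by adding whole rows into an accumulator.

    Hypothetical helper: the outer loop runs over the (short) leading
    axis, and each step is one vectorized add along the (long) last axis.
    """
    # Accumulator uses the input dtype; for float input this matches
    # the dtype that a.sum(axis=0) returns.
    out = np.zeros(a.shape[1], dtype=a.dtype)
    for row in a:       # one iteration per leading index
        out += row      # single pass along the last axis
    return out


a = np.ones((2, 1000000), dtype='f4')
result = sum_axis0_rowwise(a)
print(result[:5], result.shape)
```

For the two-row array in this thread this amounts to the same work as a[0,:] + a[1,:], so one would expect similar timing, though that has not been measured here.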
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion