On 29.1.2011 at 22:01, Nicolas SCHEFFER wrote:
> Hi all,
>
> First email to the list for me; I just want to say how grateful I am
> to have python+numpy+ipython etc... for my day-to-day needs. Great
> combination of software.
>
> Anyway, I've been having this bottleneck in one of my algorithms that
> has been bugging me for quite a while.
> The objective is to speed this part up. I've been doing tons of
> optimization and parallel processing around that piece of code to get
> a decent run time.
>
> The problem is easy. You want to accumulate, in a matrix, a weighted
> sum of other matrices. Let's call this function scale_and_add:
>
> def scale_and_add_re(R, w, Ms):
>     (nb_add, mdim, j) = np.shape(Ms)
>     for i in range(nb_add):
>         R += w[i] * Ms[i]
>     return R
>
> This 'for' loop bugs me since I know it will slow things down.
>
> But the dimensions of these things are so large that any attempt to
> vectorize this is slower and takes too much memory.
> I typically work with 1000 weights and matrices, matrices of dimension
> of several hundred.
>
> My current config is:
>
> In [2]: %timeit scale_and_add_re(R, w, Ms)
> 1 loops, best of 3: 392 ms per loop
>
> In [3]: w.shape
> Out[3]: (2000,)
>
> In [4]: Ms.shape
> Out[4]: (2000, 200, 200)
>
> I'd like to be able to double these dimensions.
>
> For instance, I could use broadcasting by using a dot product:
>
> %timeit dot(Ms.T, w)
> 1 loops, best of 3: 1.77 s per loop
>
> But this is i) slower and ii) takes too much memory.
> (BTW, I'd really need an in-place dot product in numpy to avoid the
> copy in memory, like the BLAS call in scipy.linalg. But that's for
> another thread...)
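[A minimal, self-contained sketch, not from the thread, that reproduces the setup with the shapes from the In [3]/In [4] output and checks the loop against two equivalent vectorized spellings. np.tensordot and np.einsum are additions the thread does not mention; note also that dot(Ms.T, w) as quoted contracts over the last axis of Ms.T, so it actually yields the transpose of the loop's result.]

import numpy as np

# Random filler data with the shapes reported above.
w = np.random.rand(2000)
Ms = np.random.rand(2000, 200, 200)

# The loop version from the message.
R_loop = np.zeros((200, 200))
for i in range(len(w)):
    R_loop += w[i] * Ms[i]

# Equivalent vectorized spellings: contract w with the first axis of Ms.
R_td = np.tensordot(w, Ms, axes=(0, 0))
R_es = np.einsum('i,ijk->jk', w, Ms)
np.testing.assert_allclose(R_loop, R_td)
np.testing.assert_allclose(R_loop, R_es)

# dot(Ms.T, w) sums over the last axis of Ms.T, which returns the
# transpose of R_loop (identical only when the matrices are symmetric).
np.testing.assert_allclose(R_loop, np.dot(Ms.T, w).T)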
If you use a different memory layout for your data, you can improve your
performance with dot:

MsT = Ms.T.copy()
%timeit np.dot(MsT, w)
10 loops, best of 3: 107 ms per loop

For comparison:

In [32]: %timeit scale_and_add_re(R, w, Ms)
1 loops, best of 3: 245 ms per loop

I don't think you can do much better than this. The above value gives a
memory bandwidth for accessing Ms of 6 GB/s (Ms is 2000 x 200 x 200
float64 values, about 640 MB, read in 107 ms); I think the hardware
limit for my system is about 10 GB/s.

Gregor
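[A self-contained sketch of the layout trick Gregor describes, assuming the one-time cost of MsT = Ms.T.copy() is amortized over many calls; the names R_fast and R_loop are illustrative, not from the reply.]

import numpy as np

w = np.random.rand(2000)
Ms = np.random.rand(2000, 200, 200)

# Ms.T is a (200, 200, 2000) strided view; .copy() materializes it
# C-contiguously, so the 2000-long axis that dot reduces over becomes
# the fast, contiguous axis in memory.
MsT = Ms.T.copy()

R_fast = np.dot(MsT, w)  # shape (200, 200)

# Sanity check against the slow loop; note that dot(Ms.T, w) computes
# sum_i w[i] * Ms[i].T, i.e. the transpose of the loop's result.
R_loop = np.zeros((200, 200))
for i in range(len(w)):
    R_loop += w[i] * Ms[i]
np.testing.assert_allclose(R_fast, R_loop.T)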