Hi, On Fri, Feb 11, 2011 at 12:08 AM, Robert Kern <robert.k...@gmail.com> wrote:
> On Thu, Feb 10, 2011 at 15:32, eat <e.antero.ta...@gmail.com> wrote: > > Hi Robert, > > > > On Thu, Feb 10, 2011 at 10:58 PM, Robert Kern <robert.k...@gmail.com> > wrote: > >> > >> On Thu, Feb 10, 2011 at 14:29, eat <e.antero.ta...@gmail.com> wrote: > >> > Hi Robert, > >> > > >> > On Thu, Feb 10, 2011 at 8:16 PM, Robert Kern <robert.k...@gmail.com> > >> > wrote: > >> >> > >> >> On Thu, Feb 10, 2011 at 11:53, eat <e.antero.ta...@gmail.com> wrote: > >> >> > Thanks Chuck, > >> >> > > >> >> > for replying. But don't you still feel very odd that dot > outperforms > >> >> > sum > >> >> > in > >> >> > your machine? Just to get it simply; why sum can't outperform dot? > >> >> > Whatever > >> >> > architecture (computer, cache) you have, it don't make any sense at > >> >> > all > >> >> > that > >> >> > when performing significantly less instructions, you'll reach to > >> >> > spend > >> >> > more > >> >> > time ;-). > >> >> > >> >> These days, the determining factor is less often instruction count > >> >> than memory latency, and the optimized BLAS implementations of dot() > >> >> heavily optimize the memory access patterns. > >> > > >> > Can't we have this as well with simple sum? > >> > >> It's technically feasible to accomplish, but as I mention later, it > >> entails quite a large cost. Those optimized BLASes represent many > >> man-years of effort > > > > Yes I acknowledge this. But didn't they then ignore them something > simpler, > > like sum (but which actually could benefit exactly similiar > optimizations). > > Let's set aside the fact that the people who optimized the > implementation of dot() (the authors of ATLAS or the MKL or whichever > optimized BLAS library you linked to) are different from those who > implemented sum() (the numpy devs). Let me repeat a reason why one > would put a lot of effort into optimizing dot() but not sum(): > > """ > >> However, they are frequently worth it > >> because those operations are often bottlenecks in whole applications. > >> sum(), even in its stupidest implementation, rarely is. > """ > > I don't know if I'm just not communicating very clearly, or if you > just reply to individual statements before reading the whole email. You are communicating very well. It's me who's not communicating so well. > > >> and cause substantial headaches for people > >> building and installing numpy. > > > > I appreciate this. No doubt at all. > >> > >> However, they are frequently worth it > >> because those operations are often bottlenecks in whole applications. > >> sum(), even in its stupidest implementation, rarely is. In the places > >> where it is a significant bottleneck, an ad hoc implementation in C or > >> Cython or even FORTRAN for just that application is pretty easy to > >> write. > > > > But here I have to disagree; I'll think that at least I (if not even the > > majority of numpy users) don't like (nor I'm be capable/ or have enough > > time/ resources) go to dwell such details. > > And you think we have the time and resources to do it for you? My intention was not to suggest anything like that. > > > I'm sorry but I'll have to > > restate that it's quite reasonable to expect that sum outperforms dot in > any > > case. > > You don't optimize a function just because you are capable of it. You > optimize a function because it is taking up a significant portion of > total runtime in your real application. Anything else is a waste of > time. > > > Lets now to start make such movements, which enables sum to outperform > > dot. > > Sorry, you don't get to volunteer anyone's time or set anyone's > priorities but your own. There are some sensible things one could do > to optimize sums or even general ufunc reductions, as outlined my > Mark, Pauli and myself, but please stop using the accelerated-BLAS > dot() as your benchmark. It's a completely inappropriate comparison. OK. Lets compare sum to itself: In []: M= randn(1e5, 1e2) In []: timeit M.sum(0) 10 loops, best of 3: 169 ms per loop In []: timeit M.sum(1) 10 loops, best of 3: 37.5 ms per loop In []: timeit M.sum() 10 loops, best of 3: 18.1 ms per loop In []: timeit sum(M.sum(0)) 10 loops, best of 3: 169 ms per loop In []: timeit sum(M.sum(1)) 10 loops, best of 3: 37.7 ms per loop In []: timeit M.T.sum(0) 10 loops, best of 3: 37.7 ms per loop In []: timeit M.T.sum(1) 10 loops, best of 3: 171 ms per loop In []: timeit M.T.sum() 1 loops, best of 3: 288 ms per loop In []: timeit sum(M.T.sum(0)) 10 loops, best of 3: 37.7 ms per loop In []: timeit sum(M.T.sum(1)) 10 loops, best of 3: 170 ms per loop In []: 288./ 18.1 Out[]: 15.91160220994475 > > >> You can gain speed by specializing to just your use case, e.g. > >> contiguous data, summing down to one number, or summing along one axis > >> of only 2D data, etc. There's usually no reason to try to generalize > >> that implementation to put it back into numpy. > > > > Yes, I would really like to specialize into my case, but 'without going > out > > the python realm.' > > The Bottleneck project is a good place for such things. It's a nice > middle-ground for somewhat-specialized routines that are still pretty > common but not general enough to be in numpy yet. > > http://pypi.python.org/pypi/Bottleneck > > If you're not willing to learn how to implement it yourself in Cython, > I'm afraid that you're stuck waiting for someone to do it for you. But > please don't expect anyone to feel obligated to do so. Perhaps my bad wordings, but I do not expect anyone to do it for me. I also think that there exists some real substance I addressed (as demonstrated now with comparing sum to itself). Regards, eat > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion