Re: [Numpy-discussion] Scipy dot

2012-11-12 Thread Nicolas SCHEFFER
12, 2012 at 12:59 PM, Nathaniel Smith wrote: > On Mon, Nov 12, 2012 at 9:08 PM, Nicolas SCHEFFER > wrote: >> I've pushed my code to a branch here >> https://github.com/leschef/numpy/tree/faster_dot >> with the commit >> http

Re: [Numpy-discussion] Scipy dot

2012-11-12 Thread Nicolas SCHEFFER
>> >> http://dl.acm.org/citation.cfm?id=1356053 >> >> (Googling for "Anatomy of High-Performance Matrix Multiplication" will >> give you a preprint outside of paywall, but Google appears to not want >> to give me the URL of a too long search result so

Re: [Numpy-discussion] Scipy dot

2012-11-09 Thread Nicolas SCHEFFER
is not a surprise for me. The latter is far more >> cache-friendly than the former. Everything follows cache lines, so it is >> faster than something that will use one element from each cache line. In >> fact it is exactly what "proves" that the new version is correct.
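The cache-line argument above can be illustrated with a minimal sketch (mine, not the patch under discussion): in a C-ordered array, reducing along the last axis walks contiguous memory, while reducing along the first axis strides across rows.

```python
import numpy as np

# Illustrative only: same numbers, different memory access patterns.
a = np.ones((2000, 2000), order='C')

row_major_sum = a.sum(axis=1)   # inner loop over contiguous memory (cache-friendly)
col_major_sum = a.sum(axis=0)   # inner loop strides across rows (one element per line)

# Both reductions are numerically identical; only the traversal differs.
print(np.allclose(row_major_sum, col_major_sum))
```

This is why transposing (or copying to a contiguous layout) before handing arrays to BLAS can pay off even though it looks like extra work.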

Re: [Numpy-discussion] Scipy dot

2012-11-09 Thread Nicolas SCHEFFER
ht'] - right)**2).sum()) Out[28]: 0.015331409 # # CCl # While the MSE are small, I'm wondering whether: - It's a bug: it should be exactly the same - It's a feature: BLAS is taking shortcuts when you have A.A'. The difference is not significant. Quick: PR that asap! I don'
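The A.A' question above can be probed directly; this is a sketch of mine (names and shapes assumed, not from the thread). A Gram matrix is mathematically symmetric; a symmetric-rank-k BLAS routine (dsyrk) returns it exactly symmetric, whereas a general dgemm path may differ in the last bits because floating-point addition is not associative.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((200, 300))

g = a @ a.T                      # Gram matrix, mathematically symmetric
mse = ((g - g.T) ** 2).sum()     # exactly 0.0 on a syrk path

# A tiny nonzero mse points at differing summation orders between the
# two triangles, not a correctness bug.
print(mse)
```

Either way the discrepancy should be far below any meaningful tolerance, which supports the "feature, not bug" reading in the message above.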

Re: [Numpy-discussion] Scipy dot

2012-11-09 Thread Nicolas SCHEFFER
I too encourage users to use scipy.linalg for speed and robustness (hence calling this scipy.dot), but it just brings so much confusion! When using the scipy + numpy ecosystem, you'd almost want everything to be done with scipy so that you get the best implementation in all cases: scipy.zeros(), scipy

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
> blas version accepts the same stuff, so if this isn't in the current version, > there will probably be some adjustment later on that side. What blas do you > use? I think ATLAS was one that was causing problems. > > > When we did this in Theano, it was more complicated than this di

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
wrote: > On Fri, 2012-11-09 at 00:24 +0100, Sebastian Berg wrote: >> Hey, >> >> On Thu, 2012-11-08 at 14:44 -0800, Nicolas SCHEFFER wrote: >> > Well, hinted by what Fabien said, I looked at the C level dot function. >> > Quite verbose! >> > >> >

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
ht be too easy to be true. On Thu, Nov 8, 2012 at 12:06 PM, Nicolas SCHEFFER wrote: > I've made the necessary changes to get the proper order for the output array. > Also, a pass of pep8 and some tests (fixmes are in failing tests) > http://pastebin.com/M8TfbURi > > -n >

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
I've made the necessary changes to get the proper order for the output array. Also, a pass of pep8 and some tests (fixmes are in failing tests) http://pastebin.com/M8TfbURi -n On Thu, Nov 8, 2012 at 11:38 AM, Nicolas SCHEFFER wrote: > Thanks for all the responses folks. This is indee

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
Thanks for all the responses folks. This is indeed a nice problem to solve. Few points: I. Change the order from 'F' to 'C': I'll look into it. II. Integration with scipy / numpy: opinions are diverging here. Let's wait a bit to get more responses on what people think. One thing though: I'd need t
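Point I above ("change the order from 'F' to 'C'") is easy to check from the outside; this is a quick sketch of mine, independent of the patch being discussed, showing which memory order np.dot hands back for plain C-ordered inputs.

```python
import numpy as np

# Fresh C-ordered inputs; np.dot allocates the result itself.
a = np.ones((3, 4))
b = np.ones((4, 5))

out = np.dot(a, b)
print(out.flags['C_CONTIGUOUS'])   # stock numpy returns a C-ordered result here
```

An F-ordered output would force a copy (or strided access) in downstream C-ordered code, which is why the output order matters for a drop-in replacement.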

Re: [Numpy-discussion] How to sum weighted matrices

2011-03-08 Thread Nicolas SCHEFFER
Or just with a dot:
===
In [17]: np.tensordot(weights, matrices, (0,0))
Out[17]:
array([[ 5.,  5.,  5.],
       [ 5.,  5.,  5.]])
In [18]: np.dot(matrices.T, weights).T
Out[18]:
array([[ 5.,  5.,  5.],
       [ 5.,  5.,  5.]])
==
make matrices.T C_CONTIGUOUS for maximum speed. -n On Mon, Mar
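A self-contained version of the session above (the shapes are my assumption, chosen to reproduce the printed output: 5 weight coefficients and 5 matrices of shape 2x3, all ones):

```python
import numpy as np

weights = np.ones(5)            # one coefficient per matrix
matrices = np.ones((5, 2, 3))   # stack of five 2x3 matrices

# Weighted sum of the stack, two equivalent spellings:
via_tensordot = np.tensordot(weights, matrices, (0, 0))  # contract axis 0 with axis 0
via_dot = np.dot(matrices.T, weights).T                  # dot over the stack axis

print(via_tensordot)                      # 2x3 array, every entry 5.0
print(np.allclose(via_tensordot, via_dot))
```

Both contract the stack axis against the weight vector; tensordot states the contraction explicitly, while the dot spelling relies on transposes to line the axes up.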

Re: [Numpy-discussion] Help in speeding up accumulation in a matrix

2011-01-30 Thread Nicolas SCHEFFER
Thalhammer wrote: > > Am 29.1.2011 um 22:01 schrieb Nicolas SCHEFFER: > >> Hi all, >> >> First email to the list for me, I just want to say how grateful I am >> to have python+numpy+ipython etc... for my day to day needs. Great >> combination of software.

Re: [Numpy-discussion] Help in speeding up accumulation in a matrix

2011-01-29 Thread Nicolas SCHEFFER
Thanks for the prompt reply! I quickly tried that and it actually helps compared to the full vectorized version. Depending on the dimensions, the chunk size has to be tuned (typically 100 or so). But I don't get any improvement w.r.t. the simple for loop (I can almost match the time though). My gue
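The exact accumulation in this thread is cut off above, so as a generic stand-in (my construction, and note np.add.at postdates this 2011 thread), here is the usual loop-vs-vectorized comparison for accumulating rows into a matrix at repeated indices:

```python
import numpy as np

rng = np.random.default_rng(0)
idx = rng.integers(0, 10, size=1000)       # target row for each update (repeats allowed)
updates = rng.standard_normal((1000, 4))

# Plain Python loop: correct but slow for large inputs.
acc_loop = np.zeros((10, 4))
for i, row in zip(idx, updates):
    acc_loop[i] += row

# Unbuffered vectorized accumulation; handles repeated indices correctly.
# (acc[idx] += updates would NOT, because fancy-index assignment is buffered.)
acc_ufunc = np.zeros((10, 4))
np.add.at(acc_ufunc, idx, updates)

print(np.allclose(acc_loop, acc_ufunc))
```

The buffering pitfall is often why a "fully vectorized" attempt either gives wrong answers or, once worked around with chunking, fails to beat the simple loop.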

[Numpy-discussion] Help in speeding up accumulation in a matrix

2011-01-29 Thread Nicolas SCHEFFER
Hi all, First email to the list for me, I just want to say how grateful I am to have python+numpy+ipython etc... for my day to day needs. Great combination of software. Anyway, I've been having this bottleneck in one of my algorithms that has been bugging me for quite a while. The objective is to sp