Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-22 Thread Gael Varoquaux
On Sun, Nov 21, 2010 at 09:03:22PM -0500, Wes McKinney wrote: > Maybe let's have the next thread on SciPy-user-- I think what we're > talking about is general enough to be discussed there. Yes, a lot of this is of general interest. I'd be particularly interested in having the NaN work land in sci

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Wes McKinney
On Sun, Nov 21, 2010 at 6:37 PM, Keith Goodman wrote: > On Sun, Nov 21, 2010 at 3:16 PM, Wes McKinney wrote: > >> What would you say to a single package that contains: >> >> - NaN-aware NumPy and SciPy functions (nanmean, nanmin, etc.) > > I'd say yes. > >> - moving window functions (moving_{coun

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Keith Goodman
On Sun, Nov 21, 2010 at 3:16 PM, Wes McKinney wrote: > What would you say to a single package that contains: > > - NaN-aware NumPy and SciPy functions (nanmean, nanmin, etc.) I'd say yes. > - moving window functions (moving_{count, sum, mean, var, std, etc.}) Yes. BTW, we both do arr=arr.asty

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Wes McKinney
On Sun, Nov 21, 2010 at 6:02 PM, wrote: > On Sun, Nov 21, 2010 at 5:09 PM, Keith Goodman wrote: >> On Sun, Nov 21, 2010 at 12:30 PM,   wrote: >>> On Sun, Nov 21, 2010 at 2:48 PM, Keith Goodman wrote: On Sun, Nov 21, 2010 at 10:25 AM, Wes McKinney wrote: > On Sat, Nov 20, 2010 at 7:24

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread josef . pktd
On Sun, Nov 21, 2010 at 5:09 PM, Keith Goodman wrote: > On Sun, Nov 21, 2010 at 12:30 PM,   wrote: >> On Sun, Nov 21, 2010 at 2:48 PM, Keith Goodman wrote: >>> On Sun, Nov 21, 2010 at 10:25 AM, Wes McKinney wrote: On Sat, Nov 20, 2010 at 7:24 PM, Keith Goodman wrote: > On Sat, Nov 20,

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Keith Goodman
On Sun, Nov 21, 2010 at 12:30 PM, wrote: > On Sun, Nov 21, 2010 at 2:48 PM, Keith Goodman wrote: >> On Sun, Nov 21, 2010 at 10:25 AM, Wes McKinney wrote: >>> On Sat, Nov 20, 2010 at 7:24 PM, Keith Goodman wrote: On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney wrote: > Keith (and o

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Wes McKinney
On Sat, Nov 20, 2010 at 7:24 PM, Keith Goodman wrote: > On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney wrote: > >> Keith (and others), >> >> What would you think about creating a library of mostly Cython-based >> "domain specific functions"? So stuff like rolling statistical >> moments, nan* funct

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread josef . pktd
On Sun, Nov 21, 2010 at 2:48 PM, Keith Goodman wrote: > On Sun, Nov 21, 2010 at 10:25 AM, Wes McKinney wrote: >> On Sat, Nov 20, 2010 at 7:24 PM, Keith Goodman wrote: >>> On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney wrote: >>> Keith (and others), What would you think about creat

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Keith Goodman
On Sun, Nov 21, 2010 at 10:25 AM, Wes McKinney wrote: > On Sat, Nov 20, 2010 at 7:24 PM, Keith Goodman wrote: >> On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney wrote: >> >>> Keith (and others), >>> >>> What would you think about creating a library of mostly Cython-based >>> "domain specific funct

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-20 Thread Charles R Harris
On Sat, Nov 20, 2010 at 4:39 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 7:42 PM, Keith Goodman > wrote: > > I should make a benchmark suite. > > >> ny.benchit(verbose=False) > Nanny performance benchmark >Nanny 0.0.1dev >Numpy 1.4.1 >Speed is numpy time divided by nanny time

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-20 Thread Keith Goodman
On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney wrote: > Keith (and others), > > What would you think about creating a library of mostly Cython-based > "domain specific functions"? So stuff like rolling statistical > moments, nan* functions like you have here, and all that-- NumPy-array > only func

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-20 Thread Wes McKinney
On Sat, Nov 20, 2010 at 6:54 PM, Wes McKinney wrote: > On Sat, Nov 20, 2010 at 6:39 PM, Keith Goodman wrote: >> On Fri, Nov 19, 2010 at 7:42 PM, Keith Goodman wrote: >>> I should make a benchmark suite. >> ny.benchit(verbose=False) >> Nanny performance benchmark >>    Nanny 0.0.1dev >>    N

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-20 Thread Wes McKinney
On Sat, Nov 20, 2010 at 6:39 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 7:42 PM, Keith Goodman wrote: >> I should make a benchmark suite. > >>> ny.benchit(verbose=False) > Nanny performance benchmark >    Nanny 0.0.1dev >    Numpy 1.4.1 >    Speed is numpy time divided by nanny time >    

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-20 Thread Keith Goodman
On Fri, Nov 19, 2010 at 7:42 PM, Keith Goodman wrote:
> I should make a benchmark suite.

>> ny.benchit(verbose=False)
Nanny performance benchmark
    Nanny 0.0.1dev
    Numpy 1.4.1
    Speed is numpy time divided by nanny time
    NaN means all NaNs

   Speed   Test   Shape   dty
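The "Speed is numpy time divided by nanny time" convention can be illustrated with a small timing harness. This is only a sketch under stated assumptions: it is not Nanny's `benchit`, the names `nansum_reference`, `nansum_candidate`, and `speed_ratio` are hypothetical, and both functions are pure Python stand-ins rather than NumPy or Cython code.

```python
import math
import timeit


def nansum_reference(values):
    """Baseline (hypothetical): build a filtered list, then sum it."""
    return sum([v for v in values if not math.isnan(v)])


def nansum_candidate(values):
    """Candidate (hypothetical): single pass, no intermediate list."""
    total = 0.0
    for v in values:
        if v == v:  # NaN is the only float that is not equal to itself
            total += v
    return total


def speed_ratio(reference, candidate, data, number=20):
    """Reference time divided by candidate time; above 1 means the candidate is faster."""
    t_ref = min(timeit.repeat(lambda: reference(data), number=number, repeat=3))
    t_new = min(timeit.repeat(lambda: candidate(data), number=number, repeat=3))
    return t_ref / t_new


# 1000 floats with every tenth element replaced by NaN.
data = [float(i) for i in range(1000)]
data[::10] = [math.nan] * 100

print("speed: %.2f" % speed_ratio(nansum_reference, nansum_candidate, data))
```

Taking the minimum over several repeats reduces noise from other processes; the ratio itself depends on the machine and the input size, so no particular value should be expected.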

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 8:33 PM, wrote:
> -np.inf>-np.inf
> False
>
> If the only value is -np.inf, you will return nan, I guess.
>
> np.nanmax([-np.inf, np.nan])
> -inf

That's a great corner case. Thanks, Josef.

This looks like it would fix it: change

    if ai > amax:
        amax = ai

to

    i
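The corner case can be reproduced in pure Python. The sketch below assumes the running maximum starts at -inf, as asked about elsewhere in the thread. The message above is cut off before the replacement code, so the `>=` comparison used in `nanmax_inclusive` is an assumption about one way to handle the case, not necessarily the change that was proposed. Both function names are hypothetical.

```python
import math


def nanmax_strict(values):
    """Running max with a strict comparison (hypothetical helper)."""
    amax = -math.inf
    allnan = True
    for ai in values:
        if ai > amax:  # -inf > -inf is False, so -inf never clears allnan
            amax = ai
            allnan = False
    return math.nan if allnan else amax


def nanmax_inclusive(values):
    """Same loop with >= so that -inf counts as a real value (assumed fix)."""
    amax = -math.inf
    allnan = True
    for ai in values:
        if ai >= amax:  # still False for NaN, but True for -inf
            amax = ai
            allnan = False
    return math.nan if allnan else amax


print(nanmax_strict([-math.inf, math.nan]))     # nan
print(nanmax_inclusive([-math.inf, math.nan]))  # -inf
print(nanmax_inclusive([math.nan, math.nan]))   # nan
```

The inclusive comparison keeps the all-NaN behaviour intact because `nan >= x` is False for every `x`.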

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread josef . pktd
On Fri, Nov 19, 2010 at 10:59 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 7:51 PM,   wrote: >> >> does this give you the correct answer? >> > 1>np.nan >> False >> >> What's the starting value for amax? -inf? > > Because "1 > np.nan" is False, the current running max does not get > updat

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 8:05 PM, Charles R Harris wrote:
> This doesn't look right:
>
> @cython.boundscheck(False)
> @cython.wraparound(False)
> def nanmax_2d_float64_axisNone(np.ndarray[np.float64_t, ndim=2] a):
>     "nanmax of 2d numpy array with dtype=np.float64 along axis=None."
>     cdef P

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Charles R Harris
On Fri, Nov 19, 2010 at 8:42 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 7:19 PM, Charles R Harris > wrote: > > > > > > On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman > wrote: > >> > >> On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman > >> wrote: > >> > On Fri, Nov 19, 2010 at 12:19 PM,

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 7:51 PM, wrote:
>
> does this give you the correct answer?
>
> 1>np.nan
> False
>
> What's the starting value for amax? -inf?

Because "1 > np.nan" is False, the current running max does not get updated, which is what we want.

>> import nanny as ny
>> np.nanmax([1, np.

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Nathaniel Smith
On Fri, Nov 19, 2010 at 7:51 PM, wrote:
> On Fri, Nov 19, 2010 at 10:42 PM, Keith Goodman wrote:
>> It basically loops through the data and does:
>>
>> allnan = 1
>> ai = ai[i,k]
>> if ai > amax:
>>    amax = ai
>>    allnan = 0
>
> does this give you the correct answer?
>
> 1>np.nan
> False
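The quoted loop can be written out as a runnable pure Python sketch. It assumes a 2D nested list in place of a NumPy array and a running maximum that starts at -inf; the function name `nanmax_2d_axis_none` is hypothetical and this is not Nanny's Cython code.

```python
import math


def nanmax_2d_axis_none(a):
    """NaN-ignoring max over every element of a 2D nested list (hypothetical helper)."""
    amax = -math.inf
    allnan = True
    for i in range(len(a)):
        for k in range(len(a[i])):
            ai = a[i][k]
            # Any ordered comparison with NaN is False, so NaN never replaces amax.
            if ai > amax:
                amax = ai
                allnan = False
    return math.nan if allnan else amax


print(1 > math.nan)  # False
print(math.nan > 1)  # False
print(nanmax_2d_axis_none([[1.0, math.nan], [math.nan, 0.5]]))  # 1.0
```

With the strict `>` comparison, an input that contains only -inf and NaN never clears the `allnan` flag, so it is reported as all-NaN.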

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread josef . pktd
On Fri, Nov 19, 2010 at 10:42 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 7:19 PM, Charles R Harris > wrote: >> >> >> On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman wrote: >>> >>> On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman >>> wrote: >>> > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Vir

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 7:19 PM, Charles R Harris wrote: > > > On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman wrote: >> >> On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman >> wrote: >> > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: >> >> Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Charles R Harris
On Fri, Nov 19, 2010 at 8:19 PM, Charles R Harris wrote: > > > On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman wrote: > >> On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman >> wrote: >> > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: >> >> Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman w

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Charles R Harris
On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman > wrote: > > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: > >> Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote: > >> [clip] > >>> My guess is that having separate underlying f

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Charles R Harris
On Fri, Nov 19, 2010 at 3:18 PM, Christopher Barker wrote: > On 11/19/10 11:19 AM, Keith Goodman wrote: > > On Fri, Nov 19, 2010 at 10:55 AM, Nathaniel Smith wrote: > >> Why not make this a patch to numpy/scipy instead? > > > > My guess is that having separate underlying functions for each dtype,

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Christopher Barker
On 11/19/10 11:19 AM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 10:55 AM, Nathaniel Smith wrote: >> Why not make this a patch to numpy/scipy instead? > > My guess is that having separate underlying functions for each dtype, > ndim, and axis would be a nightmare for a large project like Numpy.

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: >> Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote: >> [clip] >>> My guess is that having separate underlying functions for each dtype, >>> ndim, and axis would be a nightmare for

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: > Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote: > [clip] >> My guess is that having separate underlying functions for each dtype, >> ndim, and axis would be a nightmare for a large project like Numpy. But >> manageable for a focused p

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 12:10 PM, wrote: > What's the speed advantage of nanny compared to np.nansum that you > have if the arrays are larger, say (1000,10) or (1,100) axis=0 ? Good point. In the small examples I showed so far maybe the speed up was all in overhead. Fortunately, that's not

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Pauli Virtanen
Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote: [clip] > My guess is that having separate underlying functions for each dtype, > ndim, and axis would be a nightmare for a large project like Numpy. But > manageable for a focused project like nanny. Might be easier to migrate the nan* function
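The idea of a separate underlying function for each dtype, ndim, and axis can be sketched as a dispatch table keyed on those three values. This is a pure Python illustration under stated assumptions: nested lists stand in for arrays, the element type stands in for the dtype, and every name here (`DISPATCH`, `nansum`, and the specialised functions) is hypothetical rather than Nanny's actual layout.

```python
import math


def nansum_1d_float_axis0(a):
    return sum(x for x in a if not math.isnan(x))


def nansum_1d_int_axis0(a):
    return sum(a)  # integers cannot hold NaN, so no check is needed


def nansum_2d_float_axis0(a):
    # Sum down each column.
    return [nansum_1d_float_axis0(col) for col in zip(*a)]


def nansum_2d_float_axis1(a):
    # Sum along each row.
    return [nansum_1d_float_axis0(row) for row in a]


# One specialised function per (ndim, element type, axis) combination.
DISPATCH = {
    (1, float, 0): nansum_1d_float_axis0,
    (1, int, 0): nansum_1d_int_axis0,
    (2, float, 0): nansum_2d_float_axis0,
    (2, float, 1): nansum_2d_float_axis1,
}


def nansum(a, axis=0):
    """Pick the specialised function for this input, or fail loudly."""
    if not a:
        raise ValueError("empty input is not handled in this sketch")
    ndim = 2 if isinstance(a[0], list) else 1
    first = a[0][0] if ndim == 2 else a[0]
    key = (ndim, type(first), axis)
    if key not in DISPATCH:
        raise TypeError("no specialised function for %r" % (key,))
    return DISPATCH[key](a)


print(nansum([1.0, math.nan, 2.0]))                       # 3.0
print(nansum([[1.0, math.nan], [2.0, 3.0]], axis=0))      # [3.0, 3.0]
print(nansum([[1.0, math.nan], [2.0, 3.0]], axis=1))      # [1.0, 5.0]
```

The table grows with the product of supported dtypes, dimensions, and axes, which is the maintenance cost the message above is weighing.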

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread josef . pktd
On Fri, Nov 19, 2010 at 2:35 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 11:12 AM, Benjamin Root wrote: > >> That's why I use masked arrays.  It is dtype agnostic. >> >> I am curious if there are any lessons that were learned in making Nanny that >> could be applied to the masked array fun

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 11:12 AM, Benjamin Root wrote: > That's why I use masked arrays.  It is dtype agnostic. > > I am curious if there are any lessons that were learned in making Nanny that > could be applied to the masked array functions? I suppose you could write a cython function that oper

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 10:55 AM, Nathaniel Smith wrote: > On Fri, Nov 19, 2010 at 10:33 AM, Keith Goodman wrote: >> Nanny uses the magic of Cython to give you a faster, drop-in replacement for >> the NaN functions in NumPy and SciPy. > > Neat! > > Why not make this a patch to numpy/scipy instead

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Benjamin Root
On Fri, Nov 19, 2010 at 12:55 PM, Nathaniel Smith wrote: > On Fri, Nov 19, 2010 at 10:33 AM, Keith Goodman > wrote: > > Nanny uses the magic of Cython to give you a faster, drop-in replacement > for > > the NaN functions in NumPy and SciPy. > > Neat! > > Why not make this a patch to numpy/scipy

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Nathaniel Smith
On Fri, Nov 19, 2010 at 10:33 AM, Keith Goodman wrote: > Nanny uses the magic of Cython to give you a faster, drop-in replacement for > the NaN functions in NumPy and SciPy. Neat! Why not make this a patch to numpy/scipy instead? > Nanny uses a separate Cython function for each combination of n

[Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
=====
Nanny
=====

Nanny uses the magic of Cython to give you a faster, drop-in replacement for the NaN functions in NumPy and SciPy.

For example::

    >> import nanny as ny
    >> import numpy as np
    >> arr = np.random.rand(100, 100)
    >> timeit np.nansum(arr)
    1 loops, best of 3: 6