Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 1:51 PM, Lluís wrote: > Mark Wiebe writes: > [...] > > I think that deciding on the value of NA signal values boils down to > > this question: should 3rd party code be able to interpret missing > data > > information stored in the separate mask array? > > > I'm

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Lluís
Mark Wiebe writes: [...] > I think that deciding on the value of NA signal values boils down to > this question: should 3rd party code be able to interpret missing data > information stored in the separate mask array? > I'm tossing around some variations of ideas using the iterator

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Charles R Harris
On Wed, Jun 29, 2011 at 11:53 AM, Mark Wiebe wrote: > On Tue, Jun 28, 2011 at 7:34 AM, Lluís wrote: > >> Mark Wiebe writes: >> > The design that's forming is a combination of: >> >> > * Solve the missing data problem >> > * My ideas of what a good solution looks like: >> >* applies to all Nu

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 7:34 AM, Lluís wrote: > Mark Wiebe writes: > > The design that's forming is a combination of: > > > * Solve the missing data problem > > * My ideas of what a good solution looks like: > >* applies to all NumPy dtypes in a fully general way > >* high-performance, lo

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-28 Thread Lluís
Charles R Harris writes: > I think we may need some standard format for masked data on disk if we > don't go the NA value route. As I see it, the mask array is just some metadata that is attached to the dtype descriptor. I don't know how an ndarray is (un)pickled from disk, but I imagine that eac

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-28 Thread Lluís
Mark Wiebe writes: > The design that's forming is a combination of: > * Solve the missing data problem  > * My ideas of what a good solution looks like: >    * applies to all NumPy dtypes in a fully general way >    * high-performance, low overhead where possible >    * makes the C-level implement

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 3:25 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > On Sat, Jun 25, 2011 at 03:16:39PM -0500, Mark Wiebe wrote: > >This is why I'm also proposing to add a 'mask=' parameter to ufuncs, > for > >example, to expose the implementation details of the masked

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 9:44 AM, Wes McKinney wrote: > On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris > wrote: > > On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney > wrote: > >> > >> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Fri, Jun 24, 2011 at 1

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris wrote: > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote: > >> This thread is getting quite long, innit ? >> And I think it's getting a tad confusing, because we're mixing two >> different concepts: missing values and masks. >> There should be s

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Gael Varoquaux
On Sat, Jun 25, 2011 at 03:16:39PM -0500, Mark Wiebe wrote: >This is why I'm also proposing to add a 'mask=' parameter to ufuncs, for >example, to expose the implementation details of the masked array system >to people who need masks but need them to be a bit different. There may be >

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 7:00 AM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote: > > I'm personally worried that the memory overhead of array.masks will > > make many of us tend to avoid them. I work with images that can > >

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 9:14 AM, Wes McKinney wrote: > On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris > wrote: > > > > > > On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney > wrote: > >> > >> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith > wrote: > >> > On Fri, Jun 24, 2011 at 6:57 PM, Be

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 6:29 AM, Pierre GM wrote: > This thread is getting quite long, innit ? > It's tiring, yeah! > And I think it's getting a tad confusing, because we're mixing two > different concepts: missing values and masks. > There should be support for missing values in numpy.core, I

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 6:17 AM, Matthew Brett wrote: > Hi, > > On Sat, Jun 25, 2011 at 2:10 AM, Mark Wiebe wrote: > > On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney > >> wrote: > >> ... > >> > Perhaps we should m

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 6:00 AM, Matthew Brett wrote: > Hi, > > On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe wrote: > > On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett > ... > >> @Mark - I don't have a clear idea whether you consider the nafloat64 > >> option to be still in play as the first thing

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:06 PM, Wes McKinney wrote: > On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote: > > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote: > >> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote: > >>> This is a situation where I would just... use an array a

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 10:59 PM, Nathaniel Smith wrote: > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote: > > On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote: > >> This is a situation where I would just... use an array and a mask, > >> rather than a masked array. Then lots of thin

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Benjamin Root
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris wrote: > > I think he aims to support both. One complication with masks is keeping > them tied to the data on disk. With na values one file can contain both the > data and the missing data markers, whereas with masks, two files would be > required

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:25 PM, Benjamin Root wrote: > On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe wrote: > >> On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wrote: >> >>> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris >>> wrote: >>> > >>> > >>> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Br

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote: > On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root wrote: > > Another example of how we use masks in matplotlib is in pcolor(). We > have > > to combine the possible masks of X, Y, and V in both the x and y > directions > > to find the final

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:27 PM, Charles R Harris wrote: > > > On Sat, Jun 25, 2011 at 6:00 AM, Gael Varoquaux > wrote: >> >> On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote: >> > I'm personally worried that the memory overhead of array.masks will >> > make many of us tend to a

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Olivier Delalleau
2011/6/25 Charles R Harris > I think what we really need to see are the use cases and work flow. The > ones that hadn't occurred to me before were memory mapped files and data > stored on disk in general. I think we may need some standard format for > masked data on disk if we don't go the NA val

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 4:05 PM, Charles R Harris wrote: > > > On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris >> wrote: >> > >> > >> > On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett >> > wrote: >> >> >> >> Hi, >> >>

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:44 PM, Wes McKinney wrote: ... > Here are some things I can think of that would be affected by any changes here > > 1) Right now users of pandas can type pandas.isnull(series[5]) and > that will yield True if the value is NA for any dtype. This might be > hard to sup

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett wrote: > Hi, > > On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris > wrote: > > > > > > On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris > >> wrote: > >> > > >> > >

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:44 AM, Wes McKinney wrote: > On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris > wrote: > > > > > > On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney > wrote: > >> > >> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Fri, Jun 24, 20

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Gael Varoquaux
On Sat, Jun 25, 2011 at 08:27:57AM -0600, Charles R Harris wrote: >Could you expand a bit on what sort of data you have and how you deal with >it. Where does it come from, how is it stored on disk, what do you do with >it? That sort of thing. 3D and 4D images. Mostly stored on disk in

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris wrote: > > > On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris >> wrote: >> > >> > >> > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote: >> >> >> >> This thread is get

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett wrote: > Hi, > > On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris > wrote: > > > > > > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote: > >> > >> This thread is getting quite long, innit ? > >> And I think it's getting a tad confusing, because we'

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Wes McKinney
On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris wrote: > > > On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wrote: >> >> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney >> > wrote: >> >> >> >> On Fri, Jun 24, 2011 at 1

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:14 PM, Wes McKinney wrote: ... > I hope you're right. So far it seems that anyone who has spent real > time with R (e.g. myself, Nathaniel) has expressed serious concerns > about the masked approach. I'm sorry - I have been distracted. For my sake, and because this

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris wrote: > > > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote: >> >> This thread is getting quite long, innit ? >> And I think it's getting a tad confusing, because we're mixing two >> different concepts: missing values and masks. >> There sh

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 6:00 AM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote: > > I'm personally worried that the memory overhead of array.masks will > > make many of us tend to avoid them. I work with images that can > >

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wrote: > On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris > wrote: > > > > > > On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney > wrote: > >> > >> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith > wrote: > >> > On Fri, Jun 24, 2011 at 6:57 PM, Be

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote: > This thread is getting quite long, innit ? > And I think it's getting a tad confusing, because we're mixing two > different concepts: missing values and masks. > There should be support for missing values in numpy.core, I think we all > agree on

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Wes McKinney
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris wrote: > > > On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wrote: >> >> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote: >> > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote: >> >> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Gael Varoquaux
On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote: > I'm personally worried that the memory overhead of array.masks will > make many of us tend to avoid them. I work with images that can > easily get large enough that I would not want an array-items size byte > array added to my storag

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Pierre GM
This thread is getting quite long, innit ? And I think it's getting a tad confusing, because we're mixing two different concepts: missing values and masks. There should be support for missing values in numpy.core, I think we all agree on that. * What's been suggested of adding new dtypes (naflo

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 2:10 AM, Mark Wiebe wrote: > On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney >> wrote: >> ... >> > Perhaps we should make a wiki page someplace summarizing pros and cons >> > of the various implem

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe wrote: > On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett ... >> @Mark - I don't have a clear idea whether you consider the nafloat64 >> option to be still in play as the first thing to be implemented >> (before array.mask).   If it is, what kind of

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Laurent Gautier
On 2011-06-24 17:30, Robert Kern wrote: > On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote: >> > On 2011-06-24 16:43, Robert Kern wrote: >>> >> >>> >> On Fri, Jun 24, 2011 at 09:33, Charles R Harris >>> >>wrote: >>> > >>> > > >>> > On Fri, Jun 24, 2011 at 8:06 AM, Robe

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wrote: > On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote: > > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote: > >> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote: > >>> This is a situation where I would just... use an array a

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote: > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote: >> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote: >>> This is a situation where I would just... use an array and a mask, >>> rather than a masked array. Then lots of things -

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote: > On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote: >> This is a situation where I would just... use an array and a mask, >> rather than a masked array. Then lots of things -- changing fill >> values, temporarily masking/unmasking things

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Benjamin Root
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote: > On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root wrote: > > Another example of how we use masks in matplotlib is in pcolor(). We > have > > to combine the possible masks of X, Y, and V in both the x and y > directions > > to find the final

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Benjamin Root
On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe wrote: > On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wrote: > >> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett > > >> > wrote: >> >> >> >> Hi, >> >> >> >> On Fri, Jun 24, 2

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root wrote: > Another example of how we use masks in matplotlib is in pcolor().  We have > to combine the possible masks of X, Y, and V in both the x and y directions > to find the final mask to use for the final output result (because each > facet needs v

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett wrote: > Hi, > > On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney > wrote: > ... > > Perhaps we should make a wiki page someplace summarizing pros and cons > > of the various implementation approaches? > > But - we should do this if it really is an ope

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wrote: > On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris > wrote: > > > > > > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root > wrote: > >> ... > >> > Again, there

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 6:10 PM, Charles R Harris wrote: > > > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett wrote: > >> Hi, >> >> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote: >> ... >> > Again, there are pros and cons either way and I see them very orthogonal >> and >> > complementar

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett wrote: > Hi, > > On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote: > ... > > Again, there are pros and cons either way and I see them very orthogonal > and > > complementary. > > That may be true, but I imagine only one of them will be implement

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 4:24 PM, Nathaniel Smith wrote: > On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe wrote: > > For the maybe dtype, it would need to gain access to the ufunc loop of > the > > underlying dtype, and call it appropriately during the inner loop. This > > appears to require some m

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 4:09 PM, Benjamin Root wrote: > > > On Fri, Jun 24, 2011 at 10:40 AM, Mark Wiebe wrote: > >> On Thu, Jun 23, 2011 at 7:56 PM, Benjamin Root wrote: >> >>> On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote: >>> Sorry y'all, I'm just commenting bits by bits:

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 6:11 PM, Wes McKinney wrote: > On Fri, Jun 24, 2011 at 8:02 PM, Charles R Harris > wrote: > > > > > > On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney > wrote: > >> > >> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Fri, Jun 24, 2011

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 3:38 PM, Lluís wrote: > Mark Wiebe writes: > > > It should also be possible to accomplish a general > > solution at the dtype level. We could have a 'dtype > > factory' used like: np.zeros(10, dtype=np.maybe(float)) > > wh

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 8:02 PM, Charles R Harris wrote: > > > On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney wrote: >> >> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett >> > wrote: >> >> >> >> Hi, >> >> >> >> On Fri, Jun

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney wrote: > On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris > wrote: > > > > > > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root > wrote: > >> ... > >> > Again, there

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wrote: ... > Perhaps we should make a wiki page someplace summarizing pros and cons > of the various implementation approaches? But - we should do this if it really is an open question which one we go for. If not then, we're just slowing Mark

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris wrote: > > > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote: >> ... >> > Again, there are pros and cons either way and I see them very orthogonal >> > and >> > complem

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett wrote: > Hi, > > On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote: > ... > > Again, there are pros and cons either way and I see them very orthogonal > and > > complementary. > > That may be true, but I imagine only one of them will be implement

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Gael Varoquaux
On Thu, Jun 23, 2011 at 07:51:25PM -0400, josef.p...@gmail.com wrote: > From the perspective of statistical analysis, I don't see much > advantage of this. What to do with nans depends on the analysis, and > needs to be looked at for each case. From someone who actually sometimes does statistics

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote: ... > Again, there are pros and cons either way and I see them very orthogonal and > complementary. That may be true, but I imagine only one of them will be implemented. @Mark - I don't have a clear idea whether you consider the nafloat

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe wrote: > For the maybe dtype, it would need to gain access to the ufunc loop of the > underlying dtype, and call it appropriately during the inner loop. This > appears to require some more invasive upheaval within the ufunc code than > the masking appro

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Benjamin Root
On Fri, Jun 24, 2011 at 10:40 AM, Mark Wiebe wrote: > On Thu, Jun 23, 2011 at 7:56 PM, Benjamin Root wrote: > >> On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote: >> >>> Sorry y'all, I'm just commenting bits by bits: >>> >>> "One key problem is a lack of orthogonality with other features, for >

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Lluís
Mark Wiebe writes: > It should also be possible to accomplish a general > solution at the dtype level. We could have a 'dtype > factory' used like: np.zeros(10, dtype=np.maybe(float)) > where np.maybe(x) returns a new dtype whose storage size >

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 1:18 PM, Matthew Brett wrote: > Hi, > > On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe wrote: > > On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith wrote: > ... > >> and the fact that 'missing_

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 1:04 PM, Matthew Brett wrote: > Hi, > > Just as a use case, if I do this: > > a = np.zeros((big_number,), dtype=np.int32) > a[0,0] = np.NA > > I think I'm right in saying that, with the array.mask implementation > my array memory usage will grow by new big_number bytes, whe

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 12:06 PM, Wes McKinney wrote: > On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe wrote: > > On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote: > >> > >> On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe wrote: > >> > On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith > wrote:

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:54 AM, Nathaniel Smith wrote: > On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe wrote: > > On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote: > >> But on the other hand, we gain: > >> -- simpler implementation: no need to be checking and tracking the > >> mask buffe

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe wrote: > On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith wrote: ... >> and the fact that 'missing_value' could be any type would make the >> code more complicated than the cu

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:25 AM, Christopher Barker wrote: > Robert Kern wrote: > > > It's worth noting that this is not a replacement for masked arrays, > > nor is it intended to be the be-all, end-all solution to missing data > > problems. It's mostly just intended to be a focused tool to fill

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:25 AM, Robert Kern wrote: > On Fri, Jun 24, 2011 at 11:13, Christopher Barker > wrote: > > Nathaniel Smith wrote: > > > >> If we think that the memory overhead for floating point types is too > >> high, it would be easy to add a special case where maybe(float) used a >

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:13 AM, Christopher Barker wrote: > Nathaniel Smith wrote: > >> The 'dtype factory' idea builds on the way I've structured datetime as a > >> parameterized type, > > ... > > Another disadvantage is that we get further from Gael Varoquaux's point: > >> Right now, the nump

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, Just as a use case, if I do this: a = np.zeros((big_number,), dtype=np.int32) a[0,0] = np.NA I think I'm right in saying that, with the array.mask implementation my array memory usage will grow by new big_number bytes, whereas with the np.naint32 implementation you'd get something like: Err

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 10:07 AM, Matthew Brett wrote: > Hi, > > On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern > wrote: > > On Fri, Jun 24, 2011 at 09:33, Charles R Harris > > wrote: > >> > >> On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern > wrote: > > > >>> The alternative proposal would be to ad

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 10:02 AM, Pierre GM wrote: > On Jun 24, 2011, at 4:44 PM, Robert Kern wrote: > > > On Fri, Jun 24, 2011 at 09:35, Robert Kern > wrote: > >> On Fri, Jun 24, 2011 at 09:24, Keith Goodman > wrote: > >>> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern > wrote: > >>> > The

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 9:27 AM, Bruce Southey wrote: > On 06/24/2011 09:06 AM, Robert Kern wrote: > > On Fri, Jun 24, 2011 at 07:30, Laurent Gautier > wrote: > > On 2011-06-24 13:59, Nathaniel Smith > wrote: > > On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root > wrote: > > Lastly

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe wrote: > On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote: >> >> On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe wrote: >> > On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith wrote: >> >> It should also be possible to accomplish a general solution

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:57 AM, Keith Goodman wrote: > On Thu, Jun 23, 2011 at 3:24 PM, Mark Wiebe wrote: > > On Thu, Jun 23, 2011 at 5:05 PM, Keith Goodman > wrote: > >> > >> On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe wrote: > >> > Enthought has asked me to look into the "missing data" prob

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe wrote: > On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote: >> But on the other hand, we gain: >>  -- simpler implementation: no need to be checking and tracking the >> mask buffer everywhere. The needed infrastructure is already built in. > > I do

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:01 AM, Neal Becker wrote: > Just 1 question before I look more closely. What is the cost to the non-MA > user > of this addition? > I'm following the idea that you don't pay for what you don't use. All the existing stuff will perform the same. -Mark > __

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 7:30 AM, Laurent Gautier wrote: > On 2011-06-24 13:59, Nathaniel Smith wrote: > > On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root wrote: > >> Lastly, I am not entirely familiar with R, so I am also very curious > about > >> what this magical "NA" value is, and how it com

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett wrote: > Hi, > > On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith wrote: > ... > > If we think that the memory overhead for floating point types is too > > high, it would be easy to add a special case where maybe(float) used a > > distinguished NaN i

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote: > On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe wrote: > > On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith wrote: > >> It should also be possible to accomplish a general solution at the > >> dtype level. We could have a 'dtype factory' us

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote: > The alternative proposal would be to add a few new dtypes that are > NA-aware. E.g. an nafloat64 would reserve a particular NaN value > (there are lots of different NaN bit patterns, we'd just reserve one) > that would represent NA. An naint32
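
A minimal sketch of the "reserve one NaN bit pattern" idea for float64, using only existing NumPy calls. The constant `NA_BITS` and the helper `is_na` are hypothetical names chosen for illustration, and the payload value is an arbitrary assumption, not something fixed by the proposal:

```python
import numpy as np

# Hypothetical reserved pattern: a quiet NaN with a nonzero payload.
# The ordinary np.nan is typically 0x7FF8000000000000 (payload zero).
NA_BITS = np.uint64(0x7FF80000000007A2)


def is_na(arr):
    """Return True where the float64 bit pattern equals the reserved one."""
    arr = np.ascontiguousarray(arr, dtype=np.float64)
    return arr.view(np.uint64) == NA_BITS


values = np.array([1.0, np.nan, 0.0])

# Write the reserved pattern through an integer view so the bits are
# stored exactly as given.
values.view(np.uint64)[2] = NA_BITS

print(np.isnan(values))  # both the plain NaN and the reserved NaN are NaN
print(is_na(values))     # only the reserved pattern counts as NA
```

Both entries still look like NaN to `np.isnan`, so telling NA apart from an ordinary NaN means comparing bits, as `is_na` does here.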

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Christopher Barker
Robert Kern wrote: > It's worth noting that this is not a replacement for masked arrays, > nor is it intended to be the be-all, end-all solution to missing data > problems. It's mostly just intended to be a focused tool to fill in > the gaps where masked arrays are less convenient for whatever rea

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 11:13, Christopher Barker wrote: > Nathaniel Smith wrote: > >> If we think that the memory overhead for floating point types is too >> high, it would be easy to add a special case where maybe(float) used a >> distinguished NaN instead of a separate boolean. > > That would  

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Thu, Jun 23, 2011 at 8:00 PM, Pierre GM wrote: > > On Jun 24, 2011, at 2:42 AM, Mark Wiebe wrote: > > > On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote: > > Sorry y'all, I'm just commenting bits by bits: > > > > "One key problem is a lack of orthogonality with other features, for > instance

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Christopher Barker
Nathaniel Smith wrote: >> The 'dtype factory' idea builds on the way I've structured datetime as a >> parameterized type, ... Another disadvantage is that we get further from Gael Varoquaux's point: >> Right now, the numpy array can be seen as an extension of the C >> array, basically a pointer

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 8:30 AM, Robert Kern wrote: > I would suggest following R's lead and letting ((NA==NA) == True) > unlike NaNs. In R, NA and NaN do behave differently with respect to ==, but not the way you're saying: > NA == NA [1] NA > if (NA == NA) 1; Error in if (NA == NA) 1 : missing

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 11:05, Nathaniel Smith wrote: > On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern wrote: >> On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote: >>> May be there is not so much need for reservation over the string NA, when >>> making the distinction between: >>> a- the intern

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern wrote: > On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote: >> May be there is not so much need for reservation over the string NA, when >> making the distinction between: >> a- the internal representation of a "missing string" (what is stored in >>

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Thu, Jun 23, 2011 at 7:56 PM, Benjamin Root wrote: > On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote: > >> Sorry y'all, I'm just commenting bits by bits: >> >> "One key problem is a lack of orthogonality with other features, for >> instance creating a masked array with physical quantities ca

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 10:02, Pierre GM wrote: > > On Jun 24, 2011, at 4:44 PM, Robert Kern wrote: > >> On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote: >>> On Fri, Jun 24, 2011 at 09:24, Keith Goodman wrote: On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote: > The alternative

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote: > On 2011-06-24 16:43, Robert Kern wrote: >> >> On Fri, Jun 24, 2011 at 09:33, Charles R Harris >> wrote: >>> >>> > >>> >  On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern >>> >  wrote: >>  The alternative proposal would be to add a few

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern wrote: > On Fri, Jun 24, 2011 at 09:33, Charles R Harris > wrote: >> >> On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern wrote: > >>> The alternative proposal would be to add a few new dtypes that are >>> NA-aware. E.g. an nafloat64 would reserve a p

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Laurent Gautier
On 2011-06-24 16:43, Robert Kern wrote: > On Fri, Jun 24, 2011 at 09:33, Charles R Harris > wrote: >> > >> > On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern >> > wrote: >>> >> The alternative proposal would be to add a few new dtypes that are >>> >> NA-aware. E.g. an nafloat64 would reserve a

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Pierre GM
On Jun 24, 2011, at 4:44 PM, Robert Kern wrote: > On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote: >> On Fri, Jun 24, 2011 at 09:24, Keith Goodman wrote: >>> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote: >>> The alternative proposal would be to add a few new dtypes that are N

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 8:44 AM, Robert Kern wrote: > On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote: > > On Fri, Jun 24, 2011 at 09:24, Keith Goodman > wrote: > >> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern > wrote: > >> > >>> The alternative proposal would be to add a few new dtypes that

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote: > On Fri, Jun 24, 2011 at 09:24, Keith Goodman wrote: >> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote: >> >>> The alternative proposal would be to add a few new dtypes that are >>> NA-aware. E.g. an nafloat64 would reserve a particular NaN
