Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Nathaniel Smith
On Thu, Jun 30, 2011 at 12:27 PM, Eric Firing wrote: > On 06/30/2011 08:53 AM, Nathaniel Smith wrote: >> On Wed, Jun 29, 2011 at 2:21 PM, Eric Firing  wrote: >>> In addition, for new code, the full-blown masked array module may not be >>> needed.  A convenience it adds, however, is the automatic m

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Eric Firing
On 06/30/2011 08:53 AM, Nathaniel Smith wrote: > On Wed, Jun 29, 2011 at 2:21 PM, Eric Firing wrote: >> In addition, for new code, the full-blown masked array module may not be >> needed. A convenience it adds, however, is the automatic masking of >> invalid values: >> >> In [1]: np.ma.log(-1) >>

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 2:21 PM, Eric Firing wrote: > In addition, for new code, the full-blown masked array module may not be > needed.  A convenience it adds, however, is the automatic masking of > invalid values: > > In [1]: np.ma.log(-1) > Out[1]: masked > > I'm sure this horrifies some, but t

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 11:54 AM, Lluís wrote: > Mark Wiebe writes: > > Why is one "magic" and the other "real"? All of this is already > > sitting on 100 layers of abstraction above electrons and atoms. If > > we're talking about "real," maybe we should be programming in machine > > code or usin

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 11:42 AM, Matthew Brett wrote: > Hi, > > On Thu, Jun 30, 2011 at 5:13 PM, Mark Wiebe wrote: > > On Thu, Jun 30, 2011 at 11:04 AM, Gary Strangman > > wrote: > >> > >>> Clearly there are some overlaps between what masked arrays are > >>> trying to achieve and what

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Lluís
Mark Wiebe writes: > Why is one "magic" and the other "real"? All of this is already > sitting on 100 layers of abstraction above electrons and atoms. If > we're talking about "real," maybe we should be programming in machine > code or using breadboards with individual transistors. M-x butterfly R

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Matthew Brett
Hi, On Thu, Jun 30, 2011 at 5:13 PM, Mark Wiebe wrote: > On Thu, Jun 30, 2011 at 11:04 AM, Gary Strangman > wrote: >> >>> Clearly there are some overlaps between what masked arrays are >>> trying to achieve and what R's NA mechanisms are trying to achieve. >>> Are they really simi

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 11:04 AM, Gary Strangman wrote: > > Clearly there are some overlaps between what masked arrays are >> trying to achieve and what R's NA mechanisms are trying to achieve. >> Are they really similar enough that they should function using >> the same API?

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Lluís
Mark Wiebe writes: > On Wed, Jun 29, 2011 at 1:20 PM, Lluís wrote: > [...] >> As far as I can tell, the only required difference between them is >> that NA bit patterns must destroy the data. Nothing else. Everything >> on top of that is a choice of API and interface mechanisms. I want >> the

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Gary Strangman
Clearly there are some overlaps between what masked arrays are trying to achieve and what R's NA mechanisms are trying to achieve. Are they really similar enough that they should function using the same API? Yes. And if so, won't that be confusing? No, I don't be

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 1:49 AM, Chris Barker wrote: > On 6/27/11 9:53 AM, Charles R Harris wrote: > > Some discussion of disk storage might also help. I don't see how the > > rules can be enforced if two files are used, one for the mask and > > another for the data, but that may just be somethin

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 5:42 PM, Nathaniel Smith wrote: > On Wed, Jun 29, 2011 at 2:40 PM, Lluís wrote: > > I'm for the option of having a single API when you want to have NA > > elements, regardless of whether it's using masks or bit patterns. > > I understand the desire to avoid having two dif

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 4:21 PM, Eric Firing wrote: > On 06/29/2011 09:32 AM, Matthew Brett wrote: > > Hi, > > > [...] > > > > Clearly there are some overlaps between what masked arrays are trying > > to achieve and what R's NA mechanisms are trying to achieve. Are they > > really similar enough

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 1:20 PM, Lluís wrote: > Mark Wiebe writes: > > > There seems to be a general idea that masks and NA bit patterns imply > > particular differing semantics, something which I think is simply > > false. > > Well, my example contained a difference (the need for the "skipna=Tru

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 2:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > >> > >> Matthew Brett writes: > >> > >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> >> the idea

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 1:07 PM, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote: > On 06/29/2011 07:38 PM, Mark Wiebe wrote: > > On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn > > <d.s.seljeb...@astro.uio.no> wrote: > > > > On 06/29/2011 03:45 PM, Matthew Brett wrote: >

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Chris Barker
On 6/27/11 9:53 AM, Charles R Harris wrote: > Some discussion of disk storage might also help. I don't see how the > rules can be enforced if two files are used, one for the mask and > another for the data, but that may just be something we need to live with. It seems it wouldn't be too big a deal

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 2:40 PM, Lluís wrote: > I'm for the option of having a single API when you want to have NA > elements, regardless of whether it's using masks or bit patterns. I understand the desire to avoid having two different APIs... [snip] > My concern is now about how to set the "sk

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Nathaniel Smith writes: > I know that the part 1 of that proposal would satisfy my needs, but I > don't know as much about your use case, so I'm curious. Would that > proposal (in particular, part 2, the classic masked-array part) work > for you? I'm for the option of having a single API when you

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Eric Firing
On 06/29/2011 09:32 AM, Matthew Brett wrote: > Hi, > [...] > > Clearly there are some overlaps between what masked arrays are trying > to achieve and what R's NA mechanisms are trying to achieve. Are they > really similar enough that they should function using the same API? > And if so, won't that

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 11:20 AM, Lluís wrote: > I completely agree. What I'd suggest is a global and/or per-object > "ndarray.flags.skipna" for people like me that just want to ignore these > entries without caring about setting it on each operation (or the other > way around, depends on the defau

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 9:17 PM, Charles R Harris wrote: > > > On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: >> > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >> >> >> >> Matthew Brett writes: >> >> >> >> >> Mayb

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Charles R Harris
On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > >> > >> Matthew Brett writes: > >> > >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> >> the idea

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 7:20 PM, Lluís wrote: > Mark Wiebe writes: > >> There seems to be a general idea that masks and NA bit patterns imply >> particular differing semantics, something which I think is simply >> false. > > Well, my example contained a difference (the need for the "skipna=Tr

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Oops, On Wed, Jun 29, 2011 at 8:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: >> On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >>> >>> Matthew Brett writes: >>> >>> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >>> >> the ide

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >> >> Matthew Brett writes: >> >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >> >> the idea that the entry is still there, but we're just ignoring it.  Of >> >> co

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Bruce Southey
On 06/29/2011 01:07 PM, Dag Sverre Seljebotn wrote: > On 06/29/2011 07:38 PM, Mark Wiebe wrote: >> On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn >> <d.s.seljeb...@astro.uio.no> wrote: >> >> On 06/29/2011 03:45 PM, Matthew Brett wrote: >> > Hi, >> > >> > On W

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Mark Wiebe writes: > There seems to be a general idea that masks and NA bit patterns imply > particular differing semantics, something which I think is simply > false. Well, my example contained a difference (the need for the "skipna=True" argument) precisely because it seemed that there was some

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Dag Sverre Seljebotn
On 06/29/2011 07:38 PM, Mark Wiebe wrote: > On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn > <d.s.seljeb...@astro.uio.no> wrote: > > On 06/29/2011 03:45 PM, Matthew Brett wrote: > > Hi, > > > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn < d.s.seljeb...@astro.uio.no> wrote: > On 06/29/2011 03:45 PM, Matthew Brett wrote: > > Hi, > > > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > >> On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > >> wrote: > >>> > >>> Hi, > >>> > >>>

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 8:45 AM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > > On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > >> ... > >> > (You might think, wha

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > Matthew Brett writes: > > >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> the idea that the entry is still there, but we're just ignoring it. Of > >> course, that goes against common convention, but it might be easier

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 2:26 AM, Dag Sverre Seljebotn < d.s.seljeb...@astro.uio.no> wrote: > On 06/27/2011 05:55 PM, Mark Wiebe wrote: > > First I'd like to thank everyone for all the feedback you're providing, > > clearly this is an important topic to many people, and the discussion > > has helpe

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Pierre GM
Matthew, Dag, +1. On Jun 29, 2011 4:35 PM, "Dag Sverre Seljebotn" wrote: > On 06/29/2011 03:45 PM, Matthew Brett wrote: >> Hi, >> >> On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: >>> On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett >>> wrote: Hi, On Tue, Jun 28, 2011 at 4:0

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Dag Sverre Seljebotn
On 06/29/2011 03:45 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: >> On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: >>> ... (You might think, what difference does it

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: >> ... >> > (You might think, what difference does it make if you *can* unmask an >> > item? Us missing data

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Matthew Brett writes: >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >> the idea that the entry is still there, but we're just ignoring it.  Of >> course, that goes against common convention, but it might be easier to >> explain. > I think Nathaniel's point is that np.IG

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Dag Sverre Seljebotn
On 06/27/2011 05:55 PM, Mark Wiebe wrote: > First I'd like to thank everyone for all the feedback you're providing, > clearly this is an important topic to many people, and the discussion > has helped clarify the ideas for me. I've renamed and updated the NEP, > then placed it into the master NumPy

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Dag Sverre Seljebotn
On 06/28/2011 11:52 PM, Matthew Brett wrote: > Hi, > > On Tue, Jun 28, 2011 at 5:38 PM, Charles R Harris > wrote: >> Nathaniel, an implementation using masks will look *exactly* like an >> implementation using na-dtypes from the user's point of view. Except that >> taking a masked view of an unma

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 6:57 PM, Pierre GM wrote: > > On Jun 29, 2011, at 1:39 AM, Mark Wiebe wrote: > > > On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > wrote: > > Hi, > > > > On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > > ... > > > (You might think, what difference does it make

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 6:56 PM, Pierre GM wrote: > > On Jun 29, 2011, at 1:37 AM, Mark Wiebe wrote: > > > On Tue, Jun 28, 2011 at 3:45 PM, Pierre GM wrote: > > ... > > > > I think that would really take care of the missing data part in a > consistent and non-ambiguous way. > > However, I unders

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 6:55 PM, Nathaniel Smith wrote: > On Tue, Jun 28, 2011 at 4:37 PM, Mark Wiebe wrote: > > I've nearly finished this parameter, and decided to call it 'where' > instead, > > because it is operating like an SQL where clause. Here if neither a nor b > > are masked array it wi

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Pierre GM
On Jun 29, 2011, at 1:39 AM, Mark Wiebe wrote: > On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > wrote: > Hi, > > On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > ... > > (You might think, what difference does it make if you *can* unmask an > > item? Us missing data folks could just

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Pierre GM
On Jun 29, 2011, at 1:37 AM, Mark Wiebe wrote: > On Tue, Jun 28, 2011 at 3:45 PM, Pierre GM wrote: > ... > > I think that would really take care of the missing data part in a consistent > and non-ambiguous way. > However, I understand that if a choice would be made, this approach would be >

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Nathaniel Smith
On Tue, Jun 28, 2011 at 4:37 PM, Mark Wiebe wrote: > I've nearly finished this parameter, and decided to call it 'where' instead, > because it is operating like an SQL where clause. Here if neither a nor b > are masked array it will only modify those values of b where the 'where' > parameter has t

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 6:00 PM, Matthew Brett wrote: > Hi, > > On Tue, Jun 28, 2011 at 11:40 PM, Jason Grout > wrote: > > On 6/28/11 5:20 PM, Matthew Brett wrote: > >> Hi, > >> > >> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > >> ... > >>> (You might think, what difference does it

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett wrote: > Hi, > > On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > ... > > (You might think, what difference does it make if you *can* unmask an > > item? Us missing data folks could just ignore this feature. But: > > whatever we end up imple

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 3:45 PM, Pierre GM wrote: > All, > I'm not sure I understand some aspects of Mark's new proposal, sorry (blame > the lack of sleep). > I'm pretty excited with the idea of built-in NA like > np.dtype(NA['float64']), provided we can come up with some shortcuts like > np.nafloat

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 2:41 PM, Eric Firing wrote: > On 06/28/2011 07:26 AM, Nathaniel Smith wrote: > > On Tue, Jun 28, 2011 at 9:38 AM, Charles R Harris > > wrote: > >> Nathaniel, an implementation using masks will look *exactly* like an > >> implementation using na-dtypes from the user's poi

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 10:06 AM, Nathaniel Smith wrote: > On Mon, Jun 27, 2011 at 2:03 PM, Mark Wiebe wrote: > > On Mon, Jun 27, 2011 at 12:18 PM, Matthew Brett > > > wrote: > >> You won't get complaints, you'll just lose a group of users, who will, > >> I suspect, stick to NaNs, unsatisfactor

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread eat
Hi, On Wed, Jun 29, 2011 at 1:40 AM, Jason Grout wrote: > On 6/28/11 5:20 PM, Matthew Brett wrote: > > Hi, > > > > On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > > ... > >> (You might think, what difference does it make if you *can* unmask an > >> item? Us missing data folks could jus

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Matthew Brett
Hi, On Tue, Jun 28, 2011 at 11:40 PM, Jason Grout wrote: > On 6/28/11 5:20 PM, Matthew Brett wrote: >> Hi, >> >> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith  wrote: >> ... >>> (You might think, what difference does it make if you *can* unmask an >>> item? Us missing data folks could just ign

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Jason Grout
On 6/28/11 5:20 PM, Matthew Brett wrote: > Hi, > > On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > ... >> (You might think, what difference does it make if you *can* unmask an >> item? Us missing data folks could just ignore this feature. But: >> whatever we end up implementing is someth

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Matthew Brett
Hi, On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: ... > (You might think, what difference does it make if you *can* unmask an > item? Us missing data folks could just ignore this feature. But: > whatever we end up implementing is something that I will have to > explain over and over to

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Matthew Brett
Hi, On Tue, Jun 28, 2011 at 8:41 PM, Eric Firing wrote: > On 06/28/2011 07:26 AM, Nathaniel Smith wrote: >> On Tue, Jun 28, 2011 at 9:38 AM, Charles R Harris >>  wrote: >>> Nathaniel, an implementation using masks will look *exactly* like an >>> implementation using na-dtypes from the user's poi

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Matthew Brett
Hi, On Tue, Jun 28, 2011 at 5:38 PM, Charles R Harris wrote: > Nathaniel, an implementation using masks will look *exactly* like an > implementation using na-dtypes from the user's point of view. Except that > taking a masked view of an unmasked array allows ignoring values without > destroying o

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Pierre GM
All, I'm not sure I understand some aspects of Mark's new proposal, sorry (blame the lack of sleep). I'm pretty excited with the idea of built-in NA like np.dtype(NA['float64']), provided we can come up with some shortcuts like np.nafloat64. I think that would really take care of the missing data p

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Pierre GM
On Jun 28, 2011, at 9:41 PM, Eric Firing wrote: > > One of the real frustrations of the present masked array is that there > is no savez/load support. I could roll my own by using a convention > like saving the mask of xxx as xxx__mask__, and then reversing the > process in a modified load; b

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Nathaniel Smith
On Tue, Jun 28, 2011 at 12:41 PM, Eric Firing wrote: > I think you are exaggerating some of the differences associated with the > implementation, and ignoring one *key* difference: for integer types, > the masked implementation can handle the full numeric range of the type, > while the bit-pattern

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Eric Firing
On 06/28/2011 07:26 AM, Nathaniel Smith wrote: > On Tue, Jun 28, 2011 at 9:38 AM, Charles R Harris > wrote: >> Nathaniel, an implementation using masks will look *exactly* like an >> implementation using na-dtypes from the user's point of view. Except that >> taking a masked view of an unmasked a

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Nathaniel Smith
On Tue, Jun 28, 2011 at 9:38 AM, Charles R Harris wrote: > Nathaniel, an implementation using masks will look *exactly* like an > implementation using na-dtypes from the user's point of view. Except that > taking a masked view of an unmasked array allows ignoring values without > destroying or cop

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Charles R Harris
On Tue, Jun 28, 2011 at 9:06 AM, Nathaniel Smith wrote: > On Mon, Jun 27, 2011 at 2:03 PM, Mark Wiebe wrote: > > On Mon, Jun 27, 2011 at 12:18 PM, Matthew Brett > > > wrote: > >> You won't get complaints, you'll just lose a group of users, who will, > >> I suspect, stick to NaNs, unsatisfactory

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Nathaniel Smith
On Mon, Jun 27, 2011 at 2:03 PM, Mark Wiebe wrote: > On Mon, Jun 27, 2011 at 12:18 PM, Matthew Brett > wrote: >> You won't get complaints, you'll just lose a group of users, who will, >> I suspect, stick to NaNs, unsatisfactory as they are. > > This blade cuts both ways, we'd lose a group of user

Re: [Numpy-discussion] missing data discussion round 2

2011-06-28 Thread Matthew Brett
Hi, On Mon, Jun 27, 2011 at 10:03 PM, Mark Wiebe wrote: > On Mon, Jun 27, 2011 at 12:18 PM, Matthew Brett ... >> That seems like a risky strategy to me, as the most likely outcome is >> that people worried about memory will avoid masked arrays because they >> know they use more memory.  The memo

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Mark Wiebe
On Mon, Jun 27, 2011 at 7:07 PM, Keith Goodman wrote: > On Mon, Jun 27, 2011 at 8:55 AM, Mark Wiebe wrote: > > First I'd like to thank everyone for all the feedback you're providing, > > clearly this is an important topic to many people, and the discussion has > > helped clarify the ideas for me

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Keith Goodman
On Mon, Jun 27, 2011 at 8:55 AM, Mark Wiebe wrote: > First I'd like to thank everyone for all the feedback you're providing, > clearly this is an important topic to many people, and the discussion has > helped clarify the ideas for me. I've renamed and updated the NEP, then > placed it into the ma

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Pierre GM
On Jun 27, 2011, at 9:59 PM, josef.p...@gmail.com wrote: > > Just a question how things would work with the new model. > How can you implement the "use" keyword from R's cov (or cor), with > minimal data copying > > I think the basic masked array version would (or does) just assign 0 > to the mi

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread josef . pktd
On Mon, Jun 27, 2011 at 5:01 PM, Mark Wiebe wrote: > On Mon, Jun 27, 2011 at 2:59 PM, wrote: >> >> On Mon, Jun 27, 2011 at 2:24 PM, eat wrote: >> > >> > >> > On Mon, Jun 27, 2011 at 8:53 PM, Mark Wiebe wrote: >> >> >> >> On Mon, Jun 27, 2011 at 12:44 PM, eat wrote: >> >>> >> >>> Hi, >> >>> >>

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Mark Wiebe
On Mon, Jun 27, 2011 at 12:18 PM, Matthew Brett wrote: > Hi, > > On Mon, Jun 27, 2011 at 5:53 PM, Charles R Harris > wrote: > > > > > > On Mon, Jun 27, 2011 at 9:55 AM, Mark Wiebe wrote: > >> > >> First I'd like to thank everyone for all the feedback you're providing, > >> clearly this is an imp

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Mark Wiebe
On Mon, Jun 27, 2011 at 2:59 PM, wrote: > On Mon, Jun 27, 2011 at 2:24 PM, eat wrote: > > > > > > On Mon, Jun 27, 2011 at 8:53 PM, Mark Wiebe wrote: > >> > >> On Mon, Jun 27, 2011 at 12:44 PM, eat wrote: > >>> > >>> Hi, > >>> > >>> On Mon, Jun 27, 2011 at 6:55 PM, Mark Wiebe wrote: > > >

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread josef . pktd
On Mon, Jun 27, 2011 at 2:24 PM, eat wrote: > > > On Mon, Jun 27, 2011 at 8:53 PM, Mark Wiebe wrote: >> >> On Mon, Jun 27, 2011 at 12:44 PM, eat wrote: >>> >>> Hi, >>> >>> On Mon, Jun 27, 2011 at 6:55 PM, Mark Wiebe wrote: First I'd like to thank everyone for all the feedback you're p

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread eat
On Mon, Jun 27, 2011 at 8:53 PM, Mark Wiebe wrote: > On Mon, Jun 27, 2011 at 12:44 PM, eat wrote: > >> Hi, >> >> On Mon, Jun 27, 2011 at 6:55 PM, Mark Wiebe wrote: >> >>> First I'd like to thank everyone for all the feedback you're providing, >>> clearly this is an important topic to many peopl

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Mark Wiebe
On Mon, Jun 27, 2011 at 12:44 PM, eat wrote: > Hi, > > On Mon, Jun 27, 2011 at 6:55 PM, Mark Wiebe wrote: > >> First I'd like to thank everyone for all the feedback you're providing, >> clearly this is an important topic to many people, and the discussion has >> helped clarify the ideas for me.

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread eat
Hi, On Mon, Jun 27, 2011 at 6:55 PM, Mark Wiebe wrote: > First I'd like to thank everyone for all the feedback you're providing, > clearly this is an important topic to many people, and the discussion has > helped clarify the ideas for me. I've renamed and updated the NEP, then > placed it into

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread eat
On Mon, Jun 27, 2011 at 8:18 PM, Matthew Brett wrote: > Hi, > > On Mon, Jun 27, 2011 at 5:53 PM, Charles R Harris > wrote: > > > > > > On Mon, Jun 27, 2011 at 9:55 AM, Mark Wiebe wrote: > >> > >> First I'd like to thank everyone for all the feedback you're providing, > >> clearly this is an impo

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Matthew Brett
Hi, On Mon, Jun 27, 2011 at 5:53 PM, Charles R Harris wrote: > > > On Mon, Jun 27, 2011 at 9:55 AM, Mark Wiebe wrote: >> >> First I'd like to thank everyone for all the feedback you're providing, >> clearly this is an important topic to many people, and the discussion has >> helped clarify the i

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Charles R Harris
On Mon, Jun 27, 2011 at 9:55 AM, Mark Wiebe wrote: > First I'd like to thank everyone for all the feedback you're providing, > clearly this is an important topic to many people, and the discussion has > helped clarify the ideas for me. I've renamed and updated the NEP, then > placed it into the m

[Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Mark Wiebe
First I'd like to thank everyone for all the feedback you're providing, clearly this is an important topic to many people, and the discussion has helped clarify the ideas for me. I've renamed and updated the NEP, then placed it into the master NumPy repository so it has a more permanent home here: