Re: [Numpy-discussion] Missing Data

2014-03-26 Thread Charles R Harris
On Wed, Mar 26, 2014 at 5:43 PM, alex wrote: > On Wed, Mar 26, 2014 at 7:22 PM, T J wrote: > > What is the status of: > > > >https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst > > For what it's worth this NEP was written in 2011 by mwiebe who made > 258 numpy commits in 201

Re: [Numpy-discussion] Missing Data

2014-03-26 Thread alex
On Wed, Mar 26, 2014 at 7:22 PM, T J wrote: > What is the status of: > >https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst For what it's worth this NEP was written in 2011 by mwiebe who made 258 numpy commits in 2011, 1 in 2012, and 3 in 2014. According to github, in the la

[Numpy-discussion] Missing Data

2014-03-26 Thread T J
What is the status of: https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst and of missing data in Numpy, more generally? Is np.ma.array still the "state-of-the-art" way to handle missing data? Or has something better and more comprehensive been put together?
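For readers unfamiliar with the np.ma.array approach the question refers to, here is a minimal sketch of masking missing entries with numpy.ma. The readings and the -999.0 sentinel are made-up assumptions for illustration, not taken from the thread.

```python
import numpy as np

# Hypothetical readings; -999.0 is an assumed sentinel meaning "missing".
raw = np.array([1.0, -999.0, 3.0, 5.0])

# Build a masked array in which the sentinel entries are masked out.
data = np.ma.masked_values(raw, -999.0)

mean_value = data.mean()       # computed over the unmasked entries only
n_valid = data.count()         # number of unmasked entries
filled = data.filled(np.nan)   # plain ndarray with NaN in the masked slots
```

The mask lives alongside the data, so the original values stay recoverable through `data.data`.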

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-14 Thread Richard Hattersley
For what it's worth, I'd prefer ndmasked. As has been mentioned elsewhere, some algorithms can't really cope with missing data. I'd very much rather they fail than silently give incorrect results. Working in the climate prediction business (as with many other domains I'm sure), even the *potential

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-11 Thread Mark Wiebe
On Thu, May 10, 2012 at 10:28 PM, Matthew Brett wrote: > Hi, > > On Thu, May 10, 2012 at 2:43 AM, Nathaniel Smith wrote: > > Hi Matthew, > > > > On Thu, May 10, 2012 at 12:01 AM, Matthew Brett > wrote: > >>> The third proposal is certainly the best one from Cython's perspective; > >>> and I imag

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-11 Thread Travis Oliphant
On May 11, 2012, at 2:13 AM, Fernando Perez wrote: > On Thu, May 10, 2012 at 11:44 PM, Scott Sinclair > wrote: >> That's pretty much how things already work. The documentation is in >> the main source tree and built docs end up at http://docs.scipy.org. >> NEPs live at https://github.com/numpy/n

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-11 Thread Fernando Perez
On Thu, May 10, 2012 at 11:44 PM, Scott Sinclair wrote: > That's pretty much how things already work. The documentation is in > the main source tree and built docs end up at http://docs.scipy.org. > NEPs live at https://github.com/numpy/numpy/tree/master/doc/neps, but > don't get published outside

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Scott Sinclair
On 11 May 2012 08:12, Fernando Perez wrote: > On Thu, May 10, 2012 at 11:03 PM, Scott Sinclair > wrote: >> Having thought about it, a page on the website isn't a bad idea. I've >> added a note pointing to this discussion. The document now appears at >> http://numpy.scipy.org/NA-overview.html > >

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Fernando Perez
On Thu, May 10, 2012 at 11:03 PM, Scott Sinclair wrote: > Having thought about it, a page on the website isn't a bad idea. I've > added a note pointing to this discussion. The document now appears at > http://numpy.scipy.org/NA-overview.html Why not have a separate repo for neps/discussion docs?

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Scott Sinclair
On 11 May 2012 06:57, Travis Oliphant wrote: > > On May 10, 2012, at 3:40 AM, Scott Sinclair wrote: > >> On 9 May 2012 18:46, Travis Oliphant wrote: >>> The document is available here: >>>    https://github.com/numpy/numpy.scipy.org/blob/master/NA-overview.rst >> >> This is orthogonal to the disc

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Travis Oliphant
On May 10, 2012, at 12:21 AM, Charles R Harris wrote: > > > On Wed, May 9, 2012 at 11:05 PM, Benjamin Root wrote: > > > On Wednesday, May 9, 2012, Nathaniel Smith wrote: > > > My only objection to this proposal is that committing to this approach > seems premature. The existing masked arra

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Travis Oliphant
On May 10, 2012, at 3:40 AM, Scott Sinclair wrote: > On 9 May 2012 18:46, Travis Oliphant wrote: >> The document is available here: >>https://github.com/numpy/numpy.scipy.org/blob/master/NA-overview.rst > > This is orthogonal to the discussion, but I'm curious as to why this > discussion do

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Matthew Brett
Hi, On Thu, May 10, 2012 at 2:43 AM, Nathaniel Smith wrote: > Hi Matthew, > > On Thu, May 10, 2012 at 12:01 AM, Matthew Brett > wrote: >>> The third proposal is certainly the best one from Cython's perspective; >>> and I imagine for those writing C extensions against the C API too. >>> Having P

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Nathaniel Smith
Hi Matthew, On Thu, May 10, 2012 at 12:01 AM, Matthew Brett wrote: >> The third proposal is certainly the best one from Cython's perspective; >> and I imagine for those writing C extensions against the C API too. >> Having PyType_Check fail for ndmasked is a very good way of having code >> fail t

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Dag Sverre Seljebotn
On 05/10/2012 06:05 AM, Dag Sverre Seljebotn wrote: > On 05/10/2012 01:01 AM, Matthew Brett wrote: >> Hi, >> >> On Wed, May 9, 2012 at 12:44 PM, Dag Sverre Seljebotn >>wrote: >>> On 05/09/2012 06:46 PM, Travis Oliphant wrote: Hey all, Nathaniel and Mark have worked very hard on a

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Scott Sinclair
On 9 May 2012 18:46, Travis Oliphant wrote: > The document is available here: >    https://github.com/numpy/numpy.scipy.org/blob/master/NA-overview.rst This is orthogonal to the discussion, but I'm curious as to why this discussion document has landed in the website repo? I suppose it's not a re

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-10 Thread Gael Varoquaux
On Wed, May 09, 2012 at 02:35:26PM -0500, Travis Oliphant wrote: > Basically it buys not forcing *all* NumPy users (on the C-API level) to > now deal with a masked array. I know this push is a feature that is > part of Mark's intention (as it pushes downstream libraries to think about

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Charles R Harris
On Wed, May 9, 2012 at 11:05 PM, Benjamin Root wrote: > > > On Wednesday, May 9, 2012, Nathaniel Smith wrote: > >> >> >> My only objection to this proposal is that committing to this approach >> seems premature. The existing masked array objects act quite >> differently from numpy.ma, so why do y

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Benjamin Root
On Wednesday, May 9, 2012, Nathaniel Smith wrote: > > > My only objection to this proposal is that committing to this approach > seems premature. The existing masked array objects act quite > differently from numpy.ma, so why do you believe that they're a good > foundation for numpy.ma, and why wi

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Dag Sverre Seljebotn
On 05/10/2012 01:01 AM, Matthew Brett wrote: > Hi, > > On Wed, May 9, 2012 at 12:44 PM, Dag Sverre Seljebotn > wrote: >> On 05/09/2012 06:46 PM, Travis Oliphant wrote: >>> Hey all, >>> >>> Nathaniel and Mark have worked very hard on a joint document to try and >>> explain the current status of th

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Charles R Harris
On Wed, May 9, 2012 at 6:13 PM, Paul Ivanov wrote: > > > On Wed, May 9, 2012 at 3:12 PM, Travis Oliphant wrote: > >> On re-reading, I want to make a couple of things clear: >> >> 1) This "wrap-up" discussion is *only* for what to do for NumPy 1.7 in >> such a way that we don't tie our hands in th

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Paul Ivanov
On Wed, May 9, 2012 at 3:12 PM, Travis Oliphant wrote: > On re-reading, I want to make a couple of things clear: > > 1) This "wrap-up" discussion is *only* for what to do for NumPy 1.7 in > such a way that we don't tie our hands in the future. I do not believe > we can figure out what to do fo

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Nathaniel Smith
Hi Dag, On Wed, May 9, 2012 at 8:44 PM, Dag Sverre Seljebotn wrote: > I'm a heavy user of masks, which are used to make data NA in the > statistical sense. The setting is that we have to mask out the radiation > coming from the Milky Way in full-sky images of the Cosmic Microwave > Background. Th

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Matthew Brett
Hi, On Wed, May 9, 2012 at 12:44 PM, Dag Sverre Seljebotn wrote: > On 05/09/2012 06:46 PM, Travis Oliphant wrote: >> Hey all, >> >> Nathaniel and Mark have worked very hard on a joint document to try and >> explain the current status of the missing-data debate. I think they've >> done an amazing

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Nathaniel Smith
On Wed, May 9, 2012 at 5:46 PM, Travis Oliphant wrote: > Hey all, > > Nathaniel and Mark have worked very hard on a joint document to try and > explain the current status of the missing-data debate.   I think they've > done an amazing job at providing some context, articulating their views and > s

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Charles R Harris
On Wed, May 9, 2012 at 4:12 PM, Travis Oliphant wrote: > On re-reading, I want to make a couple of things clear: > > 1) This "wrap-up" discussion is *only* for what to do for NumPy 1.7 in > such a way that we don't tie our hands in the future. I do not believe > we can figure out what to do fo

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Travis Oliphant
On re-reading, I want to make a couple of things clear: 1) This "wrap-up" discussion is *only* for what to do for NumPy 1.7 in such a way that we don't tie our hands in the future. I do not believe we can figure out what to do for masked arrays in one short week. What happens be

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Charles R Harris
On Wed, May 9, 2012 at 1:35 PM, Travis Oliphant wrote: > My three proposals: >> >> * do nothing and leave things as is >> >> * add a global flag that turns off masked array support by default but >> otherwise leaves things unchanged (I'm still unclear how this would work >> exactly) >> >> * move

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Dag Sverre Seljebotn
On 05/09/2012 06:46 PM, Travis Oliphant wrote: > Hey all, > > Nathaniel and Mark have worked very hard on a joint document to try and > explain the current status of the missing-data debate. I think they've > done an amazing job at providing some context, articulating their views > and suggesting w

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Travis Oliphant
> Mark will you give more details about this proposal? How would the flag > work, what would it modify? > > The idea is inspired in part by the Chrome release cycle, which has a > presentation here: > > https://docs.google.com/present/view?id=dg63dpc6_4d7vkk6ch&pli=1 > > Some quotes: > Feat

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Travis Oliphant
> My three proposals: > > * do nothing and leave things as is > > * add a global flag that turns off masked array support by default but > otherwise leaves things unchanged (I'm still unclear how this would work > exactly) > > * move Mark's "masked ndarray objects" into a n

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Mark Wiebe
On Wed, May 9, 2012 at 2:15 PM, Travis Oliphant wrote: > > On May 9, 2012, at 2:07 PM, Mark Wiebe wrote: > > On Wed, May 9, 2012 at 11:46 AM, Travis Oliphant wrote: > >> Hey all, >> >> Nathaniel and Mark have worked very hard on a joint document to try and >> explain the current status of the mis

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Travis Oliphant
On May 9, 2012, at 2:07 PM, Mark Wiebe wrote: > On Wed, May 9, 2012 at 11:46 AM, Travis Oliphant wrote: > Hey all, > > Nathaniel and Mark have worked very hard on a joint document to try and > explain the current status of the missing-data debate. I think they've done > an amazing job at p

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Mark Wiebe
On Wed, May 9, 2012 at 11:46 AM, Travis Oliphant wrote: > Hey all, > > Nathaniel and Mark have worked very hard on a joint document to try and > explain the current status of the missing-data debate. I think they've > done an amazing job at providing some context, articulating their views and >

Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Charles R Harris
On Wed, May 9, 2012 at 10:46 AM, Travis Oliphant wrote: > Hey all, > > Nathaniel and Mark have worked very hard on a joint document to try and > explain the current status of the missing-data debate. I think they've > done an amazing job at providing some context, articulating their views and >

[Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Travis Oliphant
Hey all, Nathaniel and Mark have worked very hard on a joint document to try and explain the current status of the missing-data debate. I think they've done an amazing job at providing some context, articulating their views and suggesting ways forward in a mutually respectful manner. This

Re: [Numpy-discussion] Missing data again

2012-03-15 Thread Nathaniel Smith
Hi Chuck, I think I let my frustration get the better of me, and the message below is too confrontational. I apologize. I truly would like to understand where you're coming from on this, though, so I'll try to make this more productive. My summary of points that no-one has disagreed with yet is h

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Nathaniel Smith
On Wed, Mar 7, 2012 at 7:39 PM, Benjamin Root wrote: > On Wed, Mar 7, 2012 at 1:26 PM, Nathaniel Smith wrote: >> When it comes to "missing data", bitpatterns can do everything that >> masks can do, are no more complicated to implement, and have better >> performance characteristics. >> > > Not tr

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Nathaniel Smith
On Wed, Mar 7, 2012 at 7:37 PM, Charles R Harris wrote: > > > On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith wrote: >> When it comes to "missing data", bitpatterns can do everything that >> masks can do, are no more complicated to implement, and have better >> performance characteristics. >> >

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Eric Firing
On 03/07/2012 11:15 AM, Pierre Haessig wrote: > Hi, > On 07/03/2012 20:57, Eric Firing wrote: >> In other words, good low-level support for numpy.ma functionality? > Coming back to *existing* ma support, I was just wondering whether it > was now possible to "np.save" a masked array. > (I'm using

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Pierre Haessig
Hi, On 07/03/2012 20:57, Eric Firing wrote: > In other words, good low-level support for numpy.ma functionality? Coming back to *existing* ma support, I was just wondering whether it was now possible to "np.save" a masked array. (I'm using numpy 1.5) In the end, this is the most annoying problem
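One possible workaround for the persistence problem raised here is to store the data and the mask as two ordinary arrays in a single .npz archive and rebuild the masked array on load. This is only a sketch under stated assumptions: the array contents are invented, an in-memory buffer stands in for a file on disk, and nothing here describes how "np.save" itself treats masked arrays in numpy 1.5.

```python
import io
import numpy as np

# Hypothetical masked array; the middle element is masked.
arr = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

# Store the raw data and the full boolean mask side by side.
buf = io.BytesIO()  # stands in for a file on disk
np.savez(buf, data=arr.data, mask=np.ma.getmaskarray(arr))

# Reload both pieces and rebuild the masked array.
buf.seek(0)
with np.load(buf) as stored:
    restored = np.ma.array(stored["data"], mask=stored["mask"])
```

Using `np.ma.getmaskarray` rather than `arr.mask` guarantees a full boolean array even when nothing is masked.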

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Eric Firing
On 03/07/2012 09:26 AM, Nathaniel Smith wrote: > On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris > wrote: >> On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig >>> Coming back to Travis proposition "bit-pattern approaches to missing >>> data (*at least* for float64 and int32) need to be implemented.

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Matthew Brett
Hi, On Wed, Mar 7, 2012 at 11:37 AM, Charles R Harris wrote: > > > On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith wrote: >> >> On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris >> wrote: >> > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig >> > >> >> Coming back to Travis proposition "bit-patt

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Benjamin Root
On Wed, Mar 7, 2012 at 1:26 PM, Nathaniel Smith wrote: > On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris > wrote: > > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig > > >> Coming back to Travis proposition "bit-pattern approaches to missing > >> data (*at least* for float64 and int32) need to

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Charles R Harris
On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith wrote: > On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris > wrote: > > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig > > >> Coming back to Travis proposition "bit-pattern approaches to missing > >> data (*at least* for float64 and int32) need to

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Nathaniel Smith
On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris wrote: > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig >> Coming back to Travis proposition "bit-pattern approaches to missing >> data (*at least* for float64 and int32) need to be implemented.", I >> wonder what is the amount of extra work to go

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Charles R Harris
On Wed, Mar 7, 2012 at 11:21 AM, Lluís wrote: > Charles R Harris writes: > [...] > > One inconvenience I have run into with the current API is that it should > be > > easier to clear the mask from an "ignored" value without taking a new > view or > > assigning known data. > > AFAIR, the inability

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Lluís
Charles R Harris writes: [...] > One inconvenience I have run into with the current API is that it should be > easier to clear the mask from an "ignored" value without taking a new view or > assigning known data. AFAIR, the inability to directly access a "mask" attribute was intentional to make bi

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Charles R Harris
On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig wrote: > Hi, > > Thank you very much for your insights! > > On 06/03/2012 21:59, Nathaniel Smith wrote: > > Right -- R has a very impoverished type system as compared to numpy. > > There's basically four types: "numeric" (meaning double precision >

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Nathaniel Smith
On Wed, Mar 7, 2012 at 4:35 PM, Pierre Haessig wrote: > Hi, > > Thank you very much for your insights! > > On 06/03/2012 21:59, Nathaniel Smith wrote: >> Right -- R has a very impoverished type system as compared to numpy. >> There's basically four types: "numeric" (meaning double precision >>

Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Pierre Haessig
Hi, Thank you very much for your insights! On 06/03/2012 21:59, Nathaniel Smith wrote: > Right -- R has a very impoverished type system as compared to numpy. > There's basically four types: "numeric" (meaning double precision > float), "integer", "logical" (boolean), and "character" (string).

Re: [Numpy-discussion] Missing data again

2012-03-06 Thread Nathaniel Smith
On Tue, Mar 6, 2012 at 9:14 PM, Ralf Gommers wrote: > On Tue, Mar 6, 2012 at 9:25 PM, Nathaniel Smith wrote: >> On Sat, Mar 3, 2012 at 8:30 PM, Travis Oliphant >> wrote: >> > Hi all, >> >> Hi Travis, >> >> Thanks for bringing this back up. >> >> Have you looked at the summary from the last threa

Re: [Numpy-discussion] Missing data again

2012-03-06 Thread Ralf Gommers
On Tue, Mar 6, 2012 at 9:25 PM, Nathaniel Smith wrote: > On Sat, Mar 3, 2012 at 8:30 PM, Travis Oliphant > wrote: > > Hi all, > > Hi Travis, > > Thanks for bringing this back up. > > Have you looked at the summary from the last thread? > https://github.com/njsmith/numpy/wiki/NA-discussion-statu

Re: [Numpy-discussion] Missing data again

2012-03-06 Thread Nathaniel Smith
On Tue, Mar 6, 2012 at 4:38 PM, Mark Wiebe wrote: > On Tue, Mar 6, 2012 at 5:48 AM, Pierre Haessig > wrote: >> From a potential user perspective, I feel it would be nice to have NA >> and non-NA cases look as similar as possible. Your code example is >> particularly striking : two different dtyp

Re: [Numpy-discussion] Missing data again

2012-03-06 Thread Nathaniel Smith
On Sat, Mar 3, 2012 at 8:30 PM, Travis Oliphant wrote: > Hi all, Hi Travis, Thanks for bringing this back up. Have you looked at the summary from the last thread? https://github.com/njsmith/numpy/wiki/NA-discussion-status The goal was to try and at least work out what points we all *could* ag

Re: [Numpy-discussion] Missing data again

2012-03-06 Thread Mark Wiebe
Hi Pierre, On Tue, Mar 6, 2012 at 5:48 AM, Pierre Haessig wrote: > Hi Mark, > > I went through the NA NEP a few days ago, but only too quickly so that > my question is probably a rather dumb one. It's about the usability of > bitpattern-based NAs, based on your recent post: > > On 03/03/2012 22:4

Re: [Numpy-discussion] Missing data again

2012-03-06 Thread Pierre Haessig
Hi Mark, I went through the NA NEP a few days ago, but only too quickly so that my question is probably a rather dumb one. It's about the usability of bitpattern-based NAs, based on your recent post: On 03/03/2012 22:46, Mark Wiebe wrote: > Also, here's a thought for the usability of NA-float6

Re: [Numpy-discussion] Missing data again

2012-03-03 Thread Skipper Seabold
On Sat, Mar 3, 2012 at 4:46 PM, Mark Wiebe wrote: > On Sat, Mar 3, 2012 at 12:30 PM, Travis Oliphant >> >>        * the reduction operations need to default to "skipna" --- this is >> the most common use case which has been re-inforced again to me today by a >> new user to Python who is using ma
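To make the "skipna" question concrete, the sketch below contrasts a reduction that propagates a missing value with two that skip it. It uses NaN and numpy.ma as stand-ins for a missing-value marker; that substitution and the sample values are assumptions for illustration and do not reproduce the NA API debated in this thread.

```python
import numpy as np

# Hypothetical data with one missing entry, represented here by NaN.
values = np.array([1.0, np.nan, 3.0])

propagating = np.sum(values)     # the NaN propagates into the result
skipping = np.nansum(values)     # the NaN is skipped

# The masked-array route: mask invalid entries, then reduce.
masked_total = np.ma.masked_invalid(values).sum()
```

Whether skipping or propagating should be the default is exactly the design choice under discussion; the code only shows that both behaviors can be expressed.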

Re: [Numpy-discussion] Missing data again

2012-03-03 Thread Travis Oliphant
> > Mind, Mark only had a few weeks to write code. I think the unfinished state > is a direct function of that. > > I have heard from several users that they will *not use the missing data* in > NumPy as currently implemented, and I can now see why. For better or for > worse, my approach t

Re: [Numpy-discussion] Missing data again

2012-03-03 Thread Charles R Harris
On Sat, Mar 3, 2012 at 1:30 PM, Travis Oliphant wrote: > Hi all, > > I've been thinking a lot about the masked array implementation lately. > I finally had the time to look hard at what has been done and now am of the > opinion that I do not think that 1.7 can be released with the current state >

Re: [Numpy-discussion] Missing data again

2012-03-03 Thread Ralf Gommers
On Sat, Mar 3, 2012 at 9:30 PM, Travis Oliphant wrote: > Hi all, > > I've been thinking a lot about the masked array implementation lately. > I finally had the time to look hard at what has been done and now am of the > opinion that I do not think that 1.7 can be released with the current state >

Re: [Numpy-discussion] Missing data again

2012-03-03 Thread Mark Wiebe
On Sat, Mar 3, 2012 at 12:30 PM, Travis Oliphant wrote: > > > First of all, I want to be clear that I think there is much great work > that has been done in the current missing data code. There are some nice > features in the where clause of the ufunc and the machinery for the > iterator that
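The ufunc where clause mentioned in the quoted text is available in NumPy as the `where=` keyword argument. A small sketch, with invented values and an invented validity mask:

```python
import numpy as np

# Hypothetical values and a boolean array marking which entries are valid.
values = np.array([1.0, 2.0, 3.0, 4.0])
valid = np.array([True, False, True, True])

# Pre-fill the output, since positions where `valid` is False are left untouched.
out = np.zeros_like(values)
np.multiply(values, 10.0, out=out, where=valid)
```

Supplying `out` explicitly matters here: without it, the skipped positions would hold uninitialized memory.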

[Numpy-discussion] Missing data again

2012-03-03 Thread Travis Oliphant
Hi all, I've been thinking a lot about the masked array implementation lately. I finally had the time to look hard at what has been done and now am of the opinion that I do not think that 1.7 can be released with the current state of the masked array implementation *unless* it is clearly m

[Numpy-discussion] Missing Data development plan

2011-07-07 Thread Mark Wiebe
It's been a day less than two weeks since I posted my first feedback request on a masked array implementation of missing data. I'd like to thank everyone that contributed to the discussion, and that continues to contribute. I believe my design is very solid thanks to all the feedback, and I unders

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Nathaniel Smith
On Thu, Jun 30, 2011 at 12:27 PM, Eric Firing wrote: > On 06/30/2011 08:53 AM, Nathaniel Smith wrote: >> On Wed, Jun 29, 2011 at 2:21 PM, Eric Firing  wrote: >>> In addition, for new code, the full-blown masked array module may not be >>> needed.  A convenience it adds, however, is the automatic m

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Eric Firing
On 06/30/2011 08:53 AM, Nathaniel Smith wrote: > On Wed, Jun 29, 2011 at 2:21 PM, Eric Firing wrote: >> In addition, for new code, the full-blown masked array module may not be >> needed. A convenience it adds, however, is the automatic masking of >> invalid values: >> >> In [1]: np.ma.log(-1) >>

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 2:21 PM, Eric Firing wrote: > In addition, for new code, the full-blown masked array module may not be > needed.  A convenience it adds, however, is the automatic masking of > invalid values: > > In [1]: np.ma.log(-1) > Out[1]: masked > > I'm sure this horrifies some, but t
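The automatic masking of invalid values quoted above can be seen on an array as well as on a scalar. A short sketch, with sample inputs chosen purely for illustration:

```python
import numpy as np

# Scalar case, as in the quoted session: the result is the masked constant.
scalar_result = np.ma.log(-1)

# Array case: entries outside the domain of log are masked instead of NaN.
array_result = np.ma.log(np.array([1.0, -1.0, np.e]))
valid_values = array_result.compressed()  # unmasked results only
```

By contrast, plain `np.log` on the same input would return NaN for the negative entry and emit a runtime warning.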

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 11:54 AM, Lluís wrote: > Mark Wiebe writes: > > Why is one "magic" and the other "real"? All of this is already > > sitting on 100 layers of abstraction above electrons and atoms. If > > we're talking about "real," maybe we should be programming in machine > > code or usin

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 11:42 AM, Matthew Brett wrote: > Hi, > > On Thu, Jun 30, 2011 at 5:13 PM, Mark Wiebe wrote: > > On Thu, Jun 30, 2011 at 11:04 AM, Gary Strangman > > wrote: > >> > >>> Clearly there are some overlaps between what masked arrays are > >>> trying to achieve and what

Re: [Numpy-discussion] missing data: semantics

2011-06-30 Thread Charles R Harris
On Thu, Jun 30, 2011 at 11:51 AM, Matthew Brett wrote: > Hi, > > On Thu, Jun 30, 2011 at 6:46 PM, Lluís wrote: > > Ok, I think it's time to step back and reformulate the problem by > > completely ignoring the implementation. > > > > Here we have 2 "generic" concepts (i.e., applicable to R), plus

Re: [Numpy-discussion] missing data: semantics

2011-06-30 Thread Charles R Harris
On Thu, Jun 30, 2011 at 11:46 AM, Lluís wrote: > Ok, I think it's time to step back and reformulate the problem by > completely ignoring the implementation. > > Here we have 2 "generic" concepts (i.e., applicable to R), plus another > extra concept that is exclusive to numpy: > > * Assigning np.N

Re: [Numpy-discussion] missing data: semantics

2011-06-30 Thread Matthew Brett
Hi, On Thu, Jun 30, 2011 at 6:46 PM, Lluís wrote: > Ok, I think it's time to step back and reformulate the problem by > completely ignoring the implementation. > > Here we have 2 "generic" concepts (i.e., applicable to R), plus another > extra concept that is exclusive to numpy: > > * Assigning n

[Numpy-discussion] missing data: semantics

2011-06-30 Thread Lluís
Ok, I think it's time to step back and reformulate the problem by completely ignoring the implementation. Here we have 2 "generic" concepts (i.e., applicable to R), plus another extra concept that is exclusive to numpy: * Assigning np.NA to an array, cannot be undone unless through explicit ass

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Lluís
Mark Wiebe writes: > Why is one "magic" and the other "real"? All of this is already > sitting on 100 layers of abstraction above electrons and atoms. If > we're talking about "real," maybe we should be programming in machine > code or using breadboards with individual transistors. M-x butterfly R

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Matthew Brett
Hi, On Thu, Jun 30, 2011 at 5:13 PM, Mark Wiebe wrote: > On Thu, Jun 30, 2011 at 11:04 AM, Gary Strangman > wrote: >> >>>      Clearly there are some overlaps between what masked arrays are >>>      trying to achieve and what Rs NA mechanisms are trying to achieve. >>>       Are they really simi

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 11:04 AM, Gary Strangman wrote: > > Clearly there are some overlaps between what masked arrays are >> trying to achieve and what Rs NA mechanisms are trying to achieve. >> Are they really similar enough that they should function using >> the same API?

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Lluís
Mark Wiebe writes: > On Wed, Jun 29, 2011 at 1:20 PM, Lluís wrote: > [...] >> As far as I can tell, the only required difference between them is >> that NA bit patterns must destroy the data. Nothing else. Everything >> on top of that is a choice of API and interface mechanisms. I want >> the

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Gary Strangman
Clearly there are some overlaps between what masked arrays are trying to achieve and what Rs NA mechanisms are trying to achieve.  Are they really similar enough that they should function using the same API? Yes. And if so, won't that be confusing? No, I don't be

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Thu, Jun 30, 2011 at 1:49 AM, Chris Barker wrote: > On 6/27/11 9:53 AM, Charles R Harris wrote: > > Some discussion of disk storage might also help. I don't see how the > > rules can be enforced if two files are used, one for the mask and > > another for the data, but that may just be somethin

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 5:42 PM, Nathaniel Smith wrote: > On Wed, Jun 29, 2011 at 2:40 PM, Lluís wrote: > > I'm for the option of having a single API when you want to have NA > > elements, regardless of whether it's using masks or bit patterns. > > I understand the desire to avoid having two dif

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 4:21 PM, Eric Firing wrote: > On 06/29/2011 09:32 AM, Matthew Brett wrote: > > Hi, > > > [...] > > > > Clearly there are some overlaps between what masked arrays are trying > > to achieve and what Rs NA mechanisms are trying to achieve. Are they > > really similar enough

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 1:20 PM, Lluís wrote: > Mark Wiebe writes: > > > There seems to be a general idea that masks and NA bit patterns imply > > particular differing semantics, something which I think is simply > > false. > > Well, my example contained a difference (the need for the "skipna=Tru

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 2:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > >> > >> Matthew Brett writes: > >> > >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> >> the idea

Re: [Numpy-discussion] missing data discussion round 2

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 1:07 PM, Dag Sverre Seljebotn wrote: > On 06/29/2011 07:38 PM, Mark Wiebe wrote: > > On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn > > wrote: > > > > On 06/29/2011 03:45 PM, Matthew Brett wrote: >

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Chris Barker
On 6/27/11 9:53 AM, Charles R Harris wrote: > Some discussion of disk storage might also help. I don't see how the > rules can be enforced if two files are used, one for the mask and > another for the data, but that may just be something we need to live with. It seems it wouldn't be too big a deal

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 2:40 PM, Lluís wrote: > I'm for the option of having a single API when you want to have NA > elements, regardless of whether it's using masks or bit patterns. I understand the desire to avoid having two different APIs... [snip] > My concern is now about how to set the "sk

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Nathaniel Smith writes: > I know that the part 1 of that proposal would satisfy my needs, but I > don't know as much about your use case, so I'm curious. Would that > proposal (in particular, part 2, the classic masked-array part) work > for you? I'm for the option of having a single API when you

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Eric Firing
On 06/29/2011 09:32 AM, Matthew Brett wrote: > Hi, > [...] > > Clearly there are some overlaps between what masked arrays are trying > to achieve and what R's NA mechanisms are trying to achieve. Are they > really similar enough that they should function using the same API? > And if so, won't that

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 11:20 AM, Lluís wrote: > I completely agree. What I'd suggest is a global and/or per-object > "ndarray.flags.skipna" for people like me that just want to ignore these > entries without caring about setting it on each operation (or the other > way around, depends on the defau

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 9:17 PM, Charles R Harris wrote: > > > On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: >> > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >> >> >> >> Matthew Brett writes: >> >> >> >> >> Mayb

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Charles R Harris
On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > >> > >> Matthew Brett writes: > >> > >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> >> the idea

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 7:20 PM, Lluís wrote: > Mark Wiebe writes: > >> There seems to be a general idea that masks and NA bit patterns imply >> particular differing semantics, something which I think is simply >> false. > > Well, my example contained a difference (the need for the "skipna=Tr

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Oops, On Wed, Jun 29, 2011 at 8:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: >> On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >>> >>> Matthew Brett writes: >>> >>> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >>> >> the ide

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >> >> Matthew Brett writes: >> >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >> >> the idea that the entry is still there, but we're just ignoring it.  Of >> >> co

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Bruce Southey
On 06/29/2011 01:07 PM, Dag Sverre Seljebotn wrote: > On 06/29/2011 07:38 PM, Mark Wiebe wrote: >> On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn >> mailto:d.s.seljeb...@astro.uio.no>> wrote: >> >> On 06/29/2011 03:45 PM, Matthew Brett wrote: >> > Hi, >> > >> > On W

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Mark Wiebe writes: > There seems to be a general idea that masks and NA bit patterns imply > particular differing semantics, something which I think is simply > false. Well, my example contained a difference (the need for the "skipna=True" argument) precisely because it seemed that there was some

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Dag Sverre Seljebotn
On 06/29/2011 07:38 PM, Mark Wiebe wrote: > On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn > mailto:d.s.seljeb...@astro.uio.no>> wrote: > > On 06/29/2011 03:45 PM, Matthew Brett wrote: > > Hi, > > > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn < d.s.seljeb...@astro.uio.no> wrote: > On 06/29/2011 03:45 PM, Matthew Brett wrote: > > Hi, > > > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > >> On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > >> wrote: > >>> > >>> Hi, > >>> > >>>

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 8:45 AM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > > On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > >> ... > >> > (You might think, wha

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > Matthew Brett writes: > > >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> the idea that the entry is still there, but we're just ignoring it. Of > >> course, that goes against common convention, but it might be easier
