Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Chris Barker
On 6/27/11 9:53 AM, Charles R Harris wrote: > Some discussion of disk storage might also help. I don't see how the > rules can be enforced if two files are used, one for the mask and > another for the data, but that may just be something we need to live with. It seems it wouldn't be too big a deal

[Numpy-discussion] Two bugs in recfunctions.join_by with patch

2011-06-29 Thread Skipper Seabold
These two cases failed in recfunctions.join_by: 1) the case where either r1postfix or r2postfix is an empty string was not handled; 2) the case where there is more than one key and more than one variable with a name collision. Patch and tests are in a pull request here: https://github.com/numpy/numpy/pull/100
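A minimal sketch of the first case described above, assuming a NumPy version in which the empty-postfix handling is in place. The arrays `r1` and `r2`, their field names, and the `"_r"` postfix are made up for illustration; only `join_by` and its `r1postfix`/`r2postfix` parameters come from the message.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two hypothetical record arrays sharing the key "k" and a colliding field "v".
r1 = np.array([(1, 10.0), (2, 20.0)], dtype=[("k", int), ("v", float)])
r2 = np.array([(1, 100.0), (2, 200.0)], dtype=[("k", int), ("v", float)])

# An empty r1postfix asks join_by to keep r1's colliding field name unchanged
# and to rename only r2's copy with r2postfix.
joined = rfn.join_by("k", r1, r2, jointype="inner",
                     r1postfix="", r2postfix="_r", usemask=False)

print(joined.dtype.names)
```

If the empty postfix is handled, the colliding field from `r1` should keep its name and the one from `r2` should pick up the postfix.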

Re: [Numpy-discussion] Multiply along axis

2011-06-29 Thread josef . pktd
On Wed, Jun 29, 2011 at 11:05 AM, Robert Elsner wrote: > > Yeah great that was spot-on. And I thought I knew most of the slicing > tricks. I combined it with a slice object so that > > idx_obj = [ None for i in xrange(a.ndim) ] or idx_obj = [None] * a.ndim otherwise this is also what I do quite

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 2:40 PM, Lluís wrote: > I'm for the option of having a single API when you want to have NA > elements, regardless of whether it's using masks or bit patterns. I understand the desire to avoid having two different APIs... [snip] > My concern is now about how to set the "sk

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Nathaniel Smith writes: > I know that the part 1 of that proposal would satisfy my needs, but I > don't know as much about your use case, so I'm curious. Would that > proposal (in particular, part 2, the classic masked-array part) work > for you? I'm for the option of having a single API when you

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Eric Firing
On 06/29/2011 09:32 AM, Matthew Brett wrote: > Hi, > [...] > > Clearly there are some overlaps between what masked arrays are trying > to achieve and what R's NA mechanisms are trying to achieve. Are they > really similar enough that they should function using the same API? > And if so, won't that

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Nathaniel Smith
On Wed, Jun 29, 2011 at 11:20 AM, Lluís wrote: > I completely agree. What I'd suggest is a global and/or per-object > "ndarray.flags.skipna" for people like me that just want to ignore these > entries without caring about setting it on each operation (or the other > way around, depends on the defau

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 9:17 PM, Charles R Harris wrote: > > > On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: >> > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >> >> >> >> Matthew Brett writes: >> >> >> >> >> Mayb

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Charles R Harris
On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > >> > >> Matthew Brett writes: > >> > >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> >> the idea

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 7:20 PM, Lluís wrote: > Mark Wiebe writes: > >> There seems to be a general idea that masks and NA bit patterns imply >> particular differing semantics, something which I think is simply >> false. > > Well, my example contained a difference (the need for the "skipna=Tr

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Oops, On Wed, Jun 29, 2011 at 8:32 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: >> On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >>> >>> Matthew Brett writes: >>> >>> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >>> >> the ide

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe wrote: > On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: >> >> Matthew Brett writes: >> >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >> >> the idea that the entry is still there, but we're just ignoring it. Of >> >> co

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Lluís
Mark Wiebe writes: [...] > I think that deciding on the value of NA signal values boils down to > this question: should 3rd party code be able to interpret missing data > information stored in the separate mask array? > I'm tossing around some variations of ideas using the iterator

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Bruce Southey
On 06/29/2011 01:07 PM, Dag Sverre Seljebotn wrote: > On 06/29/2011 07:38 PM, Mark Wiebe wrote: >> On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn wrote: >> >> On 06/29/2011 03:45 PM, Matthew Brett wrote: >> > Hi, >> > >> > On W

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Mark Wiebe writes: > There seems to be a general idea that masks and NA bit patterns imply > particular differing semantics, something which I think is simply > false. Well, my example contained a difference (the need for the "skipna=True" argument) precisely because it seemed that there was some

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Dag Sverre Seljebotn
On 06/29/2011 07:38 PM, Mark Wiebe wrote: > On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn wrote: > > On 06/29/2011 03:45 PM, Matthew Brett wrote: > > Hi, > > > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Charles R Harris
On Wed, Jun 29, 2011 at 11:53 AM, Mark Wiebe wrote: > On Tue, Jun 28, 2011 at 7:34 AM, Lluís wrote: > >> Mark Wiebe writes: >> > The design that's forming is a combination of: >> >> > * Solve the missing data problem >> > * My ideas of what a good solution looks like: >> >* applies to all Nu

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 7:34 AM, Lluís wrote: > Mark Wiebe writes: > > The design that's forming is a combination of: > > > * Solve the missing data problem > > * My ideas of what a good solution looks like: > >* applies to all NumPy dtypes in a fully general way > >* high-performance, lo

Re: [Numpy-discussion] Missing/accumulating data

2011-06-29 Thread Mark Wiebe
Yeah, it takes a long time to wade through and respond to everything. I think the "missing data" problem and weighted masking are closely related, but neither one is fully a subset of the other. With a non-boolean alpha mask, there's an implication of a multiplication operator in there somewhere,

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 9:35 AM, Dag Sverre Seljebotn < d.s.seljeb...@astro.uio.no> wrote: > On 06/29/2011 03:45 PM, Matthew Brett wrote: > > Hi, > > > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > >> On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > >> wrote: > >>> > >>> Hi, > >>> > >>>

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 8:45 AM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > > On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: > >> ... > >> > (You might think, wha

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 8:20 AM, Lluís wrote: > Matthew Brett writes: > > >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys > >> the idea that the entry is still there, but we're just ignoring it. Of > >> course, that goes against common convention, but it might be easier

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 2:26 AM, Dag Sverre Seljebotn < d.s.seljeb...@astro.uio.no> wrote: > On 06/27/2011 05:55 PM, Mark Wiebe wrote: > > First I'd like to thank everyone for all the feedback you're providing, > > clearly this is an important topic to many people, and the discussion > > has helpe

[Numpy-discussion] pull request for testing/review: ufunc 'where=' parameter

2011-06-29 Thread Mark Wiebe
https://github.com/numpy/numpy/pull/99 In [24]: a = np.ones((1000,1000)) In [25]: b = np.ones((1000,1000)) In [26]: c = np.zeros((1000, 1000)) In [27]: m = np.random.rand(1000,1000) > 0.5 In [28]: timeit c[m] = a[m] + b[m] 1 loops, best of 3: 246 ms per loop In [29]: timeit np.add(a, b, out=c, w
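A small sketch of the two approaches being timed above, on toy-sized arrays. It assumes a NumPy release that ships the ufunc `where=` keyword; the 4x4 shapes and the checkerboard mask are illustrative stand-ins for the 1000x1000 random mask in the message.

```python
import numpy as np

a = np.ones((4, 4))
b = np.ones((4, 4))
m = (np.arange(16).reshape(4, 4) % 2) == 0  # deterministic boolean mask

# Fancy-indexing version: builds temporaries a[m], b[m] and their sum.
c_index = np.zeros((4, 4))
c_index[m] = a[m] + b[m]

# ufunc version: writes into `out` only where the mask is True;
# positions where the mask is False keep whatever `out` already held.
c_where = np.zeros((4, 4))
np.add(a, b, out=c_where, where=m)
```

Both arrays end up with the sum at the masked-in positions and the initial zeros elsewhere. Pre-filling `out` matters, since the unselected positions are left untouched rather than reset.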

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Pierre GM
Matthew, Dag, +1. On Jun 29, 2011 4:35 PM, "Dag Sverre Seljebotn" wrote: > On 06/29/2011 03:45 PM, Matthew Brett wrote: >> Hi, >> >> On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: >>> On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett >>> wrote: Hi, On Tue, Jun 28, 2011 at 4:0

Re: [Numpy-discussion] Multiply along axis

2011-06-29 Thread Robert Elsner
Yeah great that was spot-on. And I thought I knew most of the slicing tricks. I combined it with a slice object so that idx_obj = [ None for i in xrange(a.ndim) ] idx_obj[axis] = slice(None) a * x[idx_obj] works the way I want it. Suggestions are welcome but I am happy with the quick solutio
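The index-object trick described above can be sketched as a small helper. The function name `multiply_along_axis` and the sample shapes are hypothetical; the sketch also converts the index list to a tuple, since recent NumPy versions expect a tuple when an index mixes `None` and slice objects.

```python
import numpy as np

def multiply_along_axis(a, v, axis):
    """Multiply the 1-D vector v along the given axis of a via broadcasting."""
    # None inserts a length-1 axis; slice(None) keeps v's own axis in place.
    idx = [None] * a.ndim
    idx[axis] = slice(None)
    return a * v[tuple(idx)]

a = np.ones((2, 3, 4))
v = np.array([1.0, 2.0, 3.0])
result = multiply_along_axis(a, v, axis=1)
```

Here `v[tuple(idx)]` has shape `(1, 3, 1)`, so broadcasting stretches it over the other two axes without reshaping or transposing `a`.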

Re: [Numpy-discussion] Multiply along axis

2011-06-29 Thread Skipper Seabold
On Wed, Jun 29, 2011 at 10:32 AM, Robert Elsner wrote: > > Hello everyone, > > I would like to solve the following problem (preferably without > reshaping / flipping the array a). > > Assume I have a vector v of length x and an n-dimensional array a where > one dimension has length x as well. Now

Re: [Numpy-discussion] Multiply along axis

2011-06-29 Thread Robert Elsner
Oh and I forgot to mention: I want to specify the axis so that it is possible to multiply x along an arbitrary axis of a (given that the lengths match). On 29.06.2011 16:32, Robert Elsner wrote: > Hello everyone, > > I would like to solve the following problem (preferably without > reshaping / fli

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Dag Sverre Seljebotn
On 06/29/2011 03:45 PM, Matthew Brett wrote: > Hi, > > On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: >> On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: >>> ... (You might think, what difference does it

[Numpy-discussion] Multiply along axis

2011-06-29 Thread Robert Elsner
Hello everyone, I would like to solve the following problem (preferably without reshaping / flipping the array a). Assume I have a vector v of length x and an n-dimensional array a where one dimension has length x as well. Now I would like to multiply the vector v along a given axis of a. Some

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Matthew Brett
Hi, On Wed, Jun 29, 2011 at 12:39 AM, Mark Wiebe wrote: > On Tue, Jun 28, 2011 at 5:20 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Tue, Jun 28, 2011 at 4:06 PM, Nathaniel Smith wrote: >> ... >> > (You might think, what difference does it make if you *can* unmask an >> > item? Us missing data

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Lluís
Matthew Brett writes: >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys >> the idea that the entry is still there, but we're just ignoring it. Of >> course, that goes against common convention, but it might be easier to >> explain. > I think Nathaniel's point is that np.IG

Re: [Numpy-discussion] missing data discussion round 2

2011-06-29 Thread Dag Sverre Seljebotn
On 06/27/2011 05:55 PM, Mark Wiebe wrote: > First I'd like to thank everyone for all the feedback you're providing, > clearly this is an important topic to many people, and the discussion > has helped clarify the ideas for me. I've renamed and updated the NEP, > then placed it into the master NumPy