On Wed, Jun 29, 2011 at 1:51 PM, Lluís wrote:
> Mark Wiebe writes:
> [...]
> > I think that deciding on the value of NA signal values boils down to
> > this question: should 3rd party code be able to interpret missing data
> > information stored in the separate mask array?
>
> > I'm
Mark Wiebe writes:
[...]
> I think that deciding on the value of NA signal values boils down to
> this question: should 3rd party code be able to interpret missing data
> information stored in the separate mask array?
> I'm tossing around some variations of ideas using the iterator
On Wed, Jun 29, 2011 at 11:53 AM, Mark Wiebe wrote:
> On Tue, Jun 28, 2011 at 7:34 AM, Lluís wrote:
>
>> Mark Wiebe writes:
>> > The design that's forming is a combination of:
>>
>> > * Solve the missing data problem
>> > * My ideas of what a good solution looks like:
>> >* applies to all Nu
On Tue, Jun 28, 2011 at 7:34 AM, Lluís wrote:
> Mark Wiebe writes:
> > The design that's forming is a combination of:
>
> > * Solve the missing data problem
> > * My ideas of what a good solution looks like:
> >* applies to all NumPy dtypes in a fully general way
> >* high-performance, lo
Charles R Harris writes:
> I think we may need some standard format for masked data on disk if we
> don't go the NA value route.
As I see it, the mask array is just some metadata that is attached to
the dtype descriptor. I don't know how an ndarray is (un)pickled from
disk, but I imagine that eac
Mark Wiebe writes:
> The design that's forming is a combination of:
> * Solve the missing data problem
> * My ideas of what a good solution looks like:
> * applies to all NumPy dtypes in a fully general way
> * high-performance, low overhead where possible
> * makes the C-level implement
On Sat, Jun 25, 2011 at 3:25 PM, Gael Varoquaux <
gael.varoqu...@normalesup.org> wrote:
> On Sat, Jun 25, 2011 at 03:16:39PM -0500, Mark Wiebe wrote:
> >This is why I'm also proposing to add a 'mask=' parameter to ufuncs, for
> >example, to expose the implementation details of the masked
On Sat, Jun 25, 2011 at 9:44 AM, Wes McKinney wrote:
> On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris
> wrote:
> > On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney
> wrote:
> >>
> >> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
> >> wrote:
> >> >
> >> >
> >> > On Fri, Jun 24, 2011 at 1
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris wrote:
> On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote:
>
>> This thread is getting quite long, innit ?
>> And I think it's getting a tad confusing, because we're mixing two
>> different concepts: missing values and masks.
>> There should be s
On Sat, Jun 25, 2011 at 03:16:39PM -0500, Mark Wiebe wrote:
>This is why I'm also proposing to add a 'mask=' parameter to ufuncs, for
>example, to expose the implementation details of the masked array system
>to people who need masks but need them to be a bit different. There may be
>
On Sat, Jun 25, 2011 at 7:00 AM, Gael Varoquaux <
gael.varoqu...@normalesup.org> wrote:
> On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote:
> > I'm personally worried that the memory overhead of array.masks will
> > make many of us tend to avoid them. I work with images that can
> >
On Sat, Jun 25, 2011 at 9:14 AM, Wes McKinney wrote:
> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
> wrote:
> >
> >
> > On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney
> wrote:
> >>
> >> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith
> wrote:
> >> > On Fri, Jun 24, 2011 at 6:57 PM, Be
On Sat, Jun 25, 2011 at 6:29 AM, Pierre GM wrote:
> This thread is getting quite long, innit ?
>
It's tiring, yeah!
> And I think it's getting a tad confusing, because we're mixing two
> different concepts: missing values and masks.
> There should be support for missing values in numpy.core, I
On Sat, Jun 25, 2011 at 6:17 AM, Matthew Brett wrote:
> Hi,
>
> On Sat, Jun 25, 2011 at 2:10 AM, Mark Wiebe wrote:
> > On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett
> > wrote:
> >>
> >> Hi,
> >>
> >> On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney
> >> wrote:
> >> ...
> >> > Perhaps we should m
On Sat, Jun 25, 2011 at 6:00 AM, Matthew Brett wrote:
> Hi,
>
> On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe wrote:
> > On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett
> ...
> >> @Mark - I don't have a clear idea whether you consider the nafloat64
> >> option to be still in play as the first thing
On Fri, Jun 24, 2011 at 11:06 PM, Wes McKinney wrote:
> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote:
> > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote:
> >> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote:
> >>> This is a situation where I would just... use an array a
On Fri, Jun 24, 2011 at 10:59 PM, Nathaniel Smith wrote:
> On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote:
> > On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote:
> >> This is a situation where I would just... use an array and a mask,
> >> rather than a masked array. Then lots of thin
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris wrote:
>
> I think he aims to support both. One complication with masks is keeping
> them tied to the data on disk. With na values one file can contain both the
> data and the missing data markers, whereas with masks, two files would be
> required
On Fri, Jun 24, 2011 at 8:25 PM, Benjamin Root wrote:
> On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe wrote:
>
>> On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wrote:
>>
>>> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
>>> wrote:
>>> >
>>> >
>>> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Br
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote:
> On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root wrote:
> > Another example of how we use masks in matplotlib is in pcolor(). We have
> > to combine the possible masks of X, Y, and V in both the x and y directions
> > to find the final
Hi,
On Sat, Jun 25, 2011 at 3:27 PM, Charles R Harris
wrote:
>
>
> On Sat, Jun 25, 2011 at 6:00 AM, Gael Varoquaux
> wrote:
>>
>> On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote:
>> > I'm personally worried that the memory overhead of array.masks will
>> > make many of us tend to a
2011/6/25 Charles R Harris
> I think what we really need to see are the use cases and work flow. The
> ones that hadn't occurred to me before were memory mapped files and data
> stored on disk in general. I think we may need some standard format for
> masked data on disk if we don't go the NA val
Hi,
On Sat, Jun 25, 2011 at 4:05 PM, Charles R Harris
wrote:
>
>
> On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett
> wrote:
>>
>> Hi,
>>
>> On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
>> wrote:
>> >
>> >
>> > On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett
>> > wrote:
>> >>
>> >> Hi,
>> >>
Hi,
On Sat, Jun 25, 2011 at 3:44 PM, Wes McKinney wrote:
...
> Here are some things I can think of that would be affected by any changes here
>
> 1) Right now users of pandas can type pandas.isnull(series[5]) and
> that will yield True if the value is NA for any dtype. This might be
> hard to sup
On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett wrote:
> Hi,
>
> On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
> wrote:
> >
> >
> > On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett
> > wrote:
> >>
> >> Hi,
> >>
> >> On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
> >> wrote:
> >> >
> >> >
>
On Sat, Jun 25, 2011 at 8:44 AM, Wes McKinney wrote:
> On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris
> wrote:
> >
> >
> > On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney
> wrote:
> >>
> >> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
> >> wrote:
> >> >
> >> >
> >> > On Fri, Jun 24, 20
On Sat, Jun 25, 2011 at 08:27:57AM -0600, Charles R Harris wrote:
>Could you expand a bit on what sort of data you have and how you deal with
>it. Where does it come from, how is it stored on disk, what do you do with
>it? That sort of thing.
3D and 4D images. Mostly stored on disk in
Hi,
On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
wrote:
>
>
> On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett
> wrote:
>>
>> Hi,
>>
>> On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
>> wrote:
>> >
>> >
>> > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote:
>> >>
>> >> This thread is get
On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett wrote:
> Hi,
>
> On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
> wrote:
> >
> >
> > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote:
> >>
> >> This thread is getting quite long, innit ?
> >> And I think it's getting a tad confusing, because we'
On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris
wrote:
>
>
> On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wrote:
>>
>> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
>> wrote:
>> >
>> >
>> > On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney
>> > wrote:
>> >>
>> >> On Fri, Jun 24, 2011 at 1
Hi,
On Sat, Jun 25, 2011 at 3:14 PM, Wes McKinney wrote:
...
> I hope you're right. So far it seems that anyone who has spent real
> time with R (e.g. myself, Nathaniel) has expressed serious concerns
> about the masked approach.
I'm sorry - I have been distracted. For my sake, and because this
Hi,
On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
wrote:
>
>
> On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote:
>>
>> This thread is getting quite long, innit ?
>> And I think it's getting a tad confusing, because we're mixing two
>> different concepts: missing values and masks.
>> There sh
On Sat, Jun 25, 2011 at 6:00 AM, Gael Varoquaux <
gael.varoqu...@normalesup.org> wrote:
> On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote:
> > I'm personally worried that the memory overhead of array.masks will
> > make many of us tend to avoid them. I work with images that can
> >
On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wrote:
> On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
> wrote:
> >
> >
> > On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney
> wrote:
> >>
> >> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith
> wrote:
> >> > On Fri, Jun 24, 2011 at 6:57 PM, Be
On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM wrote:
> This thread is getting quite long, innit ?
> And I think it's getting a tad confusing, because we're mixing two
> different concepts: missing values and masks.
> There should be support for missing values in numpy.core, I think we all
> agree on
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
wrote:
>
>
> On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wrote:
>>
>> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote:
>> > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote:
>> >> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith
On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote:
> I'm personally worried that the memory overhead of array.masks will
> make many of us tend to avoid them. I work with images that can
> easily get large enough that I would not want an array-items size byte
> array added to my storag
This thread is getting quite long, innit ?
And I think it's getting a tad confusing, because we're mixing two different
concepts: missing values and masks.
There should be support for missing values in numpy.core, I think we all agree
on that.
* What's been suggested of adding new dtypes (naflo
Hi,
On Sat, Jun 25, 2011 at 2:10 AM, Mark Wiebe wrote:
> On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett
> wrote:
>>
>> Hi,
>>
>> On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney
>> wrote:
>> ...
>> > Perhaps we should make a wiki page someplace summarizing pros and cons
>> > of the various implem
Hi,
On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe wrote:
> On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett
...
>> @Mark - I don't have a clear idea whether you consider the nafloat64
>> option to be still in play as the first thing to be implemented
>> (before array.mask). If it is, what kind of
On 2011-06-24 17:30, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote:
>> > On 2011-06-24 16:43, Robert Kern wrote:
>>> >>
>>> >> On Fri, Jun 24, 2011 at 09:33, Charles R Harris
>>> >>wrote:
>>>
> >>> >
> >>> > On Fri, Jun 24, 2011 at 8:06 AM, Robe
On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wrote:
> On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote:
> > On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote:
> >> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote:
> >>> This is a situation where I would just... use an array a
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith wrote:
> On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote:
>> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote:
>>> This is a situation where I would just... use an array and a mask,
>>> rather than a masked array. Then lots of things -
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root wrote:
> On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote:
>> This is a situation where I would just... use an array and a mask,
>> rather than a masked array. Then lots of things -- changing fill
>> values, temporarily masking/unmasking things
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith wrote:
> On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root wrote:
> > Another example of how we use masks in matplotlib is in pcolor(). We have
> > to combine the possible masks of X, Y, and V in both the x and y directions
> > to find the final
On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe wrote:
> On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wrote:
>
>> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
>> wrote:
>> >
>> >
>> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Fri, Jun 24, 2
On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root wrote:
> Another example of how we use masks in matplotlib is in pcolor(). We have
> to combine the possible masks of X, Y, and V in both the x and y directions
> to find the final mask to use for the final output result (because each
> facet needs v
On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett wrote:
> Hi,
>
> On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney
> wrote:
> ...
> > Perhaps we should make a wiki page someplace summarizing pros and cons
> > of the various implementation approaches?
>
> But - we should do this if it really is an ope
On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wrote:
> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
> wrote:
> >
> >
> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett
> > wrote:
> >>
> >> Hi,
> >>
> >> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root
> wrote:
> >> ...
> >> > Again, there
On Fri, Jun 24, 2011 at 6:10 PM, Charles R Harris wrote:
>
>
> On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett wrote:
>
>> Hi,
>>
>> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote:
>> ...
>> > Again, there are pros and cons either way and I see them very orthogonal and
>> > complementar
On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett wrote:
> Hi,
>
> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote:
> ...
> > Again, there are pros and cons either way and I see them very orthogonal and
> > complementary.
>
> That may be true, but I imagine only one of them will be implement
On Fri, Jun 24, 2011 at 4:24 PM, Nathaniel Smith wrote:
> On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe wrote:
> > For the maybe dtype, it would need to gain access to the ufunc loop of the
> > underlying dtype, and call it appropriately during the inner loop. This
> > appears to require some m
On Fri, Jun 24, 2011 at 4:09 PM, Benjamin Root wrote:
>
>
> On Fri, Jun 24, 2011 at 10:40 AM, Mark Wiebe wrote:
>
>> On Thu, Jun 23, 2011 at 7:56 PM, Benjamin Root wrote:
>>
>>> On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote:
>>>
Sorry y'all, I'm just commenting bits by bits:
On Fri, Jun 24, 2011 at 6:11 PM, Wes McKinney wrote:
> On Fri, Jun 24, 2011 at 8:02 PM, Charles R Harris
> wrote:
> >
> >
> > On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney
> wrote:
> >>
> >> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
> >> wrote:
> >> >
> >> >
> >> > On Fri, Jun 24, 2011
On Fri, Jun 24, 2011 at 3:38 PM, Lluís wrote:
> Mark Wiebe writes:
>
> > It should also be possible to accomplish a general
> > solution at the dtype level. We could have a 'dtype
> > factory' used like: np.zeros(10, dtype=np.maybe(float))
> > wh
On Fri, Jun 24, 2011 at 8:02 PM, Charles R Harris
wrote:
>
>
> On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney wrote:
>>
>> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
>> wrote:
>> >
>> >
>> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Fri, Jun
On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney wrote:
> On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
> wrote:
> >
> >
> > On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett
> > wrote:
> >>
> >> Hi,
> >>
> >> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root
> wrote:
> >> ...
> >> > Again, there
Hi,
On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wrote:
...
> Perhaps we should make a wiki page someplace summarizing pros and cons
> of the various implementation approaches?
But - we should do this if it really is an open question which one we
go for. If not then, we're just slowing Mark
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
wrote:
>
>
> On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett
> wrote:
>>
>> Hi,
>>
>> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote:
>> ...
>> > Again, there are pros and cons either way and I see them very orthogonal
>> > and
>> > complem
On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett wrote:
> Hi,
>
> On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote:
> ...
> > Again, there are pros and cons either way and I see them very orthogonal and
> > complementary.
>
> That may be true, but I imagine only one of them will be implement
On Thu, Jun 23, 2011 at 07:51:25PM -0400, josef.p...@gmail.com wrote:
> From the perspective of statistical analysis, I don't see much
> advantage of this. What to do with nans depends on the analysis, and
> needs to be looked at for each case.
From someone who actually sometimes does statistics
Hi,
On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root wrote:
...
> Again, there are pros and cons either way and I see them very orthogonal and
> complementary.
That may be true, but I imagine only one of them will be implemented.
@Mark - I don't have a clear idea whether you consider the nafloat
On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe wrote:
> For the maybe dtype, it would need to gain access to the ufunc loop of the
> underlying dtype, and call it appropriately during the inner loop. This
> appears to require some more invasive upheaval within the ufunc code than
> the masking appro
On Fri, Jun 24, 2011 at 10:40 AM, Mark Wiebe wrote:
> On Thu, Jun 23, 2011 at 7:56 PM, Benjamin Root wrote:
>
>> On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote:
>>
>>> Sorry y'all, I'm just commenting bits by bits:
>>>
>>> "One key problem is a lack of orthogonality with other features, for
>
Mark Wiebe writes:
> It should also be possible to accomplish a general
> solution at the dtype level. We could have a 'dtype
> factory' used like: np.zeros(10, dtype=np.maybe(float))
> where np.maybe(x) returns a new dtype whose storage size
>
On Fri, Jun 24, 2011 at 1:18 PM, Matthew Brett wrote:
> Hi,
>
> On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe wrote:
> > On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett
> > wrote:
> >>
> >> Hi,
> >>
> >> On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith wrote:
> ...
> >> and the fact that 'missing_
On Fri, Jun 24, 2011 at 1:04 PM, Matthew Brett wrote:
> Hi,
>
> Just as a use case, if I do this:
>
> a = np.zeros((big_number,), dtype=np.int32)
> a[0] = np.NA
>
> I think I'm right in saying that, with the array.mask implementation
> my array memory usage will grow by big_number bytes, whe
On Fri, Jun 24, 2011 at 12:06 PM, Wes McKinney wrote:
> On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe wrote:
> > On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote:
> >>
> >> On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe wrote:
> >> > On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith
> wrote:
On Fri, Jun 24, 2011 at 11:54 AM, Nathaniel Smith wrote:
> On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe wrote:
> > On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote:
> >> But on the other hand, we gain:
> >> -- simpler implementation: no need to be checking and tracking the
> >> mask buffe
Hi,
On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe wrote:
> On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett
> wrote:
>>
>> Hi,
>>
>> On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith wrote:
...
>> and the fact that 'missing_value' could be any type would make the
>> code more complicated than the cu
On Fri, Jun 24, 2011 at 11:25 AM, Christopher Barker
wrote:
> Robert Kern wrote:
>
> > It's worth noting that this is not a replacement for masked arrays,
> > nor is it intended to be the be-all, end-all solution to missing data
> > problems. It's mostly just intended to be a focused tool to fill
On Fri, Jun 24, 2011 at 11:25 AM, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 11:13, Christopher Barker
> wrote:
> > Nathaniel Smith wrote:
> >
> >> If we think that the memory overhead for floating point types is too
> >> high, it would be easy to add a special case where maybe(float) used a
>
On Fri, Jun 24, 2011 at 11:13 AM, Christopher Barker
wrote:
> Nathaniel Smith wrote:
> >> The 'dtype factory' idea builds on the way I've structured datetime as a
> >> parameterized type,
>
> ...
>
> Another disadvantage is that we get further from Gael Varoquaux's point:
> >> Right now, the nump
Hi,
Just as a use case, if I do this:
a = np.zeros((big_number,), dtype=np.int32)
a[0] = np.NA
I think I'm right in saying that, with the array.mask implementation
my array memory usage will grow by big_number bytes, whereas with
the np.naint32 implementation you'd get something like:
Err
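
The memory concern in this use case comes down to simple arithmetic. The following is a minimal sketch, assuming a separate mask that stores one byte per array element (the figure used in the message above); the helper name mask_overhead is hypothetical and is not part of NumPy.

```python
def mask_overhead(n_items, itemsize, mask_itemsize=1):
    """Return (data_bytes, mask_bytes, relative_overhead).

    Assumes the mask is a separate buffer holding `mask_itemsize`
    bytes per element, as in the one-byte-per-item mask discussed
    in the thread. This is an illustration, not a NumPy API.
    """
    data_bytes = n_items * itemsize
    mask_bytes = n_items * mask_itemsize
    return data_bytes, mask_bytes, mask_bytes / data_bytes


# Example: an int32 array (4 bytes per item) with a byte mask.
data, mask, ratio = mask_overhead(1000, 4)
```

Under these assumptions an int32 array pays a quarter of its data size again for the mask, while a float64 array would pay an eighth; a bit-packed mask would change the numbers but not the shape of the trade-off.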
On Fri, Jun 24, 2011 at 10:07 AM, Matthew Brett wrote:
> Hi,
>
> On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern
> wrote:
> > On Fri, Jun 24, 2011 at 09:33, Charles R Harris
> > wrote:
> >>
> >> On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern
> wrote:
> >
> >>> The alternative proposal would be to ad
On Fri, Jun 24, 2011 at 10:02 AM, Pierre GM wrote:
> On Jun 24, 2011, at 4:44 PM, Robert Kern wrote:
>
> > On Fri, Jun 24, 2011 at 09:35, Robert Kern
> wrote:
> >> On Fri, Jun 24, 2011 at 09:24, Keith Goodman
> wrote:
> >>> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern
> wrote:
> >>>
> The
On Fri, Jun 24, 2011 at 9:27 AM, Bruce Southey wrote:
> On 06/24/2011 09:06 AM, Robert Kern wrote:
>
> On Fri, Jun 24, 2011 at 07:30, Laurent Gautier
> wrote:
>
> On 2011-06-24 13:59, Nathaniel Smith
> wrote:
>
> On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root
> wrote:
>
> Lastly
On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe wrote:
> On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote:
>>
>> On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe wrote:
>> > On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith wrote:
>> >> It should also be possible to accomplish a general solution
On Fri, Jun 24, 2011 at 8:57 AM, Keith Goodman wrote:
> On Thu, Jun 23, 2011 at 3:24 PM, Mark Wiebe wrote:
> > On Thu, Jun 23, 2011 at 5:05 PM, Keith Goodman
> wrote:
> >>
> >> On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe wrote:
> >> > Enthought has asked me to look into the "missing data" prob
On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe wrote:
> On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote:
>> But on the other hand, we gain:
>> -- simpler implementation: no need to be checking and tracking the
>> mask buffer everywhere. The needed infrastructure is already built in.
>
> I do
On Fri, Jun 24, 2011 at 8:01 AM, Neal Becker wrote:
> Just 1 question before I look more closely. What is the cost to the non-MA user
> of this addition?
>
I'm following the idea that you don't pay for what you don't use. All the
existing stuff will perform the same.
-Mark
> __
On Fri, Jun 24, 2011 at 7:30 AM, Laurent Gautier wrote:
> On 2011-06-24 13:59, Nathaniel Smith wrote:
> > On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root wrote:
> >> Lastly, I am not entirely familiar with R, so I am also very curious about
> >> what this magical "NA" value is, and how it com
On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett wrote:
> Hi,
>
> On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith wrote:
> ...
> > If we think that the memory overhead for floating point types is too
> > high, it would be easy to add a special case where maybe(float) used a
> > distinguished NaN i
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith wrote:
> On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe wrote:
> > On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith wrote:
> >> It should also be possible to accomplish a general solution at the
> >> dtype level. We could have a 'dtype factory' us
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote:
> The alternative proposal would be to add a few new dtypes that are
> NA-aware. E.g. an nafloat64 would reserve a particular NaN value
> (there are lots of different NaN bit patterns, we'd just reserve one)
> that would represent NA. An naint32
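
The "reserve one NaN bit pattern" idea quoted above can be illustrated with the standard library alone. This is only a sketch under stated assumptions: the payload in NA_BITS is an arbitrary, hypothetical choice, and the helper names are made up for illustration. It is not how any proposed nafloat64 dtype is implemented.

```python
import math
import struct


def float_from_bits(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def bits_from_float(value):
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# A plain quiet NaN: exponent all ones, top mantissa bit set.
QUIET_NAN_BITS = 0x7FF8000000000000

# Hypothetical reserved pattern: the same quiet NaN with a nonzero
# payload in the low mantissa bits. The payload value is an assumption.
NA_BITS = 0x7FF80000000007A2


def is_na(value):
    """True only for the one reserved bit pattern, not for other NaNs."""
    return bits_from_float(value) == NA_BITS


na = float_from_bits(NA_BITS)
ordinary_nan = float_from_bits(QUIET_NAN_BITS)
both_are_nan = math.isnan(na) and math.isnan(ordinary_nan)
```

Both values are NaN as far as math.isnan is concerned, and only the bit-level comparison tells them apart. Whether a payload survives arithmetic depends on the hardware, so a real implementation would have to check for NA explicitly rather than rely on propagation.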
Robert Kern wrote:
> It's worth noting that this is not a replacement for masked arrays,
> nor is it intended to be the be-all, end-all solution to missing data
> problems. It's mostly just intended to be a focused tool to fill in
> the gaps where masked arrays are less convenient for whatever rea
On Fri, Jun 24, 2011 at 11:13, Christopher Barker wrote:
> Nathaniel Smith wrote:
>
>> If we think that the memory overhead for floating point types is too
>> high, it would be easy to add a special case where maybe(float) used a
>> distinguished NaN instead of a separate boolean.
>
> That would
On Thu, Jun 23, 2011 at 8:00 PM, Pierre GM wrote:
>
> On Jun 24, 2011, at 2:42 AM, Mark Wiebe wrote:
>
> > On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote:
> > Sorry y'all, I'm just commenting bits by bits:
> >
> > "One key problem is a lack of orthogonality with other features, for
> instance
Nathaniel Smith wrote:
>> The 'dtype factory' idea builds on the way I've structured datetime as a
>> parameterized type,
...
Another disadvantage is that we get further from Gael Varoquaux's point:
>> Right now, the numpy array can be seen as an extension of the C
>> array, basically a pointer
On Fri, Jun 24, 2011 at 8:30 AM, Robert Kern wrote:
> I would suggest following R's lead and letting ((NA==NA) == True)
> unlike NaNs.
In R, NA and NaN do behave differently with respect to ==, but not the
way you're saying:
> NA == NA
[1] NA
> if (NA == NA) 1;
Error in if (NA == NA) 1 : missing
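
The R behaviour shown above (comparison yields NA, and using NA as a condition is an error) can be mimicked in Python for illustration. This is a minimal sketch, assuming a hypothetical singleton named NA; it is not an existing NumPy object and says nothing about what semantics NumPy would adopt.

```python
class NAType:
    """Hypothetical missing-value singleton with R-like comparison rules."""

    def __repr__(self):
        return "NA"

    def __eq__(self, other):
        # Comparing with a missing value gives a missing result.
        return self

    def __ne__(self, other):
        return self

    def __bool__(self):
        # Mirrors R refusing to use NA where TRUE/FALSE is needed.
        raise TypeError("missing value where True/False needed")


NA = NAType()

result = (NA == NA)   # this is NA itself, not True or False

try:
    if NA == NA:
        pass
except TypeError as exc:
    message = str(exc)
```

The design point is that equality does not collapse to a boolean: the unknown stays unknown until the caller decides how to handle it.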
On Fri, Jun 24, 2011 at 11:05, Nathaniel Smith wrote:
> On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern wrote:
>> On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote:
>>> Maybe there is not so much need for reservation over the string NA, when
>>> making the distinction between:
>>> a- the intern
On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote:
>> Maybe there is not so much need for reservation over the string NA, when
>> making the distinction between:
>> a- the internal representation of a "missing string" (what is stored in
>>
On Thu, Jun 23, 2011 at 7:56 PM, Benjamin Root wrote:
> On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM wrote:
>
>> Sorry y'all, I'm just commenting bits by bits:
>>
>> "One key problem is a lack of orthogonality with other features, for
>> instance creating a masked array with physical quantities ca
On Fri, Jun 24, 2011 at 10:02, Pierre GM wrote:
>
> On Jun 24, 2011, at 4:44 PM, Robert Kern wrote:
>
>> On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote:
>>> On Fri, Jun 24, 2011 at 09:24, Keith Goodman wrote:
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote:
> The alternative
On Fri, Jun 24, 2011 at 10:07, Laurent Gautier wrote:
> On 2011-06-24 16:43, Robert Kern wrote:
>>
>> On Fri, Jun 24, 2011 at 09:33, Charles R Harris
>> wrote:
>>>
>>> >
>>> > On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern
>>> > wrote:
>> The alternative proposal would be to add a few
Hi,
On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 09:33, Charles R Harris
> wrote:
>>
>> On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern wrote:
>
>>> The alternative proposal would be to add a few new dtypes that are
>>> NA-aware. E.g. an nafloat64 would reserve a p
On 2011-06-24 16:43, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 09:33, Charles R Harris
> wrote:
>> >
>> > On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern
>> > wrote:
>>> >> The alternative proposal would be to add a few new dtypes that are
>>> >> NA-aware. E.g. an nafloat64 would reserve a
On Jun 24, 2011, at 4:44 PM, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote:
>> On Fri, Jun 24, 2011 at 09:24, Keith Goodman wrote:
>>> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote:
>>>
The alternative proposal would be to add a few new dtypes that are
N
On Fri, Jun 24, 2011 at 8:44 AM, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote:
> > On Fri, Jun 24, 2011 at 09:24, Keith Goodman
> wrote:
> >> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern
> wrote:
> >>
> >>> The alternative proposal would be to add a few new dtypes that
On Fri, Jun 24, 2011 at 09:35, Robert Kern wrote:
> On Fri, Jun 24, 2011 at 09:24, Keith Goodman wrote:
>> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote:
>>
>>> The alternative proposal would be to add a few new dtypes that are
>>> NA-aware. E.g. an nafloat64 would reserve a particular NaN