Re: [Numpy-discussion] Concepts for masked/missing data

Benjamin Root Sat, 25 Jun 2011 12:10:06 -0700

On Sat, Jun 25, 2011 at 1:57 PM, Nathaniel Smith <n...@pobox.com> wrote:


> On Sat, Jun 25, 2011 at 11:50 AM, Eric Firing <efir...@hawaii.edu> wrote:
> > On 06/25/2011 07:05 AM, Nathaniel Smith wrote:
> >> On Sat, Jun 25, 2011 at 9:26 AM, Matthew Brett <matthew.br...@gmail.com> wrote:
> >>> To clarify, you're proposing for:
> >>>
> >>> a = np.sum(np.array([np.NA, np.NA]))
> >>>
> >>> 1) ->  np.NA
> >>> 2) ->  0.0
> >>
> >> Yes -- and in R you actually do get NA, while in numpy.ma you
> >> actually do get 0. I don't think this is a coincidence; I think it's
> >
> > No, you don't:
> >
> > In [2]: np.ma.array([2, 4], mask=[True, True]).sum()
> > Out[2]: masked
> >
> > In [4]: np.sum(np.ma.array([2, 4], mask=[True, True]))
> > Out[4]: masked
>
> Huh. So in numpy.ma, sum([10, NA]) and sum([10]) are the same, but
> sum([NA]) and sum([]) are different? Sounds to me like you should file
> a bug on numpy.ma...
>

Actually, no... I should have tested this before replying earlier:

>>> a = np.ma.array([2, 4], mask=[True, True])
>>> a
masked_array(data = [-- --],
             mask = [ True  True],
       fill_value = 999999)

>>> a.sum()
masked
>>> a = np.ma.array([], mask=[])
>>> a
masked_array(data = [],
             mask = [],
       fill_value = 1e+20)
>>> a.sum()
masked

They are the same.


> Anyway, the general point is that in R, NA's propagate, and in
> numpy.ma, masked values are ignored (except, apparently, if all values
> are masked). Here, I actually checked these:
>
> Python: np.ma.array([2, 4], mask=[True, False]).sum() -> 4
> R: sum(c(NA, 4)) -> NA
>
>
If you want NaN behavior, then use NaNs.  If you want masked behavior, then
use masks.
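
To make that distinction concrete, here is a minimal sketch, assuming a reasonably current NumPy. The variable names are only for illustration, and np.nansum is brought in as a point of comparison; it is not mentioned elsewhere in this thread.

```python
import numpy as np

# NaN behavior: a NaN propagates through an ordinary sum.
nan_sum = np.sum(np.array([np.nan, 4.0]))

# Masked behavior: masked entries are skipped, so only the 4 is summed.
masked_sum = np.ma.array([2, 4], mask=[True, False]).sum()

# When every entry is masked there is nothing left to sum,
# and the result is the np.ma.masked singleton.
all_masked_sum = np.ma.array([2, 4], mask=[True, True]).sum()

# np.nansum is the explicit opt-in for skipping NaNs in a plain array.
nan_skipped_sum = np.nansum(np.array([np.nan, 4.0]))

print(nan_sum, masked_sum, all_masked_sum, nan_skipped_sum)
```

Propagation is the default for NaNs and skipping is the default for masks, so the choice of container already states which semantics you want.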

Ben Root

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
