On Wed, Jun 29, 2011 at 1:32 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
> Hi,
>
> On Wed, Jun 29, 2011 at 6:22 PM, Mark Wiebe <mwwi...@gmail.com> wrote:
> > On Wed, Jun 29, 2011 at 8:20 AM, Lluís <xscr...@gmx.net> wrote:
> >>
> >> Matthew Brett writes:
> >>
> >> >> Maybe instead of np.NA, we could say np.IGNORE, which sort of conveys
> >> >> the idea that the entry is still there, but we're just ignoring it. Of
> >> >> course, that goes against common convention, but it might be easier to
> >> >> explain.
> >>
> >> > I think Nathaniel's point is that np.IGNORE is a different idea than
> >> > np.NA, and that is why joining the implementations can lead to
> >> > conceptual confusion.
> >>
> >> This is how I see it:
> >>
> >> >>> a = np.array([0, 1, 2], dtype=int)
> >> >>> a[0] = np.NA
> >> ValueError
> >> >>> e = np.array([np.NA, 1, 2], dtype=int)
> >> ValueError
> >> >>> b = np.array([np.NA, 1, 2], dtype=np.maybe(int))
> >> >>> m = np.array([np.NA, 1, 2], dtype=int, masked=True)
> >> >>> bm = np.array([np.NA, 1, 2], dtype=np.maybe(int), masked=True)
> >> >>> b[1] = np.NA
> >> >>> np.sum(b)
> >> np.NA
> >> >>> np.sum(b, skipna=True)
> >> 2
> >> >>> b.mask
> >> None
> >> >>> m[1] = np.NA
> >> >>> np.sum(m)
> >> 2
> >> >>> np.sum(m, skipna=True)
> >> 2
> >> >>> m.mask
> >> [False, False, True]
> >> >>> bm[1] = np.NA
> >> >>> np.sum(bm)
> >> 2
> >> >>> np.sum(bm, skipna=True)
> >> 2
> >> >>> bm.mask
> >> [False, False, True]
> >>
> >> So:
> >>
> >> * Mask takes precedence over bit pattern on element assignment. There's
> >>   still the question of how to assign a bit pattern NA when the mask is
> >>   active.
> >>
> >> * When using mask, elements are automagically skipped.
> >>
> >> * "m[1] = np.NA" is equivalent to "m.mask[1] = False"
> >>
> >> * When using bit pattern + mask, it might make sense to have the initial
> >>   values as bit-pattern NAs, instead of masked (i.e., "bm.mask == [True,
> >>   False, True]" and "np.sum(bm) == np.NA")
> >
> > There seems to be a general idea that masks and NA bit patterns imply
> > particular differing semantics, something which I think is simply false.
>
> Well - first - it's helpful surely to separate the concepts and the
> implementation.
>
> Concepts / use patterns (as delineated by Nathaniel):
> A) missing values == 'np.NA' in my emails. Can we call that CMV
>    (concept missing values)?
> B) masks == np.IGNORE in my emails. CMSK (concept masks)?
>
> Implementations
> 1) bit-pattern == na-dtype - how about we call that IBP
>    (implementation bit pattern)?
> 2) array.mask. IM (implementation mask)?
>

Remember that the masks are invisible: you can't see them, they are an
implementation detail. A good reason to hide the implementation is so it
can be changed without impacting software that depends on the API.

<snip>

Chuck
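
For anyone trying to keep the two concepts apart, here is a rough sketch of CMV versus CMSK behaviour. It does not use the np.NA, np.maybe, or masked=True spellings discussed above. It assumes only things NumPy already provides: a float NaN standing in for a propagating missing value, and the numpy.ma module standing in for an ignorable mask. Note that numpy.ma uses True to mean "masked", which appears to be the opposite of the mask convention in the quoted example. The variable names are made up for illustration.

```python
import numpy as np
import numpy.ma as ma

# CMV-like: NaN plays the role of a missing value and propagates.
b = np.array([np.nan, 1.0, 2.0])
propagated = np.sum(b)      # NaN, the missing value poisons the result
skipped = np.nansum(b)      # skipping is an explicit opt-in

# CMSK-like: masking hides an element but keeps the stored value.
m = ma.array([0, 1, 2])
m[1] = ma.masked            # hide element 1 (in numpy.ma, True = masked)
masked_sum = m.sum()        # masked element is skipped automatically
hidden_value = m.data[1]    # the original 1 is still stored underneath

m.mask = False              # unmask everything
restored_sum = m.sum()      # the hidden value takes part again
```

The point of the contrast: with the NaN stand-in the original value is gone and reductions propagate unless told otherwise, while with the mask the value is only hidden and comes back once the mask is cleared.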
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion