On Thu, Jun 23, 2011 at 4:46 PM, Charles R Harris <[email protected]> wrote:
> On Thu, Jun 23, 2011 at 2:53 PM, Mark Wiebe <[email protected]> wrote:
>
>> Enthought has asked me to look into the "missing data" problem and how
>> NumPy could treat it better. I've considered the different ideas of adding
>> dtype variants with a special signal value and masked arrays, and concluded
>> that adding masks to the core ndarray appears to be the best way to deal
>> with the problem in general.
>>
>> I've written a NEP that proposes a particular design, viewable here:
>>
>> https://github.com/m-paradox/numpy/blob/cmaskedarray/doc/neps/c-masked-array.rst
>>
>> There are some questions at the bottom of the NEP which definitely need
>> discussion to find the best design choices. Please read, and let me know of
>> all the errors and gaps you find in the document.
>>
>
> I agree that low level support for masks is the way to go.
>
>
> If all the input values are masked, 'sum' and 'prod' will produce the
> additive and multiplicative identities respectively
>
> A masked zero dimensional array might be another option, depending on how
> you handle scalars. This would also work when arrays were summed down an
> axis if a masked array was returned.
>

I think there has to be a difference like with "sum" and "nansum". Maybe
control over this would be a parameter to the sum function, indicating how to
interpret masked values.

> I suppose the problem with using the word 'mask' is the implication that it
> hides something. Maybe 'window' would be an alternate choice, although in
> this context I tend to think of 'mask' as having the meaning you assign to
> it.
>

Some copy/paste from the NEP:

There is some consternation about the conventional True/False interpretation
of the mask, centered around the name "mask". One possibility to deal with
this is to call it a "validity mask" in all documentation, which more clearly
indicates that True means valid data.
If this isn't sufficient, an alternate name for the attribute could be found,
like "a.validitymask", "a.validmask", or "a.validity".

-Mark

> Chuck
>
>> Thanks,
>> Mark
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> [email protected]
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
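
To make the "sum" versus "nansum" distinction above concrete, here is a minimal pure-Python sketch. It is not the NEP's design and not a NumPy API: the names masked_reduce, masked_sum, masked_prod, the skip_masked parameter, and the MASKED sentinel are all hypothetical, chosen only for illustration. It assumes the "validity mask" reading discussed above, where True means valid data.

```python
from functools import reduce
import operator

# Hypothetical sentinel standing in for a "masked" (missing) scalar result.
MASKED = object()


def masked_reduce(values, valid, op, identity, skip_masked=True):
    """Reduce `values` with `op`; valid[i] is True where the data is valid.

    skip_masked=True  : ignore masked elements (nansum-like). If every
                        element is masked, the result is `identity`.
    skip_masked=False : any masked element makes the whole result MASKED
                        (the mask propagates, like NaN does in a plain sum).
    """
    if len(values) != len(valid):
        raise ValueError("values and valid must have the same length")
    if not skip_masked and not all(valid):
        return MASKED
    kept = [v for v, ok in zip(values, valid) if ok]
    return reduce(op, kept, identity)


def masked_sum(values, valid, skip_masked=True):
    # Additive identity is 0.
    return masked_reduce(values, valid, operator.add, 0, skip_masked)


def masked_prod(values, valid, skip_masked=True):
    # Multiplicative identity is 1.
    return masked_reduce(values, valid, operator.mul, 1, skip_masked)


data = [1.0, 2.0, 4.0]
print(masked_sum(data, [True, False, True]))     # 5.0
print(masked_sum(data, [False, False, False]))   # 0
print(masked_prod(data, [False, False, False]))  # 1
print(masked_sum(data, [True, False, True], skip_masked=False) is MASKED)  # True
```

With skip_masked=True an all-masked input falls back to the additive or multiplicative identity, which is the behaviour quoted from the NEP; with skip_masked=False the result itself is masked, which is closer to the masked zero-dimensional result suggested above.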
