Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Matthew Brett Fri, 24 Jun 2011 17:02:16 -0700

Hi,

On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney <[email protected]> wrote:
...
> Perhaps we should make a wiki page someplace summarizing pros and cons
> of the various implementation approaches?


But we should do this only if it really is an open question which one
we go for.  If not, then we're just slowing Mark down in getting to
the implementation.

Assuming the question is still open, here's a starter for the pros and cons:

array.mask:
1) It's easier / neater to implement
2) It can generalize across dtypes
3) You can still get the masked data underneath the mask (allowing you
to unmask etc)

nafloat64:
1) No memory overhead
2) Battle-tested implementation already done in R
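To make the two lists concrete, here is a minimal sketch of the two
storage layouts.  It is an illustration only: nafloat64 does not exist
as a NumPy dtype, so NaN stands in for the in-band NA marker, and a
plain boolean array stands in for the proposed array.mask.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# array.mask style (stand-in): a separate boolean array marks missing items.
mask = np.array([False, True, False, False])
mask_sum = values[~mask].sum()   # sum over the unmasked items only
hidden = values[mask]            # the masked value is still stored underneath

# nafloat64 style (stand-in): the missing marker lives in the data itself.
# NaN is used here as an assumed substitute for a dedicated NA bit pattern.
na_values = values.copy()
na_values[1] = np.nan            # the original 2.0 is overwritten
na_sum = np.nansum(na_values)    # sum that skips the in-band markers

print("mask sum:", mask_sum, "extra bytes for mask:", mask.nbytes)
print("NA sum:  ", na_sum, "extra bytes for NA:  ",
      na_values.nbytes - values.nbytes)
```

Both sums agree, but only the mask version keeps the original value,
and only the in-band version avoids the extra bytes.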

I guess we'd have to test directly whether keeping the mask and the
data in separate, non-contiguous blocks of memory would cause enough
cache misses to outweigh the potential cycle savings from the
single-byte comparisons in array.mask.
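A rough way to start on such a test, sketched under loud assumptions:
this times NumPy-level operations rather than the C inner loops either
implementation would actually use, NaN again stands in for the NA bit
pattern, and the array size and 10% missing fraction are arbitrary.
It is a proxy for the comparison, not a verdict on it.

```python
import timeit
import numpy as np

n = 1_000_000                      # arbitrary size, chosen for illustration
rng = np.random.default_rng(0)
values = rng.random(n)
mask = rng.random(n) < 0.1         # separate mask, one byte per item
na_values = np.where(mask, np.nan, values)   # in-band marker (NaN stand-in)

def sum_with_mask():
    # Reads two separate buffers: the mask bytes and the float data.
    return values[~mask].sum()

def sum_with_nan():
    # Reads one buffer, but must test every 8-byte float for the marker.
    return na_values[~np.isnan(na_values)].sum()

t_mask = timeit.timeit(sum_with_mask, number=5)
t_nan = timeit.timeit(sum_with_nan, number=5)
print(f"separate mask: {t_mask:.4f} s   in-band marker: {t_nan:.4f} s")
```

The timings will vary by machine and array size, so the numbers only
mean something when compared on the hardware and data you care about.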

I guess that one and only one of these will get written, and that one
of these choices may scratch the current and future masked-array itch
a lot better than the other.

I'm personally worried that the memory overhead of array.mask will
make many of us tend to avoid it.  I work with images that can easily
get large enough that I would not want an extra byte array, with one
byte per array item, added to my storage.
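The overhead is easy to put numbers on.  The sketch below assumes one
mask byte per array item, as described above; the helper name and the
512 x 512 x 200 volume are made up for illustration.

```python
import math
import numpy as np

def mask_overhead(shape, dtype):
    """Return (data_bytes, mask_bytes, mask_bytes / data_bytes).

    Assumes the mask costs exactly one byte per array item.
    """
    n_items = math.prod(shape)
    data_bytes = n_items * np.dtype(dtype).itemsize
    mask_bytes = n_items
    return data_bytes, mask_bytes, mask_bytes / data_bytes

shape = (512, 512, 200)            # hypothetical image volume
for dtype in (np.uint8, np.int16, np.float32, np.float64):
    data_bytes, mask_bytes, ratio = mask_overhead(shape, dtype)
    print(f"{np.dtype(dtype).name:>8}: data {data_bytes / 2**20:6.0f} MiB, "
          f"mask {mask_bytes / 2**20:4.0f} MiB, overhead {ratio:.1%}")
```

The relative cost grows as the item size shrinks: the mask adds 12.5%
to a float64 array but doubles the storage of a uint8 image.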

I'm asking for more details about the implementation because that is
most of the argument for array.mask at the moment (points 1 and 2 in
the array.mask list above).

See you,

Matthew