Re: [Numpy-discussion] missing data discussion round 2

Pierre GM Tue, 28 Jun 2011 13:45:51 -0700

All,
I'm not sure I understand some aspects of Mark's new proposal, sorry (blame the 
lack of sleep).
I'm pretty excited about the idea of a built-in NA like 
np.dtype(NA['float64']), provided we can come up with some shortcuts like 
np.nafloat64. I think that would really take care of the missing data part in 
a consistent and unambiguous way. 
However, I understand that if a choice had to be made, this approach would be 
dropped in favor of the more generic "mask way", right? (By "mask way", I mean 
something close to the numpy.ma approach, but actually optimized.)
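
To make the two options concrete, here is a minimal sketch using what exists 
today. numpy.ma stands in for the "mask way"; a plain NaN stands in for the 
"bit way". The NaN stand-in is only an assumption for illustration, since the 
proposed NA dtype is not available here, and the variable names are made up.

```python
import numpy as np
import numpy.ma as ma

data = np.array([1.0, 2.0, 3.0, 4.0])

# "Mask way": a separate boolean array flags the missing entry,
# and the underlying value is still stored next to it.
masked = ma.array(data, mask=[False, True, False, False])
mask_mean = masked.mean()              # the masked entry is skipped
payload_still_there = masked.data[1]   # the original 2.0 is recoverable

# "Bit way" stand-in: the marker lives in the data itself (NaN here),
# so the original value at that position is overwritten.
inband = data.copy()
inband[1] = np.nan
inband_mean = np.nanmean(inband)       # NaN-aware reduction skips the marker
```

Both reductions ignore the missing entry; the difference is whether the 
original value survives underneath the marker.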


So, taking this example
>>> np.add(a, b, out=b, mask=(a > threshold))
If 'b' doesn't already have a mask, will the masked values be lost if we go 
the mask way, but kept if we go the bit way? If so, I prefer the latter.
Another advantage I see in the "bit way" is that it's pretty close to the 
'hardmask' idea: you never risk losing the mask, as it's already "burned" into 
the array...
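
A rough sketch of both points with today's tools. NumPy ufuncs accept a 
where= keyword that plays a role similar to the mask= argument in the example 
above; whether its semantics match the proposal exactly is an assumption. The 
'hardmask' behaviour is shown with numpy.ma's hard_mask option. The array 
values and the threshold are illustrative only.

```python
import numpy as np
import numpy.ma as ma

a = np.array([1.0, 5.0, 2.0, 8.0])
b = np.full(4, 10.0)
threshold = 3.0  # illustrative value

# where= restricts the update; unselected slots of b keep their old values.
np.add(a, b, out=b, where=(a > threshold))
# b is a plain ndarray, so the selection is not recorded anywhere on it.
selection_kept = hasattr(b, "mask")

# Soft mask (numpy.ma default): assigning to a masked slot unmasks it.
soft = ma.array([1.0, 2.0, 3.0], mask=[False, True, False])
soft[1] = 20.0
soft_mask_after = ma.getmaskarray(soft).tolist()

# Hard mask: assignment cannot unmask, so the mask survives the write.
hard = ma.array([1.0, 2.0, 3.0], mask=[False, True, False], hard_mask=True)
hard[1] = 20.0
hard_mask_after = ma.getmaskarray(hard).tolist()
```

With the soft mask the assignment clears the flag; with the hard mask the 
flag stays in place, which is the property the "bit way" would give for free.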


And now for something not that completely different:
* Would it be possible to store only the addresses of the NAs internally (in 
the metadata?) to save some space, and still get a boolean array with the same 
shape as the underlying array when the .mask or .valid property is called?
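
One way to picture this idea, as a sketch only: SparseNA below is a 
hypothetical container, not part of NumPy or of the proposal. It keeps the 
flat indices of the missing entries and builds the full boolean array on 
demand.

```python
import numpy as np


class SparseNA:
    """Hypothetical container storing only the flat indices of missing entries."""

    def __init__(self, data, na_indices):
        self.data = np.asarray(data)
        # Sorted, de-duplicated flat positions of the NA entries.
        self.na_flat = np.unique(np.asarray(na_indices, dtype=np.intp))

    @property
    def mask(self):
        # Expand the stored indices into a boolean array of the data's shape.
        m = np.zeros(self.data.shape, dtype=bool)
        m.flat[self.na_flat] = True
        return m

    @property
    def valid(self):
        # Complement of the mask: True where a value is present.
        return ~self.mask


arr = SparseNA(np.arange(6.0).reshape(2, 3), na_indices=[1, 5])
full_mask = arr.mask
```

The trade-off is that each access to .mask or .valid allocates a new boolean 
array, so the space saving only pays off when NAs are rare and the mask is 
requested infrequently.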
