On 07/06/2011 03:37 PM, Pierre GM wrote:
> On Jul 6, 2011, at 10:11 PM, Bruce Southey wrote:
>
>> On 07/06/2011 02:38 PM, Christopher Jordan-Squire wrote:
>>>
>>> On Wed, Jul 6, 2011 at 11:38 AM, Christopher Barker wrote:
>>> Christopher Jordan-Squire wrote:
>>>> If we follow those rules for IGNORE for all computations, we sometimes
>>>> get some weird output. For example:
>>>> [ [1, 2], [3, 4] ] * [ IGNORE, 7] = [ 15, 31 ]. (Where * is matrix
>>>> multiply and not * with broadcasting.) Or should that sort of operation
>>>> throw an error?
>>> That should throw an error -- matrix computation is heavily influenced
>>> by the shape and size of matrices, so I think IGNORES really don't make
>>> sense there.
>>>
>>> If the IGNORES don't make sense in basic numpy computations then I'm kinda
>>> confused why they'd be included at the numpy core level.
>>>
>>> Nathaniel Smith wrote:
>>>> It's exactly this transparency that worries Matthew and me -- we feel
>>>> that the alterNEP preserves it, and the NEP attempts to erase it. In
>>>> the NEP, there are two totally different underlying data structures,
>>>> but this difference is blurred at the Python level. The idea is that
>>>> you shouldn't have to think about which you have, but if you work with
>>>> C/Fortran, then of course you do have to be constantly aware of the
>>>> underlying implementation anyway.
>>> I don't think this bothers me -- I think it's analogous to things in
>>> numpy like Fortran order and non-contiguous arrays -- you can ignore all
>>> that when working in pure python when performance isn't critical, but
>>> you need a deeper understanding if you want to work with the data in C
>>> or Fortran or to tune performance in python.
>>>
>>> So as long as there is an API to query and control how things work, I
>>> like that it's hidden from simple python code.
>>>
>>> -Chris
>>>
>>> I'm similarly not too concerned about it. Performance seems finicky when
>>> you're dealing with missing data, since a lot of arrays will likely have to
>>> be copied over to other arrays containing only complete data before being
>>> handed over to BLAS. My primary concern is that the np.NA stuff 'just
>>> works'. Especially since I've never run into use cases in statistics where
>>> the difference between IGNORE and NA mattered.
>>>
>> Exactly!
>> I have not been able to think of a real example where that difference
>> matters as the calculations are only on the 'valid' (ie non-missing and
>> non-masked) values.
> In practice, they could be treated the same way (ie, skipped). However, they
> are conceptually different and one may wish to keep this difference of
> information around (between NAs you didn't have and IGNOREs you just dropped
> temporarily).

I have yet to see these as *conceptually different* in any of the arguments given.
Separate NAs or IGNORES or any number of missing value codes just requires us to avoid 'unmasking' those missing value codes in your array as, I presume, like masked arrays, you need some placeholder values.

Bruce
_______________________________________________
NumPy-Discussion mailing list
http://mail.scipy.org/mailman/listinfo/numpy-discussion
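
For anyone who wants to try the placeholder point from the reply above, here is a minimal sketch using the existing numpy.ma module (not the proposed np.NA or IGNORE machinery, which it does not implement). The data, the -999.0 placeholder, and the labelling of one slot as an "NA" and another as an "IGNORE" are made up purely for illustration.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical data: index 1 was never observed (an "NA"), so -999.0 is just
# an arbitrary placeholder; index 3 holds a real value that is only being
# excluded temporarily (an "IGNORE").
data = np.array([1.0, -999.0, 3.0, 4.0])
mask = np.array([False, True, False, True])

x = ma.masked_array(data, mask=mask)

# Reductions use only the unmasked values, whatever the reason for masking.
valid_mean = x.mean()        # mean of 1.0 and 3.0

# Both the placeholder and the ignored value still sit in the underlying buffer.
underlying = x.data

# Clearing the mask exposes both at once: 4.0 is meaningful, -999.0 is not.
y = x.copy()
y.mask = ma.nomask
unmasked_mean = y.mean()     # now polluted by the -999.0 placeholder
```

With a single boolean mask the calculation on valid values is identical for both kinds of missingness, but nothing in the array records which masked slots are safe to unmask.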
