Re: [Numpy-discussion] NA/Missing Data Conference Call Summary

Bruce Southey Wed, 06 Jul 2011 13:11:46 -0700

On 07/06/2011 02:38 PM, Christopher Jordan-Squire wrote:

On Wed, Jul 6, 2011 at 11:38 AM, Christopher Barker wrote:


    Christopher Jordan-Squire wrote:
    > If we follow those rules for IGNORE for all computations, we sometimes
    > get some weird output. For example:
    > [ [1, 2], [3, 4] ] * [ IGNORE, 7] = [ 15, 31 ]. (Where * is matrix
    > multiply and not * with broadcasting.) Or should that sort of operation
    > throw an error?

    That should throw an error -- matrix computation is heavily influenced
    by the shape and size of matrices, so I think IGNOREs really don't make
    sense there.

If the IGNOREs don't make sense in basic numpy computations then I'm kinda confused why they'd be included at the numpy core level.
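
For reference, the [ 15, 31 ] result quoted above can be reproduced with a small pure-Python sketch. It assumes that "those rules" (which are not restated in this message) mean an IGNORE operand is simply skipped, so that IGNORE + x = x and IGNORE * x = x. The IGNORE sentinel and the add, mul and matvec helpers are hypothetical names made up for illustration; none of them is a NumPy API.

```python
# Hypothetical sentinel standing in for the proposed IGNORE value;
# this is not a NumPy object.
IGNORE = object()


def add(a, b):
    """Assumed rule: IGNORE + x = x (the ignored operand is skipped)."""
    if a is IGNORE:
        return b
    if b is IGNORE:
        return a
    return a + b


def mul(a, b):
    """Assumed rule: IGNORE * x = x (the ignored operand is skipped)."""
    if a is IGNORE:
        return b
    if b is IGNORE:
        return a
    return a * b


def matvec(matrix, vector):
    """Matrix-vector product built only from the add/mul rules above."""
    result = []
    for row in matrix:
        total = IGNORE  # start with "nothing accumulated yet"
        for m, v in zip(row, vector):
            total = add(total, mul(m, v))
        result.append(total)
    return result


print(matvec([[1, 2], [3, 4]], [IGNORE, 7]))  # [15, 31]
```

Under these assumed rules the first row gives 1 + 2*7 = 15 and the second gives 3 + 4*7 = 31, so the ignored entry ends up behaving like a 1 inside the product, which is the odd result being discussed.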


    Nathaniel Smith wrote:
    > It's exactly this transparency that worries Matthew and me -- we feel
    > that the alterNEP preserves it, and the NEP attempts to erase it. In
    > the NEP, there are two totally different underlying data structures,
    > but this difference is blurred at the Python level. The idea is that
    > you shouldn't have to think about which you have, but if you work with
    > C/Fortran, then of course you do have to be constantly aware of the
    > underlying implementation anyway.

    I don't think this bothers me -- I think it's analogous to things in
    numpy like Fortran order and non-contiguous arrays -- you can ignore all
    that when working in pure python when performance isn't critical, but
    you need a deeper understanding if you want to work with the data in C
    or Fortran or to tune performance in python.

    So as long as there is an API to query and control how things work, I
    like that it's hidden from simple python code.

    -Chris

I'm similarly not too concerned about it. Performance seems finicky when you're dealing with missing data, since a lot of arrays will likely have to be copied over to other arrays containing only complete data before being handed over to BLAS. My primary concern is that the np.NA stuff 'just works'. Especially since I've never run into use cases in statistics where the difference between IGNORE and NA mattered.

Exactly!

I have not been able to think of a real example where that difference matters, as the calculations are only on the 'valid' (i.e. non-missing and non-masked) values.
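
As a rough illustration of calculating only on the valid values, here is a minimal sketch using the existing numpy.ma masked arrays as a stand-in. np.NA and IGNORE are only proposals in this thread, so this does not show how they would behave; the sample data and mask are made up.

```python
import numpy as np

# Made-up sample data; True in the mask marks a missing entry.
data = np.array([1.0, 2.0, 3.0, 4.0])
missing = np.array([False, True, False, False])

masked = np.ma.masked_array(data, mask=missing)

# Reductions use only the valid (unmasked) entries.
print(masked.sum())    # 8.0
print(masked.mean())   # mean of 1.0, 3.0 and 4.0

# Copy the complete data into a plain dense array, as one would
# before handing it to a BLAS-backed routine such as np.dot.
complete = masked.compressed()
print(complete)        # [1. 3. 4.]
print(np.dot(complete, complete))
```

The compressed() call makes the copy mentioned above: the result is an ordinary ndarray holding only the complete values.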


Bruce

_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
