On Thu, Aug 18, 2011 at 2:43 PM, Mark Wiebe <mwwi...@gmail.com> wrote:
> It's taken a lot of changes to get the NA mask support to its current
> point, but the code is ready for some testing now. You can read the
> work-in-progress release notes here:
>
> https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst
>
> To try it out, check out the missingdata branch from my github account,
> here, and build in the standard way:
>
> https://github.com/m-paradox/numpy
>
> The things most important to test are:
>
> * Confirm that existing code still works correctly. I've tested against
>   SciPy and matplotlib.
> * Confirm that the performance of code not using NA masks is the same or
>   better.
> * Try to do computations with the NA values, find places they don't work
>   yet, and nominate unimplemented functionality important to you to be next on
>   the development list. The release notes have a preliminary list of
>   implemented/unimplemented functions.
> * Report any crashes, build problems, or unexpected behaviors.
>
> In addition to adding the NA mask, I've also added features and done a few
> performance changes here and there, like letting reductions like sum take
> lists of axes instead of only a single axis or all of them. These changes
> affect various bugs like http://projects.scipy.org/numpy/ticket/1143 and
> http://projects.scipy.org/numpy/ticket/533.
>

With a new fix to the unitless reduction logic I just committed, the
situation for bug http://projects.scipy.org/numpy/ticket/450 is also
improved.

Cheers,
Mark

> Thanks!
> Mark
>
> Here's a small example run using NAs:
>
> >>> import numpy as np
> >>> np.__version__
> '2.0.0.dev-8a5e2a1'
> >>> a = np.random.rand(3,3,3)
> >>> a.flags.maskna = True
> >>> a[np.random.rand(3,3,3) < 0.5] = np.NA
> >>> a
> array([[[NA, NA, 0.11511708],
>         [ 0.46661454, 0.47565512, NA],
>         [NA, NA, NA]],
>
>        [[NA, 0.57860351, NA],
>         [NA, NA, 0.72012669],
>         [ 0.36582123, NA, 0.76289794]],
>
>        [[ 0.65322748, 0.92794386, NA],
>         [ 0.53745165, 0.97520989, 0.17515083],
>         [ 0.71219688, 0.5184328 , 0.75802805]]])
> >>> np.mean(a, axis=-1)
> array([[NA, NA, NA],
>        [NA, NA, NA],
>        [NA, 0.56260412, 0.66288591]])
> >>> np.std(a, axis=-1)
> array([[NA, NA, NA],
>        [NA, NA, NA],
>        [NA, 0.32710662, 0.10384331]])
> >>> np.mean(a, axis=-1, skipna=True)
> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474:
> RuntimeWarning: invalid value encountered in true_divide
>   um.true_divide(ret, rcount, out=ret, casting='unsafe')
> array([[ 0.11511708, 0.47113483, nan],
>        [ 0.57860351, 0.72012669, 0.56435958],
>        [ 0.79058567, 0.56260412, 0.66288591]])
> >>> np.std(a, axis=-1, skipna=True)
> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707:
> RuntimeWarning: invalid value encountered in true_divide
>   um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe')
> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730:
> RuntimeWarning: invalid value encountered in true_divide
>   um.true_divide(ret, rcount, out=ret, casting='unsafe')
> array([[ 0. , 0.00452029, nan],
>        [ 0. , 0. , 0.19853835],
>        [ 0.13735819, 0.32710662, 0.10384331]])
> >>> np.std(a, axis=(1,2), skipna=True)
> array([ 0.16786895, 0.15498008, 0.23811937])
>
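For anyone who wants to compare against a NumPy build without the missingdata branch, here is a rough sketch of the same two behaviours (propagating versus skipping missing values, and reducing over several axes at once). It is only an approximation: np.NA, flags.maskna and skipna= exist only on the branch, so NaN and numpy.ma stand in for NA here, and the arrays are small made-up inputs rather than the data from the session above.

```python
import warnings

import numpy as np

# NaN stands in for NA (an assumption of this sketch, not the branch's API).
a = np.array([[1.0, np.nan, 3.0],
              [np.nan, np.nan, np.nan],
              [4.0, 5.0, 6.0]])

# Plain reductions propagate the missing value, like NA without skipna.
propagated = np.mean(a, axis=-1)

# The nan* functions skip missing entries, roughly like skipna=True.
# A row with nothing left to average yields nan plus a RuntimeWarning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    skipped = np.nanmean(a, axis=-1)

# Reducing over a tuple of axes in one call.
b = np.arange(24.0).reshape(2, 3, 4)
b[0, 0, 1] = np.nan
total = np.sum(b, axis=(1, 2))         # first entry is nan
nan_total = np.nansum(b, axis=(1, 2))  # missing entry is skipped

# numpy.ma keeps an explicit mask, so an all-missing row stays masked.
masked_mean = np.ma.masked_invalid(a).mean(axis=-1)
```

The nan* route conflates "missing" with "not a number", which is one of the things a separate NA mask is meant to avoid; the numpy.ma version is closer in spirit because the mask is stored apart from the data.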
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion