On Mon, Jul 15, 2013 at 6:29 PM, Charles R Harris <[email protected]> wrote:
> Let me try to summarize. To begin with, the environment of the nan functions
> is rather special.
>
> 1) if the array is not of inexact type, they punt to the non-nan
> versions.
> 2) if the array is of inexact type, then out and dtype must be inexact if
> specified
>
> The second assumption guarantees that NaN can be used in the return values.

The requirement on the 'out' dtype only exists because the nan functions
currently like to return nan for things like empty arrays, right? If not for
that, it could be relaxed? (It's a rather weird requirement, since the whole
point of these functions is that they ignore nans, yet they don't always...)

> sum and nansum
>
> These should be consistent so that empty sums are 0. This should cover the
> empty array case, but will change the behaviour of nansum which currently
> returns NaN if the array isn't empty but the slice is after NaN removal.

I agree that returning 0 is the right behaviour, but we might need a
FutureWarning period.

> mean and nanmean
>
> In the case of empty arrays, an empty slice, this leads to 0/0. For Python
> this is always a zero division error, for Numpy this raises a warning
> and returns NaN for floats, 0 for integers.
>
> Currently mean returns NaN and raises a RuntimeWarning when 0/0 occurs. In
> the special case where dtype=int, the NaN is cast to integer.
>
> Option1
> 1) mean raise error on 0/0
> 2) nanmean no warning, return NaN
>
> Option2
> 1) mean raise warning, return NaN (current behavior)
> 2) nanmean no warning, return NaN
>
> Option3
> 1) mean raise warning, return NaN (current behavior)
> 2) nanmean raise warning, return NaN

I have mixed feelings about the whole np.seterr apparatus, but since it
exists, shouldn't we use it for consistency? I.e., just do whatever numpy is
set up to do with 0/0? (Which I think means warn and return NaN by default,
but this can be changed.)

> var, std, nanvar, nanstd
>
> 1) if ddof > axis(axes) size, raise error, probably a program bug.
> 2) If ddof=0, then whatever is the case for mean, nanmean
>
> For nanvar, nanstd it is possible that some slices are good, some bad, so
>
> option1
> 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice
>
> option2
> 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice

I don't really have any intuition for these ddof cases.
Just raising an error on negative effective dof is pretty defensible and might
be the safest -- it's easy to turn an error into something sensible later if
people come up with use cases...

-n
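
For illustration only, here is a minimal sketch of two ideas from this message: letting the np.seterr / np.errstate machinery decide what 0/0 does for an empty mean, and raising on a negative effective dof. The helper names mean_following_errstate and strict_var are hypothetical and not part of NumPy, and both handle only the 1-D case. Nothing here is a proposed implementation.

```python
import warnings

import numpy as np


def mean_following_errstate(values):
    """Hypothetical helper: sum / count, with 0/0 left to NumPy's error state."""
    arr = np.asarray(values, dtype=float)
    total = arr.sum()                  # an empty sum is 0.0
    count = np.float64(arr.size)       # 0.0 for an empty array
    # 0.0 / 0.0 is an "invalid" floating-point operation; what happens next
    # (warn, raise, ignore) is whatever np.seterr / np.errstate says.
    return np.divide(total, count)


def strict_var(values, ddof=0):
    """Hypothetical helper: raise when n - ddof is negative (1-D only)."""
    arr = np.asarray(values, dtype=float)
    if arr.size - ddof < 0:
        raise ValueError("ddof exceeds the number of elements")
    return np.var(arr, ddof=ddof)


# invalid="warn": a RuntimeWarning is emitted and NaN is returned.
with np.errstate(invalid="warn"), warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned_result = mean_following_errstate([])

# invalid="raise": the same 0/0 becomes a FloatingPointError.
with np.errstate(invalid="raise"):
    try:
        mean_following_errstate([])
        raised = False
    except FloatingPointError:
        raised = True

# invalid="ignore": NaN is returned silently.
with np.errstate(invalid="ignore"):
    silent_result = mean_following_errstate([])

# Negative effective dof is treated as a likely program bug.
try:
    strict_var([1.0, 2.0, 3.0], ddof=4)
    ddof_error = False
except ValueError:
    ddof_error = True
```

The point of the sketch is that the caller, not the function, chooses between warning, raising, and staying silent, which is the consistency argument made above.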
