On Mon, Jul 15, 2013 at 8:58 AM, Charles R Harris <[email protected]> wrote:
>
> On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg <[email protected]> wrote:
>
>> On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote:
>> >
>> > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris
>> > <[email protected]> wrote:
>> >
>>
>> <snip>
>>
>> >
>> > For nansum, I would expect 0 even in the case of all
>> > nans. The point
>> > of these functions is to simply ignore nans, correct?
>> > So I would aim
>> > for this behaviour: nanfunc(x) behaves the same as
>> > func(x[~isnan(x)])
>> >
>> > Agreed, although that changes current behavior. What about the
>> > other cases?
>> >
>> > Looks like there isn't much interest in the topic, so I'll just go
>> > ahead with the following choices:
>> >
>> > Non-NaN case
>> >
>> > 1) Empty array -> ValueError
>> >
>> > The current behavior with stats is an accident, i.e., the nan arises
>> > from 0/0. I like to think that in this case the result is any number,
>> > rather than not a number, so *the* value is simply not defined. So in
>> > this case raise a ValueError for empty array.
>> >
>> To be honest, I don't mind the current behaviour much sum([]) = 0,
>> len([]) = 0, so it is in a way well defined. At least I am not sure if I
>> would prefer always an error. I am a bit worried that just changing it
>> might break code out there, such as plotting code where it makes
>> perfectly sense to plot a NaN (i.e. nothing), but if that is the case it
>> would probably be visible fast.
>>
>> > 2) ddof >= n -> ValueError
>> >
>> > If the number of elements, n, is not zero and ddof >= n, raise a
>> > ValueError for the ddof value.
>> >
>> Makes sense to me, especially for ddof > n. Just returning nan in all
>> cases for backward compatibility would be fine with me too.
>>
>
> Currently if ddof > n it returns a negative number for variance, the NaN
> only comes when ddof == 0 and n == 0, leading to 0/0 (float is NaN, integer
> is zero division).
>
>>
>> > NaN case
>> >
>> > 1) Empty array -> ValueError
>> > 2) Empty slice -> NaN
>> > 3) For slice ddof >= n -> NaN
>> >
>> Personally I would somewhat prefer if 1) and 2) would at least default
>> to the same thing. But I don't use the nanfuncs anyway. I was wondering
>> about adding the option for the user to pick what the fill is (and i.e.
>> if it is None (maybe default) -> ValueError). We could also allow this
>> for normal reductions without an identity, but I am not sure if it is
>> useful there.
>>
>
> In the NaN case some slices may be empty, others not. My reasoning is that
> that is going to be data dependent, not operator error, but if the array is
> empty the writer of the code should deal with that.
>

For nanvar and nanstd, it might make more sense to handle ddof as follows:

1) if ddof >= axis size, raise ValueError
2) if ddof >= number of values after removing NaNs, return NaN

The first would be consistent with the non-NaN case; the second accounts for
the variable nature of data containing NaNs.

Chuck
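
To make the two ddof rules proposed above concrete, here is a minimal sketch of how they might look in code. This is only an illustration of the proposal, not NumPy's actual nanvar: the name nanvar_sketch and its signature are hypothetical, and it assumes a float array reduced along a single axis.

```python
import numpy as np


def nanvar_sketch(a, axis=0, ddof=0):
    """Hypothetical illustration of the proposed ddof rules (not NumPy's nanvar)."""
    a = np.asarray(a, dtype=float)

    # Empty array: treated as operator error, as in the proposal.
    if a.size == 0:
        raise ValueError("empty array")

    # Rule 1: ddof >= axis size is an error, matching the non-NaN case.
    n_axis = a.shape[axis]
    if ddof >= n_axis:
        raise ValueError("ddof >= axis size")

    mask = np.isnan(a)
    # Number of non-NaN values in each slice along `axis`.
    count = n_axis - mask.sum(axis=axis)

    # Mean over the non-NaN values; guard against dividing by zero
    # for slices that are entirely NaN.
    filled = np.where(mask, 0.0, a)
    mean = filled.sum(axis=axis) / np.maximum(count, 1)

    # Squared deviations, with NaN positions contributing nothing.
    dev = np.where(mask, 0.0, a - np.expand_dims(mean, axis))
    sumsq = (dev ** 2).sum(axis=axis)

    # Rule 2: if too few values remain after removing NaNs, return NaN
    # for that slice instead of raising.
    dof = count - ddof
    return np.where(dof > 0, sumsq / np.where(dof > 0, dof, 1), np.nan)
```

With x = np.array([[1.0, np.nan], [3.0, np.nan], [5.0, 2.0]]), calling nanvar_sketch(x, axis=0, ddof=1) gives 4.0 for the first column and NaN for the second (only one value left after removing NaNs), while ddof=3 raises ValueError because it equals the axis size.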
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
