To elaborate on that point: knowing that numpy accumulates in a simple
first-to-last sweep and does not implicitly upcast, the original problem
can be solved in several ways: by specifying a higher precision to
accumulate in, or by a nested summation, e.g. X.mean(0).mean(0) == 1.0.
I personally like this explicitness, and I am wary of numpy doing overly
clever things behind the scenes, since I can think of other code that might
break if things change too radically. For instance, I often sort large
arrays with a large spread in magnitudes before summation, relying on the
fact that summing the smallest values first gives the best precision. Any
changes to reduction behavior should try to remain backwards compatible
with such properties of straightforward reductions, or else a lot of code
is going to break without warning.
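
To make that concrete, here is a minimal sketch of both workarounds plus
the sorted-summation trick (the array sizes and values are only
illustrative, not the case from the original report):

    import numpy as np

    # Naive float32 accumulation over a very large array can drift noticeably.
    X = np.ones((8000, 8000), dtype=np.float32)

    # Workaround 1: request a higher-precision accumulator explicitly.
    m64 = X.mean(dtype=np.float64)

    # Workaround 2: nest the reduction so each partial sum stays small.
    m_nested = X.mean(0).mean(0)

    # Sort ascending before summing, so the smallest contributions
    # accumulate first; this helps when magnitudes span many orders.
    a = (np.random.exponential(size=10**6) ** 10).astype(np.float32)
    s = np.sort(a).sum()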

I suppose using maximum precision internally, and nesting all reductions
over multiple axes of an ndarray, are both easy-to-implement improvements
without any drawbacks that I can think of. Actually, I am not so sure about
the maximum precision: I personally prefer to make an informed decision
about the precision used, and to get an error on a platform that does not
support the specified precision, rather than obtain subtly or horribly
broken results without warning when moving code to a different platform or
compiler.
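
As a sketch of what I mean by an informed choice (the extended-precision
dtype below is just an example; it is not available on every platform,
which is exactly the point):

    import numpy as np

    # Explicitly choosing an extended-precision accumulator: on a platform
    # whose numpy build lacks this dtype (e.g. typical Windows builds), the
    # attribute lookup fails loudly instead of results silently degrading.
    acc = np.float128
    total = np.arange(10, dtype=np.float32).sum(dtype=acc)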


On Fri, Jul 25, 2014 at 5:37 AM, Eelco Hoogendoorn <
hoogendoorn.ee...@gmail.com> wrote:

>  Perhaps it is a slightly semantic discussion; but all fp calculations
> have errors, and there are always strategies for making them smaller. We
> just don't happen to like the error for this case; but rest assured it
> won't be hard to find new cases of 'blatantly wrong' results, no matter
> what accumulator is implemented. That's no reason to not try and be clever
> about it, but there isn't going to be an algorithm that is best for all
> possible inputs, and in the end the most important thing is that the
> algorithm used is specified in the docs.
>  ------------------------------
> From: Alan G Isaac <alan.is...@gmail.com>
> Sent: 25-7-2014 00:10
>
> To: Discussion of Numerical Python <numpy-discussion@scipy.org>
> Subject: Re: [Numpy-discussion] numpy.mean still broken for large
> float32 arrays
>
> On 7/24/2014 4:42 PM, Eelco Hoogendoorn wrote:
> > This isn't a bug report, but rather a feature request.
>
> I'm not sure that statement is correct.  The mean of a float32 array
> can certainly be computed as a float32.  Currently this is
> not necessarily what happens, not even approximately.
> That feels a lot like a bug, even if we can readily understand
> how the algorithm currently used produces it.  To say whether
> it is a bug or not, don't we have to ask about the intent of `mean`?
> If the intent is to sum and divide, then it is not a bug.
> If the intent is to produce the mean, then it is a bug.
>
> Alan Isaac