Arguably, this isn't a problem of numpy, but of programmers being trained
to think of floating point numbers as 'real' numbers, rather than just a
finite number of states with a funny distribution over the number line.
np.mean isn't broken; your understanding of floating point numbers is.
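To make that "funny distribution" concrete, here is a quick sketch (the probe values below are chosen purely for illustration) of how the spacing between adjacent float32 values grows with magnitude, which is exactly what bites during accumulation:

    import numpy as np

    # float32 is a finite set of points with uneven spacing on the number line;
    # np.spacing(x) gives the gap from x to the next representable value.
    print(np.spacing(np.float32(1.0)))        # about 1.19e-07
    print(np.spacing(np.float32(2.0**24)))    # 2.0: adding 1.0 here rounds away
    print(np.float32(2.0**24) + np.float32(1.0) == np.float32(2.0**24))  # True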

What you appear to wish for is a silent upcasting of the accumulated
result. This is often performed in reducing operations, but I can imagine
it runs into trouble for nd-arrays. After all, if I have a huge array that
I want to reduce over a very short axis, upcasting might be very
undesirable; it wouldn't buy me any extra precision, but it would increase
memory use from 'huge' to 'even more huge'.
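For concreteness, a sketch of that memory trade-off (shape and sizes are made up for the example): reducing a float32 array over a length-2 axis, an upcast result takes twice the memory of the float32 one, while averaging only two elements buys little extra precision.

    import numpy as np

    # Hypothetical 'huge' array reduced over a very short axis.
    X = np.ones((10**7, 2), dtype=np.float32)      # ~76 MiB

    m32 = X.mean(axis=1)                           # result stays float32
    m64 = X.mean(axis=1, dtype=np.float64)         # upcast accumulator/result

    print(m32.dtype, m32.nbytes // 2**20, "MiB")   # float32, ~38 MiB
    print(m64.dtype, m64.nbytes // 2**20, "MiB")   # float64, ~76 MiB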

np.mean has a kwarg that allows you to explicitly choose the dtype of the
accumulator: X.mean(dtype=np.float64) == 1.0. Personally, I have a distaste
for implicit behavior, unless the rule is simple and there really are no
downsides; which I would argue doesn't apply here. Perhaps when reducing an
array completely to a single value, there is no harm in upcasting to the
maximum machine precision; but that becomes a rather complex rule which
would work out differently on different machines. It's better to be
confronted with the limitations of floating point numbers earlier, rather
than later, when you want to distribute your work and run into subtle bugs
on other people's computers.
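For reference, a minimal sketch of the dtype kwarg in action (X here stands in for the kind of large float32 array that started this thread; the exact float32 error depends on the numpy version and its summation strategy):

    import numpy as np

    X = np.ones(10**8, dtype=np.float32)

    print(X.mean())                    # float32 accumulation; may drift from 1.0
    print(X.mean(dtype=np.float64))    # float64 accumulation; gives exactly 1.0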