On Fri, Jul 25, 2014 at 5:56 PM, RayS <r...@blue-cove.com> wrote:
> The important point was that it would be best if all of the methods affected
> by summing 32 bit floats with 32 bit accumulators had the same Notes as
> numpy.mean(). We went through a lot of code yesterday, assuming that any
> numpy or Scipy.stats functions that use accumulators suffer the same issue,
> whether noted or not, and found it true.
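
For concreteness, here is a minimal sketch of the effect being described. It uses an explicit Python loop as a stand-in for a naive accumulator; that loop, the helper name naive_mean, and the made-up data are assumptions for illustration only, not how numpy's reductions are actually implemented. The only real numpy feature relied on is the dtype keyword of np.mean.

```python
import numpy as np

# Hypothetical data: 100,000 copies of 0.1 stored as float32.
data = np.full(100_000, 0.1, dtype=np.float32)

# Every element is identical, so the exact mean is just that element.
exact = float(data[0])


def naive_mean(values, acc_dtype):
    """Sum element by element into an accumulator of type acc_dtype."""
    acc = acc_dtype(0)
    for v in values:
        # The running sum is rounded to acc_dtype after every addition.
        acc = acc + acc_dtype(v)
    return float(acc) / len(values)


err32 = abs(naive_mean(data, np.float32) - exact)
err64 = abs(naive_mean(data, np.float64) - exact)
print("float32 accumulator error:", err32)
print("float64 accumulator error:", err64)

# The dtype keyword asks numpy itself for a higher-precision accumulator.
print("np.mean with dtype=float64:", np.mean(data, dtype=np.float64))
```

The float32 accumulator drifts because, once the running sum is large, each added 0.1 is rounded to the spacing of float32 values near that sum. The float64 accumulator has enough spare bits that every partial sum here is exact.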

Do you have a list of the functions that are affected?

> "Depending on the input data, this can cause the results to be inaccurate,
> especially for float32 (see example below). Specifying a higher-precision
> accumulator using the dtype keyword can alleviate this issue." seems rather
> un-Pythonic.

It's true that in its full generality, this problem just isn't
something numpy can solve. Using float32 is extremely dangerous and
should not be attempted unless you're prepared to seriously analyze
all your code for numeric stability; IME it often runs into problems
in practice, in any number of ways. Remember that it only has as much
precision as a 24 bit integer. There are good reasons why float64 is
the default!

That said, it does seem that np.mean could be implemented better than
it is, even given float32's inherent limitations. If anyone wants to
implement better algorithms for computing the mean, variance, sums,
etc., then we would love to add them to numpy. I'd suggest
implementing them as gufuncs -- there are examples of defining gufuncs
in numpy/linalg/umath_linalg.c.src.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org