[Numpy-discussion] Re: mean_std function returning both mean and std
Mean_var, mean_std and tests are now ready. (https://github.com/soundappraisal/numpy/tree/stdmean-dev-001)

Some decisions made during implementation:
- The output shape of the mean follows the output shape of the variance or the standard deviation, so it responds to the keepdims flag in the same way they do.
- The resizing of the mean is placed in _mean_var. The overhead on the old functions std and var is minimal, as they set mean_out to None.
- The intermediate mean cannot be replaced with the mean produced by _mean, as the output of the latter cannot be broadcast to the incoming data.

___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
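The shape decisions in the announcement can be illustrated with a small pure-NumPy sketch. This is not the code from the branch: the name mean_var_sketch and its signature are hypothetical, it handles real-valued, non-empty input only, and it just shows why the intermediate mean has to keep the reduced axes (so it broadcasts against the input) before being resized to match the variance.

```python
import numpy as np

def mean_var_sketch(a, axis=None, ddof=0, keepdims=False):
    """Hypothetical helper: return (mean, var) with matching output shapes."""
    a = np.asarray(a, dtype=float)
    # Keep the reduced axes so the intermediate mean broadcasts against `a`.
    mean_kd = a.mean(axis=axis, keepdims=True)
    # Number of input elements reduced into each output element.
    n = a.size // mean_kd.size
    # Second pass: sum of squared deviations from the mean.
    sq_dev = np.square(a - mean_kd)
    var = sq_dev.sum(axis=axis, keepdims=keepdims) / (n - ddof)
    # Resize the mean so that it follows the shape of the variance.
    mean = mean_kd if keepdims else mean_kd.reshape(var.shape)
    return mean, var

x = np.arange(12.0).reshape(3, 4)
m, v = mean_var_sketch(x, axis=1)
print(m.shape, v.shape)
```

With keepdims=False both results have shape (3,) here; with keepdims=True both keep the reduced axis, which is the behaviour described in the first bullet.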
[Numpy-discussion] Re: mean_std function returning both mean and std
On 2/6/23 13:09, Ronald van Elburg wrote:
> Mean_var, mean_std and tests are now ready. (https://github.com/soundappraisal/numpy/tree/stdmean-dev-001)

For a previous discussion of a performant solution, see https://github.com/numpy/numpy/issues/13199. That issue is more about var, but it also touches on a paper that has a two-pass solution for calculating mean and var together.

Matti
[Numpy-discussion] Re: mean_std function returning both mean and std
On 2/6/23 13:41, Matti Picus wrote:
> For a previous discussion of a performant solution, see https://github.com/numpy/numpy/issues/13199. That issue is more about var but also touches on a paper that has a two-pass solution for calculating mean and var together

Ahh, I see that issue is mentioned in your issue https://github.com/numpy/numpy/issues/23741

Matti
[Numpy-discussion] Re: mean_std function returning both mean and std
I think I left those aspects of the implementation untouched. But having someone more experienced look at it is probably a good idea.
[Numpy-discussion] Re: mean_std function returning both mean and std
Aha, the unnecessary copy mentioned in the paper (https://dbs.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-authorcopy.pdf) is a copy of the input. Here it is about discarding a valuable output (the mean) and then calculating that result separately.

Not throwing the mean away saves about 20% of the computation time. Or, phrased differently, the calculation of the variance spends about 25% of its computation time on calculating the mean.
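One reading under which the 20% and 25% figures agree is the following cost model. It is an illustration, not a measurement: it assumes that a separate mean call costs as much as the mean computed inside var, and the function name relative_saving is made up for this sketch.

```python
def relative_saving(mean_share):
    """Fraction of time saved by reusing the mean instead of recomputing it.

    mean_share is the assumed fraction of one var call spent on the mean.
    """
    t_var = 1.0                    # one var call, mean included
    t_mean = mean_share * t_var    # assumed cost of a separate mean call
    separate = t_var + t_mean      # var followed by a separate mean
    combined = t_var               # var that also hands back its mean
    return 1.0 - combined / separate

print(f"{relative_saving(0.25):.0%}")  # 20%
```

So if the mean accounts for a quarter of the variance computation, calling var and mean separately costs 1.25 units against 1 unit for the combined call, a saving of one fifth.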
[Numpy-discussion] Issue 22266(https://github.com/numpy/numpy/issues/22266)
Dear mentors,

I have been trying to solve this issue (for round_). I found that round_ is not included in the latest documentation (version 1.24); it last appeared in the documentation for version 1.13. As far as I can see, round_ still works in 1.24.2 and will be removed in version 2.0.

So I am curious: will changing the codebase to include an example for round_ (which currently just points to the latest version of around through a link) change the documentation for version 1.13? If not, and since we are using 1.24, how will it show up in the 1.24 documentation, given that round_ is not included there? Or should we remove the round_ label from this issue?

Any guidance or correction would be very helpful.

This is how round_ looks in the documentation (https://numpy.org/doc/1.13/reference/generated/numpy.round_.html); that page was last updated on Jun 10, 2017.

Thanks,
Mohit Kumar
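For context, a docstring example for this family of functions would probably look much like the one for around. The sketch below is only an illustration: it calls np.around and np.round, and merely checks whether the round_ alias exists in the installed version rather than calling it, since the message above notes that it is slated for removal in 2.0.

```python
import numpy as np

values = np.array([0.5, 1.5, 2.5, -0.5])

# around and round perform the same rounding (halves go to the nearest even value).
rounded = np.around(values)
same = np.round(values)
print(rounded)

# Only report whether the legacy alias exists in this NumPy version.
has_round_ = hasattr(np, "round_")
print("round_ available:", has_round_)
```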
[Numpy-discussion] Re: mean_std function returning both mean and std
On Fri, 02 Jun 2023 11:47:14 - "Ronald van Elburg" wrote:
> Aha, the unnecessary copy mentioned in the
> https://dbs.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-authorcopy.pdf.
> paper is a copy of the input. Here it is about discarding a valuable output
> (the mean) and then calculating that result separately.

I have been working a lot with this publication and found it very interesting. Nevertheless, I believe there is a bug when dealing with weighted averages (eq. 22) ... but we can discuss that offline. None of the authors answered my comments.

Since the PR is about unweighted means/std, the math presented there is (very likely) correct.

Cheers,
--
Jérôme Kieffer
tel +33 476 882 445
[Numpy-discussion] Re: mean_std function returning both mean and std
On Fri, Jun 2, 2023 at 1:51 PM Ronald van Elburg wrote:
> Aha, the unnecessary copy mentioned in the
> https://dbs.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-authorcopy.pdf.
> paper is a copy of the input. Here it is about discarding a valuable output
> (the mean) and then calculating that result separately.
>
> Not throwing the mean away saves about 20% computation time. Or phrased
> differently the calculation of the variance spends about a 25% of the
> computation time on calculating the mean.

I'm not sure I find that a compelling benefit for introducing dedicated functions. The work on moving the implementation to C showed >2x speedups (https://github.com/numpy/numpy/issues/13199#issuecomment-478305730). Is there a reason for not trying to get that much larger gain merged first or instead?

Cheers,
Ralf
[Numpy-discussion] Re: mean_std function returning both mean and std
I am agnostic about the order of those changes. Also, this is my first attempt to contribute to numpy, so I am not aware of all the ongoing discussions. I'll try to read the issue you just mentioned.

But in the code I rewrote, replacing _mean_var with a faster version would benefit var, std, mean_var and mean_std, because they all call _mean_var. The mean function is untouched.
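The call structure described here can be pictured roughly as below. The names carry a _sketch suffix because they are hypothetical stand-ins, not the functions in the branch, and the helper is simplified to reduce over the whole flattened input. The point is only that all four public functions go through one helper, so speeding up that helper speeds up all of them.

```python
import numpy as np

def _mean_var_sketch(a, ddof=0):
    """Hypothetical shared helper: (mean, var) over the flattened input."""
    a = np.asarray(a, dtype=float).ravel()
    mean = a.sum() / a.size
    var = np.square(a - mean).sum() / (a.size - ddof)
    return mean, var

def var_sketch(a, ddof=0):
    # The mean is computed by the helper and then discarded.
    return _mean_var_sketch(a, ddof)[1]

def std_sketch(a, ddof=0):
    return np.sqrt(var_sketch(a, ddof))

def mean_var_sketch(a, ddof=0):
    # Same work as var_sketch, but the mean is returned as well.
    return _mean_var_sketch(a, ddof)

def mean_std_sketch(a, ddof=0):
    mean, var = _mean_var_sketch(a, ddof)
    return mean, np.sqrt(var)

print(mean_std_sketch([1.0, 2.0, 3.0, 4.0]))
```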
[Numpy-discussion] Re: mean_std function returning both mean and std
I had a closer look at the paper. When I have more brainpower and time I may check the mathematics. The focus is, however, more on streaming data, which is an application with completely different demands. I think that here we cannot afford to sample the data, which is an option in streaming database systems.
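To make the contrast with streaming concrete: a streaming setting sees each value once and cannot hold the input in memory, so mean and variance have to be updated incrementally. The sketch below uses Welford's well-known online update as an example of that style. It is not taken from the paper or from the branch, and the function name is hypothetical.

```python
def running_mean_var(stream, ddof=0):
    """Welford-style single-pass mean and variance over an iterable."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n - ddof <= 0:
        raise ValueError("not enough samples for the requested ddof")
    return mean, m2 / (n - ddof)

print(running_mean_var(iter([1.0, 2.0, 3.0, 4.0])))
```

An in-memory array, as in the numpy case, can instead be traversed twice, which is what the two-pass approach relies on.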