[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Ronald van Elburg
mean_var, mean_std and their tests are now ready. 
(https://github.com/soundappraisal/numpy/tree/stdmean-dev-001)

Some decisions made during implementation:
  - The output shape of the mean follows the output shape of the variance or
the standard deviation, so it responds to the keepdims flag in the same way
as they do.
  - The resizing of the mean is done in _mean_var; the overhead for the
existing functions std and var is minimal because they set mean_out to None.
  - The intermediate mean cannot be replaced with the mean produced by _mean,
because the output of the latter cannot be broadcast against the incoming
data.
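
The shape decisions above can be illustrated with a minimal NumPy sketch. This
is not the code from the branch: mean_var_sketch and its signature are
hypothetical names chosen for illustration, and only the keepdims behaviour is
meant to match the description.

```python
import numpy as np

def mean_var_sketch(a, axis=None, ddof=0, keepdims=False):
    """Illustrative only (hypothetical helper): return (mean, var) with matching shapes."""
    a = np.asarray(a, dtype=float)
    # The intermediate mean keeps the reduced axes with length 1, so that it
    # broadcasts against the input when the deviations are formed.
    mean_kd = a.mean(axis=axis, keepdims=True)
    dev = a - mean_kd
    # Number of elements that enter each reduction.
    n = a.size // mean_kd.size
    var = (dev * dev).sum(axis=axis, keepdims=keepdims) / (n - ddof)
    # Resize the mean so that it follows the output shape of the variance.
    mean = mean_kd if keepdims else mean_kd.reshape(var.shape)
    return mean, var

x = np.arange(12.0).reshape(3, 4)
m, v = mean_var_sketch(x, axis=1)                       # both have shape (3,)
m_kd, v_kd = mean_var_sketch(x, axis=1, keepdims=True)  # both have shape (3, 1)
```

Without keepdims=True on the intermediate mean, x - x.mean(axis=1) would try
to combine shapes (3, 4) and (3,), which NumPy refuses to broadcast.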
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Matti Picus

For a previous discussion of a performant solution, see 
https://github.com/numpy/numpy/issues/13199. That issue is more about 
var but also touches on a paper that has a two-pass solution for 
calculating mean and var together.


Matti



[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Matti Picus


Ahh, I see that issue is mentioned in your issue 
https://github.com/numpy/numpy/issues/23741


Matti



[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Ronald van Elburg
I think I left those aspects of the implementation untouched. But having 
someone more experienced look at it is probably a good idea.


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Ronald van Elburg
Aha, the unnecessary copy mentioned in the paper 
(https://dbs.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-authorcopy.pdf)
is a copy of the input. My change, by contrast, is about not discarding a 
valuable output (the mean) only to calculate that result again separately.

Not throwing the mean away saves about 20% computation time. Or, phrased 
differently, the calculation of the variance spends about 25% of its 
computation time on calculating the mean.
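
One way to read the two figures together is sketched below. It assumes (this
is an interpretation, not a measurement) that a separate call to mean costs
about as much as the mean computed inside var, i.e. 25% of the time of one var
call. The variable names are illustrative.

```python
# Relative cost units; only the ratios matter.
var_time = 1.0                   # one call to var, which computes the mean internally
mean_time = 0.25 * var_time      # assumed cost of a separate call to mean

separate = var_time + mean_time  # var, then mean computed again separately
combined = var_time              # a combined call that keeps the mean
saving = (separate - combined) / separate
print(f"{saving:.0%}")           # prints 20%
```

Under that assumption, 25% of the variance time corresponds to a 20% saving
on the combined workload, so the two statements describe the same effect.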


[Numpy-discussion] Issue 22266(https://github.com/numpy/numpy/issues/22266)

2023-06-02 Thread Mohit Kumar
Dear mentors,

I have been trying to solve this issue (for round_). I found that round_ is 
not included in the latest documentation (version 1.24); it last appeared in 
the version 1.13 documentation. As far as I can see, round_ still works in 
1.24.2 and will be removed in version 2.0. So I am curious: will changing the 
codebase to include an example for round_ (which currently links to the latest 
version of around) change the documentation for version 1.13? If not, then 
since we are using 1.24, how will it appear in version 1.24, given that round_ 
is not included in that documentation? Or should we remove the round_ label 
from this issue? Any guidance or correction would be very helpful.
This is how round_ looks in the documentation 
(https://numpy.org/doc/1.13/reference/generated/numpy.round_.html); it was last 
updated on Jun 10, 2017.

Thanks
Mohit Kumar


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Jerome Kieffer
On Fri, 02 Jun 2023 11:47:14 -
"Ronald van Elburg"  wrote:

> Aha, the unnecessary copy mentioned in the  
> https://dbs.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-authorcopy.pdf.
>  paper is a copy of the input. Here it is about discarding a valuable output 
> (the mean) and then calculating that result separately.

I have worked a lot with this publication and found it very interesting.
Nevertheless, I believe there is a bug in the handling of weighted
averages (eq. 22) ... but we can discuss that offline. None of the
authors answered my comments.

Since the PR is about unweighted means/std, the math presented there is (very 
likely) correct.

Cheers,

-- 
Jérôme Kieffer
tel +33 476 882 445


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Ralf Gommers
On Fri, Jun 2, 2023 at 1:51 PM Ronald van Elburg 
wrote:

> Aha, the unnecessary copy mentioned in the
> https://dbs.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-authorcopy.pdf.
> paper is a copy of the input. Here it is about discarding a valuable output
> (the mean) and then calculating that result separately.
>
> Not throwing the mean away saves about 20% computation time. Or phrased
> differently the calculation of the variance spends about a 25% of the
> computation time on calculating the mean.
>

I'm not sure I find that a compelling benefit for introducing dedicated
functions. The work on moving the implementation to C showed >2x speedups (
https://github.com/numpy/numpy/issues/13199#issuecomment-478305730). Is
there a reason for not trying to get that much larger gain merged first or
instead?

Cheers,
Ralf


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Ronald van Elburg
I am agnostic about the order of those changes. Also, this is my first attempt 
to contribute to numpy, so I am not aware of all the ongoing discussions. I'll 
try to read the issue you just mentioned.

But in the code I rewrote, replacing _mean_var with a faster version would 
benefit var, std, mean_var and mean_std, because they all call _mean_var. 

The mean function is untouched.
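
The call structure described here can be sketched as follows. All names ending
in _sketch, and the want_mean flag, are hypothetical stand-ins and do not
reproduce the signatures in the branch. The point is only that four public
functions share one helper, so a faster helper speeds up all of them.

```python
import numpy as np

def _mean_var_sketch(a, axis=None, ddof=0, want_mean=False):
    # Single shared helper (hypothetical): every public function below calls it.
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=axis, keepdims=True)
    n = a.size // mean.size
    var = ((a - mean) ** 2).sum(axis=axis) / (n - ddof)
    # Callers that do not need the mean skip the resizing step.
    return (mean.reshape(var.shape) if want_mean else None), var

def var_sketch(a, axis=None, ddof=0):
    return _mean_var_sketch(a, axis, ddof)[1]

def std_sketch(a, axis=None, ddof=0):
    return np.sqrt(var_sketch(a, axis, ddof))

def mean_var_sketch(a, axis=None, ddof=0):
    return _mean_var_sketch(a, axis, ddof, want_mean=True)

def mean_std_sketch(a, axis=None, ddof=0):
    mean, var = _mean_var_sketch(a, axis, ddof, want_mean=True)
    return mean, np.sqrt(var)

x = np.arange(12.0).reshape(3, 4)
m, s = mean_std_sketch(x, axis=0)
```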


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Ronald van Elburg
I had a closer look at the paper. When I have more time and mental energy I 
may check the mathematics. The paper's focus, however, is more on streaming 
data, which is an application with completely different demands. I think that 
here we cannot afford to sample the data, which is an option in streaming 
database systems.
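
To make the contrast with the streaming setting concrete, here is a sketch of
Welford's online algorithm, a well-known single-pass update for mean and
variance. It is given only as an example of the streaming style and is not
claimed to be the formulation used in the paper; streaming_mean_var is a
hypothetical name.

```python
def streaming_mean_var(values, ddof=0):
    """Welford-style single pass: each value is seen once and then discarded."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count - ddof <= 0:
        raise ValueError("not enough data for the requested ddof")
    return mean, m2 / (count - ddof)

mean, var = streaming_mean_var([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

An array function such as var already has all the data in memory, so it can
use two passes over the full input and does not need this kind of running
update.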