[Numpy-discussion] ENH: Add "where" argument to reduction functions that are missing it

2024-03-14 Thread Carlos Martin
The following reduction functions receive a "where" argument:
- https://numpy.org/doc/stable/reference/generated/numpy.max.html
- https://numpy.org/doc/stable/reference/generated/numpy.min.html
- https://numpy.org/doc/stable/reference/generated/numpy.sum.html
- https://numpy.org/doc/stable/reference/generated/numpy.prod.html
- https://numpy.org/doc/stable/reference/generated/numpy.mean.html
- https://numpy.org/doc/stable/reference/generated/numpy.var.html
- https://numpy.org/doc/stable/reference/generated/numpy.std.html
and their nan* counterparts, where applicable.

Feature request: Add "where" argument to
- https://numpy.org/doc/stable/reference/generated/numpy.ptp.html
- https://numpy.org/doc/stable/reference/generated/numpy.average.html
- https://numpy.org/doc/stable/reference/generated/numpy.median.html
- https://numpy.org/doc/stable/reference/generated/numpy.quantile.html
- https://numpy.org/doc/stable/reference/generated/numpy.percentile.html
- https://numpy.org/doc/stable/reference/generated/numpy.nanmedian.html
and their nan* counterparts, where applicable.

ptp should be especially straightforward, since it's just max - min.
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Improved 2DFFT Approach

2024-03-14 Thread Alexander Levin via NumPy-Discussion
Good day, Ralf.

I am sharing the results of the latest updates on our code. We have taken into 
account the comments below and are testing the timing with %timeit -o inside 
jupyter, having information about the best of 7 code passes and the average 
deviation. Writing to summarise the intermediate results.

The testing notebooks:
Memory Usage - 
https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/memory_usage.ipynb
 
Timing comparisons(updated) - 
https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/comparisons.ipynb
Our version loses to Scipy always if multithreading is enabled, also we 
wondered about type conversions - whether to leave them for test metrics or 
not. The point is that they are necessary for converting matrix values from int 
to complex128 (we will replace them with 64 if necessary) and back when 
outputting. For more convenient user-experience we preferred to leave the 
conversions for testing, we will be interested in your opinion. 

Regarding the results we have after all updates - everything is stable in 
memory, our operation wins by 2 times. Regarding execution time and efficiency 
- I have the following opinion. On tests with multithreading enabled we are 
consistently losing, while on tests with multithreading disabled we are 
consistently winning. From this we should draw one logical conclusion - our 
algorithm is mathematically smarter, which makes it possible for it to win 
steadily within the limits of memory usage and performance when multithreading 
is switched off. At the same time, multithreading itself, used by Scipy 
authors, is better and more efficient than ours - that's why our operation 
loses algorithmically at the moment when it is switched on.

>From this I can conclude that our algorithm is still more performant, but it 
>obviously needs modification of the existing multithreading system. In this 
>situation we need your advice. In theory, we can figure out and write a more 
>efficient and smarter algorithm for multithreading than our current one. In 
>practice, I'm sure the best way forward would be to collaborate with someone 
>responsible for FFT from Scipy or NumPy so that we can test our algorithm with 
>their multithreading, I'm sure this action will give the best possible 
>performance at the moment in general. I propose this option instead of our 
>separate multithreading writing, as the goal of our work is to embed in NumPy 
>so that as many people as possible can use a more efficient algorithm for 
>their work. And if we write our multithreading first, we will then have to 
>switch to the NumPy version to synthesise the package anyway. So I'm asking 
>for your feedback on our memory usage and operation efficiency results to 
>decide toget
 her the next steps of our hopefully collaborative work, if you're interested 
in doing so. Thank you for your time.

Regards, 

Alexander
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Improved 2DFFT Approach

2024-03-14 Thread Stefan van der Walt via NumPy-Discussion

Hi Alexander,

On 2024-03-14 22:43:38, Alexander Levin via NumPy-Discussion 
 wrote:
Memory Usage - 
https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/memory_usage.ipynb 
Timing comparisons(updated) - 
https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/comparisons.ipynb


I see these timings are still done only for power-of-two shaped 
arrays. This is the easiest case to optimize, and I wonder if 
you've given further thought to supporting other sizes? PocketFFT, 
e.g., implements the Bluestein / Chirp-Z algorithm to deal with 
cases where the sizes have large prime factors.


Your test matrix also only contains real values. In that case, you 
can use rfft, which might resolve the memory usage difference? I'd 
be surprized if PocketFFT uses that much more memory for the same 
calculation.


I saw that in the notebook code you have:

matr = np.zeros((n, m), dtype=np.complex64)
matr = np.random.rand(n, m)

Was the intent here to generate a complex random matrix?

Stéfan
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Improved 2DFFT Approach

2024-03-14 Thread Alexander Levin via NumPy-Discussion
Hi Stefan, 

indeed you're right, the underlying formula initially was created by V. 
Tutatchikov for power-of-two matrices. The initial butterfly approach requires 
a recursive breakdown to 2x2 matrix in order to proceed with precalculations of 
roots of unity (exactly what provides you the aforementioned performance 
advantage). 

I did my own research on implementations where the size is not a power of two 
and they still implement the butterfly, usually they proceeded with 0 
imputation to build to closest power of 2 (which is a mess on a bigger sizes) 
or tried dropped the columns and building them back which is also not a 
brilliant solution. At some point I found a patent number US 8.484.274B2, where 
they discussed the possible padding options for such matrices. The methodology 
was picked depending on the actual size of signal/data, therefore, in case you 
proceed with implementing butterfly with such approach, you'd need to write 
several cases and check the size. Theoretically, it's possible, though I can't 
really say if this particular case would give much more performance advantage 
rather than the same or the worse. 

Yep, indeed the intend is to generate a random matrix, so our tests can be 
objective as they can be. I appreciate your review and will also run the memory 
against rfft (i hope tomorrow). 

So for now I'm sure this method shows better performance on size of 
power-of-two square matrices as well as rectangular matrices of size  2^n x 2^m 
(this one was tested during development process). Speaking of other sizes, I 
mentioned above that I have some thoughts, but it's very case sensitive in 
terms of specific type of signal we are working with.  I would like to 
emphasize that regardless of my genuine desire to create the best possible 
version for the user, my work is not limited by my desire but constrained by 
capabilities. As you may have noticed earlier, the multithreading implemented 
by the authors of Scipy surpasses the version created by us. Considering the 
matrix dimension sensitivity of the butterfly method, I am ready to share my 
thoughts regarding specific data sizes and use cases. However, I cannot readily 
provide an optimal solution for all cases, otherwise, I would have implemented 
it first and then presented it to you. At the moment, I believe this is the 
only sig
 nificant vulnerability in the mathematics we have presented. I can't provide 
much feedback on Bluestein / Chirp-Z at this very moment, but I'll research on 
this matter and if in some case it solves our issue - will definitely implement 
it.

At this very point, I believe if our method after thorough provided corrections 
still has some performance advantage (even on matrices of more simple sizes 
like power-of-two) it is worth to embed it even using in 'if-cases' in my 
opinion. Yet i'm only a contributor and that's why i'm discussing this matter 
with you. 

I want to admit I appreciate your feedback and your time and will be waiting 
for your response.

Regards,

Alexander
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com