[Numpy-discussion] ENH: Add "where" argument to reduction functions that are missing it
The following reduction functions accept a "where" argument:

- https://numpy.org/doc/stable/reference/generated/numpy.max.html
- https://numpy.org/doc/stable/reference/generated/numpy.min.html
- https://numpy.org/doc/stable/reference/generated/numpy.sum.html
- https://numpy.org/doc/stable/reference/generated/numpy.prod.html
- https://numpy.org/doc/stable/reference/generated/numpy.mean.html
- https://numpy.org/doc/stable/reference/generated/numpy.var.html
- https://numpy.org/doc/stable/reference/generated/numpy.std.html

and their nan* counterparts, where applicable.

Feature request: add a "where" argument to

- https://numpy.org/doc/stable/reference/generated/numpy.ptp.html
- https://numpy.org/doc/stable/reference/generated/numpy.average.html
- https://numpy.org/doc/stable/reference/generated/numpy.median.html
- https://numpy.org/doc/stable/reference/generated/numpy.quantile.html
- https://numpy.org/doc/stable/reference/generated/numpy.percentile.html
- https://numpy.org/doc/stable/reference/generated/numpy.nanmedian.html

and their nan* counterparts, where applicable.

ptp should be especially straightforward, since it's just max - min; a sketch of how it can be composed today follows below.
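Since np.max and np.min already accept where, a masked ptp can be emulated today by combining them (an initial value is needed so that fully-masked slices still have a defined result). A minimal sketch with illustrative values:

    import numpy as np

    a = np.array([[1.0, 5.0, 3.0],
                  [4.0, 2.0, 6.0]])
    mask = np.array([[True, False, True],
                     [True, True, False]])

    # `where` selects which elements participate in the reduction;
    # `initial` supplies the identity so fully-masked slices are defined.
    hi = np.max(a, where=mask, initial=-np.inf)
    lo = np.min(a, where=mask, initial=np.inf)

    ptp_where = hi - lo   # 4.0 - 1.0 == 3.0 over the selected elements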
[Numpy-discussion] Re: Improved 2DFFT Approach
Good day, Ralf. I am sharing the results of the latest updates to our code. We have taken the comments below into account and are now timing with %timeit -o inside Jupyter, which reports the best of seven runs along with the deviation between runs. I am writing to summarise the intermediate results.

The testing notebooks:

Memory usage - https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/memory_usage.ipynb
Timing comparisons (updated) - https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/comparisons.ipynb

Our version always loses to SciPy when multithreading is enabled. We also wondered about the type conversions - whether to keep them in the test metrics or not. The point is that they are necessary for converting matrix values from int to complex128 (we will switch to complex64 if necessary) and back when outputting. For a more convenient user experience we preferred to keep the conversions in the tests; we would be interested in your opinion.

Regarding the results after all the updates: memory usage is stable, and our implementation wins by a factor of two. Regarding execution time and efficiency, my reading is as follows. On tests with multithreading enabled we consistently lose, while on tests with multithreading disabled we consistently win. The logical conclusion is that our algorithm is mathematically smarter, which lets it win on both memory usage and performance when multithreading is off; at the same time, the multithreading used by the SciPy authors is better and more efficient than ours, which is why we lose whenever it is on.

From this I conclude that our algorithm is still the more performant one, but it clearly needs a reworked multithreading system, and here we need your advice. In theory, we could design and write a more efficient multithreading scheme than our current one. In practice, I am sure the best way forward would be to collaborate with someone responsible for the FFT in SciPy or NumPy, so that we can test our algorithm with their multithreading; I am confident this would give the best possible performance overall. I propose this instead of writing our own multithreading separately, since the goal of our work is to be embedded in NumPy so that as many people as possible can use a more efficient algorithm. And if we wrote our own multithreading first, we would then have to switch to the NumPy version anyway to integrate the package.

So I am asking for your feedback on our memory-usage and efficiency results, to decide together on the next steps of our hopefully collaborative work, if you are interested in doing so. Thank you for your time.

Regards,
Alexander
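For reproducibility, one way to make the single- vs. multi-threaded regimes explicit in such a benchmark is to pin SciPy's worker count in the timing code itself, so both regimes are measured under identical conditions. A minimal sketch (array size, worker counts, and repeat count are arbitrary; the project's own 2-D FFT would be timed the same way):

    import timeit
    import numpy as np
    import scipy.fft

    rng = np.random.default_rng(0)
    # complex128 input, matching the conversions described above
    x = rng.random((1024, 1024)) + 1j * rng.random((1024, 1024))

    # Time SciPy's 2-D FFT single-threaded vs. with 8 workers.
    t1 = timeit.timeit(lambda: scipy.fft.fft2(x, workers=1), number=10) / 10
    t8 = timeit.timeit(lambda: scipy.fft.fft2(x, workers=8), number=10) / 10
    print(f"scipy.fft.fft2, 1 worker : {t1:.4f} s")
    print(f"scipy.fft.fft2, 8 workers: {t8:.4f} s")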
[Numpy-discussion] Re: Improved 2DFFT Approach
Hi Alexander,

On 2024-03-14 22:43:38, Alexander Levin via NumPy-Discussion wrote:
> Memory usage - https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/memory_usage.ipynb
> Timing comparisons (updated) - https://github.com/2D-FFT-Project/2d-fft/blob/testnotebook/notebooks/comparisons.ipynb

I see these timings are still done only for power-of-two shaped arrays. This is the easiest case to optimize, and I wonder if you've given further thought to supporting other sizes? PocketFFT, e.g., implements the Bluestein / Chirp-Z algorithm to deal with sizes that have large prime factors.

Your test matrix also contains only real values. In that case, you can use rfft, which might resolve the memory usage difference? I'd be surprised if PocketFFT used that much more memory for the same calculation.

I saw that in the notebook code you have:

    matr = np.zeros((n, m), dtype=np.complex64)
    matr = np.random.rand(n, m)

The second assignment rebinds matr to a new real float64 array, discarding the complex buffer. Was the intent here to generate a complex random matrix?

Stéfan
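A minimal sketch of the two alternatives being pointed at here - generating an actually-complex random matrix, or keeping the input real and using the real-input transform (names and sizes are illustrative only):

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 512, 512

    # Option 1: an actually-complex random matrix (unlike the notebook
    # snippet, where the second assignment discards the complex64 buffer).
    matr = (rng.random((n, m)) + 1j * rng.random((n, m))).astype(np.complex64)

    # Option 2: keep the data real and use the real-input FFT, which only
    # stores roughly half of the output spectrum and so saves memory.
    real = rng.random((n, m))
    spec = np.fft.rfft2(real)              # shape (n, m // 2 + 1), complex
    back = np.fft.irfft2(spec, s=(n, m))   # round-trips to the real input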
[Numpy-discussion] Re: Improved 2DFFT Approach
Hi Stefan,

Indeed you're right: the underlying formula was initially created by V. Tutatchikov for power-of-two matrices. The butterfly approach requires a recursive breakdown to 2x2 matrices in order to precompute the roots of unity (exactly what provides the aforementioned performance advantage). I did my own research on butterfly implementations where the size is not a power of two: usually they either zero-pad up to the closest power of two (which becomes wasteful at larger sizes) or drop columns and rebuild them afterwards, which is not a brilliant solution either. At some point I found patent US 8,484,274 B2, which discusses possible padding options for such matrices. The method there is chosen depending on the actual size of the signal/data, so a butterfly implementation along those lines would need several cases keyed on the input size. Theoretically it's possible, though I can't really say whether that particular approach would give more performance, the same, or worse.

Yes, indeed the intent is to generate a random matrix, so that our tests are as objective as they can be. I appreciate your review and will also run the memory comparison against rfft (I hope tomorrow).

So for now I'm sure this method shows better performance on power-of-two square matrices as well as rectangular matrices of size 2^n x 2^m (the latter was tested during development). Speaking of other sizes, I mentioned above that I have some thoughts, but it is very case-sensitive in terms of the specific type of signal being processed. I would like to emphasize that, regardless of my genuine desire to create the best possible version for the user, my work is constrained by capabilities, not just by desire. As you may have noticed earlier, the multithreading implemented by the SciPy authors surpasses our version. Considering the butterfly method's sensitivity to matrix dimensions, I am ready to share my thoughts regarding specific data sizes and use cases; however, I cannot readily provide an optimal solution for all cases - otherwise, I would have implemented it first and then presented it to you. At the moment, I believe this is the only significant vulnerability in the mathematics we have presented.

I can't provide much feedback on Bluestein / Chirp-Z at this very moment, but I'll research the matter, and if it solves our issue in some cases I will definitely implement it. At this point, I believe that if our method, after the corrections discussed, still has a performance advantage (even if only on matrices of simpler sizes like powers of two), it is worth embedding, even behind if-cases. Yet I'm only a contributor, and that's why I'm discussing this matter with you.

I want to say that I appreciate your feedback and your time, and will be waiting for your response.

Regards,
Alexander
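To make the zero-padding drawback concrete, here is a minimal sketch (not the project's code; the helpers next_pow2 and fft2_zero_padded are hypothetical) of the pad-to-the-closest-power-of-two workaround described above:

    import numpy as np

    def next_pow2(k):
        # Smallest power of two >= k (assumes k >= 1).
        return 1 << (k - 1).bit_length()

    def fft2_zero_padded(a):
        # Transform `a` zero-padded up to power-of-two dimensions.
        # Note: this yields the spectrum of the *padded* array, not the
        # DFT of the original n x m array -- a further reason padding is
        # not a clean general-size solution.
        n, m = a.shape
        return np.fft.fft2(a, s=(next_pow2(n), next_pow2(m)))

    a = np.random.rand(1000, 600)       # padded up to 1024 x 1024
    spec = fft2_zero_padded(a)
    print(a.size, spec.size)            # 600000 vs. 1048576: ~1.75x the work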