[Python-Dev] Re: Python 3.10 vs 3.8 performance degradation

2021-12-19 Thread Tigran Aivazian
To eliminate the possibility of being affected by different versions of 
numpy, I have just upgraded numpy in the Python 3.8 environment to the latest 
version, so both 3.8 and 3.10 are now using numpy 1.21.4, and the timing is 
still exactly the same.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/THPN4OWM3A335LDO7HVIQSIDFFVO5URZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Python 3.10 vs 3.8 performance degradation

2021-12-19 Thread Tigran Aivazian
Alas, the timing is exactly the same as previously reported, so the problem 
persists. If it had come out the same in both Python versions, I would celebrate 
and shout for joy, seeing that the problem was narrowed down to numpy.

I can carefully upgrade all the other packages in 3.8 to match those in 3.10. 
As I can downgrade (I will test it first), I should be able to restore my 
"superfast 3.8 environment", should this upgrade break it. I will report what I 
discover.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZM7UU6CVMIWEJEXB7V57N4FML2A7RLQ3/


[Python-Dev] Re: Python 3.10 vs 3.8 performance degradation

2021-12-19 Thread Tigran Aivazian
In both cases I installed numpy using "sudo -H pip install numpy". And just now 
I upgraded numpy in 3.8 using "sudo -H pip3.8 install --upgrade numpy".

I will try to simplify the program by removing all the higher level complexity 
and see what I find.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SPI6K4LNO5BFLIUGYBHCMYCXX7FO7YV5/


[Python-Dev] Re: Python 3.10 vs 3.8 performance degradation

2021-12-19 Thread Tigran Aivazian
I think I have found something very interesting. Namely, I removed all 
multiprocessing (which is done in the shell script, not in Python) and so 
reduced the program to just a single thread of execution. And lo and behold, 
Python 3.10 now consistently beats 3.8 by about 5%. However, this is not the 
END! It is very important to find out why 3.8 still outperforms 3.10 when 
multiple processes run simultaneously. The thing is -- all these 
different threads write to completely unrelated data files (.npz and .npy). The 
only thing they all have in common is the initial data, which they all read 
from the same 'init.npz' and 'init_W.npy' files using:

with load(args.ifilename + '.npz', allow_pickle=True) as data:

and

Winit = memmap(iWfilename, dtype='float64', mode='r', shape=(Nt, Nx, Np))

So, could this be the problem?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SMTEEMBDUJ7ZYM6HYOOZXT6NOHJFJIYY/


[Python-Dev] Re: Python 3.10 vs 3.8 performance degradation

2021-12-19 Thread Tigran Aivazian
I have created four different sets of initial data, one for each thread of 
execution and no, unfortunately, that does NOT solve the problem. Still, when 
four threads are executed in parallel, 3.8 outperforms 3.10 by a factor of 2.4. 
So, there is some other point of contention between the threads, which I need 
to find...
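
A comparison like the one above (one instance alone versus four at once) can be sketched in Python rather than in a shell script. This is only an illustrative harness, not the script used for the measurements reported here: the function names time_parallel and compare are hypothetical, and the command passed in is whatever launches one instance of the program under test.

```python
import subprocess
import sys
import time


def time_parallel(cmd, n_instances):
    """Start n_instances copies of cmd at once and return the
    wall-clock seconds until all of them have exited."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(n_instances)]
    for p in procs:
        p.wait()
    return time.perf_counter() - start


def compare(cmd, n_instances=4):
    """Time one instance alone, then n_instances together.
    Returns (solo_seconds, together_seconds, slowdown)."""
    solo = time_parallel(cmd, 1)
    together = time_parallel(cmd, n_instances)
    return solo, together, together / solo


# Example with a trivial stand-in command; replace it with the real
# invocation of the script being measured.
solo, together, slowdown = compare([sys.executable, "-c", "pass"], 4)
print(f"solo={solo:.3f}s together={together:.3f}s slowdown={slowdown:.2f}x")
```

Running the same harness under each interpreter gives two slowdown figures to compare. A slowdown close to 1 means the instances barely affect each other, while a much larger value means they are competing for something.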
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/35QRBPQFN4MOCSADYB4HSTJQXZ2QTSKT/


[Python-Dev] Re: Python 3.10 vs 3.8 performance degradation

2021-12-19 Thread Tigran Aivazian
So far I have narrowed it down to a block of code in solve.py doing a lot of 
multi-threaded FFT (i.e. with fft(..., threads=6) of pyFFTW), as well as numpy 
exp() and other functions and pure Python heavy list manipulation (yes, lists, 
not numpy arrays). All of this together (or some one part of it, yet to be 
discovered) is behaving as if some global lock were taken behind the scenes 
(i.e. inside the Python interpreter), so that when multiple instances of the script 
(which I loosely called "threads" in previous posts, but here correct myself as 
the word "threads" is used more appropriately in the context of FFT in this 
message) are executed in parallel, they slow each other down in 3.10, but not 
so in 3.8.

So this is definitely a very interesting 3.10 degradation problem. I will try 
to investigate some more tomorrow...
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BTXTX7VBXZTJBIJIX2KMAAOOQDE52R5K/


[Python-Dev] Re: Python 3.10 vs 3.8 performance degradation

2021-12-19 Thread Tigran Aivazian
I have got it narrowed down to the "threads=6" argument of fft() and ifft() 
functions of pyFFTW! Namely, if I do NOT pass "threads=6" to fft()/ifft(), then 
the parallel execution time of multiple instances of the script is the same in 
Python 3.8 and 3.10. But it is a bit slower than with "threads=6", of course 
(as my "multiprocessing" on the shell script level is tied to the multiple 
physical problems being solved simultaneously and this number is small -- say 
4, but I have 12 processors (6 physical cores) which could execute code in 
parallel).
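
The thread counts described above can be tallied with a few lines of Python. This is a rough sketch: the helper name oversubscription is hypothetical, and the figures (4 scripts, threads=6 each, 12 logical processors) are the ones quoted in this message. Whether the resulting ratio has anything to do with the 3.8 vs 3.10 difference is not established here.

```python
import os


def oversubscription(n_processes, threads_per_process, logical_cpus=None):
    """Return the total number of worker threads divided by the number
    of logical CPUs. Values above 1.0 mean more runnable threads than CPUs."""
    if logical_cpus is None:
        # Fall back to the machine this runs on (assumption: at least 1 CPU).
        logical_cpus = os.cpu_count() or 1
    return (n_processes * threads_per_process) / logical_cpus


# 4 scripts in parallel, each calling fft(..., threads=6), on 12 logical CPUs:
print(oversubscription(4, 6, logical_cpus=12))  # 2.0

# The same 4 scripts with single-threaded FFTs use only a third of the CPUs:
print(oversubscription(4, 1, logical_cpus=12) < 1.0)  # True
```

So with threads=6 there are up to 24 FFT threads for 12 logical processors, and without it only 4 of the 12 are busy.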

So, this is where we are right now: pyFFTW 0.12.0 on Python 3.8 
with threads=6 is 2.4 times faster than the same pyFFTW 0.12.0 on 
Python 3.10, when four scripts are executed in parallel. But removing 
"threads=6" makes 3.10 much faster, and 3.8 a bit slower. Though not too slow 
-- instead of 9 vs 23 seconds I get 11.2 (Python 3.8) vs 10.8 (Python 3.10) 
seconds, so Python 3.10 is even a little bit faster than 3.8, but still not as 
fast as with threads=6 on 3.8.

However, that pendulum PyQt GUI application does NOT do any Fourier transforms! 
So, the problem with FPS in pendulum plotting is something different.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LRQIELQV5R5LDDCRRL2VDTS7DKY7OLPT/