[Python-Dev] Re: Hang with parallel make

2020-02-26 Thread Elad Lahav
More information:
The hang happens when building extensions, using the setup.py script. The 
script determines that the build is parallel (build_ext.py/build_extensions) 
and creates a thread pool. Each thread then executes a compilation job by 
fork()ing a compiler process.

I don't see how it works on any system that keeps the semaphore state in user 
mode.

--Elad
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DGEMKEX37HEIN7MBNSBJH5P2VQVKUEI7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Hang with parallel make

2020-02-26 Thread Elad Lahav
Sorry, should have posted the backtrace from the beginning. It goes deeper than 
this, but the important part is in the child after fork():

#0  SyncSemWait ()
    at /builds/workspace/710-SDP/build_x86_64/lib/c/kercalls/x86_64/SyncSemWait.S:37
#1  0x004bfa174ac6 in PyThread_acquire_lock_timed (lock=lock@entry=0x24f0cd9430, microseconds=microseconds@entry=-100, intr_flag=intr_flag@entry=1)
    at Python/thread_pthread.h:459
#2  0x004bfa1c77bd in acquire_timed (lock=0x24f0cd9430, timeout=-10)
    at ./Modules/_threadmodule.c:63
#3  0x004bfa1c78e7 in rlock_acquire (self=0x24f13027e0, args=<optimized out>, kwds=<optimized out>)
    at ./Modules/_threadmodule.c:308
#4  0x004bfa204631 in method_vectorcall_VARARGS_KEYWORDS (func=0x24f0d61ef0, args=0x24f131c660, nargsf=<optimized out>, kwnames=<optimized out>)
    at Objects/descrobject.c:332
#5  0x004bfa06eff6 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>)
    at ./Include/cpython/abstract.h:123
#6  call_function (kwnames=0x0, oparg=1, pp_stack=<optimized out>, tstate=0x24f0d8ee00)
    at Python/ceval.c:4987
#7  _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
    at Python/ceval.c:3486
#8  0x004bfa06c95b in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=0, globals=<optimized out>)
    at Objects/call.c:283
#9  0x004bfa06ed29 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>)
    at ./Include/cpython/abstract.h:127
#10 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<optimized out>, tstate=0x24f0d8ee00)
    at Python/ceval.c:4987
#11 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
    at Python/ceval.c:3500
#12 0x004bfa06c95b in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1, globals=<optimized out>)
    at Objects/call.c:283
#13 0x004bfa06ed29 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>)
    at ./Include/cpython/abstract.h:127
#14 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<optimized out>, tstate=0x24f0d8ee00)
    at Python/ceval.c:4987
#15 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
    at Python/ceval.c:3500
#16 0x004bfa06c95b in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1, globals=<optimized out>)
    at Objects/call.c:283
#17 0x004bfa06eff6 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>)
    at ./Include/cpython/abstract.h:123
#18 call_function (kwnames=0x0, oparg=1, pp_stack=<optimized out>, tstate=0x24f0d8ee00)
    at Python/ceval.c:4987
#19 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
    at Python/ceval.c:3486
#20 0x004bfa06c95b in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=0, globals=<optimized out>)
    at Objects/call.c:283
#21 0x004bfa08626a in _PyObject_FastCallDict (callable=0x24f152fd30, args=<optimized out>, nargsf=<optimized out>, kwargs=<optimized out>)
    at Objects/call.c:96
#22 0x004bfa189584 in run_at_forkers (lst=<optimized out>, reverse=<optimized out>)
    at ./Modules/posixmodule.c:435
#23 0x004bfa194bfb in run_at_forkers (reverse=0, lst=<optimized out>)
    at ./Modules/posixmodule.c:420
#24 PyOS_AfterFork_Child ()
    at ./Modules/posixmodule.c:474
#25 0x004bfa194d08 in os_fork_impl (module=<optimized out>)
    at ./Modules/posixmodule.c:6082
#26 os_fork (module=<optimized out>, _unused_ignored=<optimized out>)
    at ./Modules/clinic/posixmodule.c.h:2709

--Elad
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GCCEBGMUDK2UAMCDSW4VRWCUAHGAPLCN/


[Python-Dev] Re: Hang with parallel make

2020-02-26 Thread Elad Lahav
It's actually not clear to me what lock it is from the core file I took, as 
rlock_acquire() is called through a function pointer from 
method_vectorcall_VARARGS_KEYWORDS() (I posted the backtrace separately).

My suspicion is that it doesn't fail on macOS because it may keep all of the 
semaphore's state in the kernel, which means that it is not necessarily 
inherited on fork(). QNX keeps the count in user mode, in a similar fashion to 
the way some state is kept in user mode for fast mutexes.
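
To illustrate the inheritance point, here is a minimal sketch in Python. It assumes a CPython build on a POSIX system where a lock's state lives in process memory, so fork() copies it along with everything else. The function name child_sees_lock_held is hypothetical and the code is only an illustration, not taken from CPython or QNX.

```python
import os
import threading

def child_sees_lock_held():
    """Fork while a helper thread holds a lock; report what the child observes."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Take the lock and keep it until the parent says otherwise.
        with lock:
            held.set()
            release.wait()

    t = threading.Thread(target=holder)
    t.start()
    held.wait()                  # the helper thread now owns the lock
    pid = os.fork()
    if pid == 0:
        # Child: the helper thread was not copied, but the lock's memory was.
        # Assumption: the lock state is kept in user memory, so it still
        # reads as held here even though no thread can ever release it.
        os._exit(1 if lock.locked() else 0)
    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 1
```

Under that assumption the function is expected to return True: the child inherits a lock that looks held, and any blocking acquire in the child would wait forever.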

I'll see if I can come up with a simple scenario. In the meantime I am trying 
to switch to posix_spawn() to see if it fixes the problem (it should also be 
much faster on QNX, as you don't need to create a duplicate address space just 
to tear it down with exec()).
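
A minimal sketch of the posix_spawn() approach in Python, assuming Python 3.8 or later (where os.posix_spawnp is available) on a POSIX system. The helper name run_without_fork is hypothetical, and this is not the actual distutils code.

```python
import os
import sys

def run_without_fork(argv):
    """Start argv[0] with posix_spawnp and return its exit code (-1 if signalled)."""
    # posix_spawnp returns the child's PID to the caller. No Python code runs
    # in the child, so Python-level at-fork handlers are not involved.
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

if __name__ == "__main__":
    # Spawn a trivial child: the current interpreter running an empty program.
    print(run_without_fork([sys.executable, "-c", "pass"]))
```

Because the call never returns 0 in the caller, code written for fork() that branches on pid == 0 would need its child branch removed rather than kept.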

--Elad
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XHHRNVTWADJTMHODQUVRH5QF3TLTIT4N/


[Python-Dev] Re: Hang with parallel make

2020-02-26 Thread Elad Lahav
A change to posix_spawnp() fixes the problem for me:

diff --git a/Lib/distutils/spawn.py b/Lib/distutils/spawn.py
index ceb94945dc..cb69de4242 100644
--- a/Lib/distutils/spawn.py
+++ b/Lib/distutils/spawn.py
@@ -90,7 +90,7 @@ def _spawn_posix(cmd, search_path=1, verbose=0, dry_run=0):
         return
     executable = cmd[0]
     exec_fn = search_path and os.execvp or os.execv
-    env = None
+    env = os.environ
     if sys.platform == 'darwin':
         global _cfg_target, _cfg_target_split
         if _cfg_target is None:
@@ -112,7 +112,7 @@ def _spawn_posix(cmd, search_path=1, verbose=0, dry_run=0):
             env = dict(os.environ,
                        MACOSX_DEPLOYMENT_TARGET=cur_target)
             exec_fn = search_path and os.execvpe or os.execve
-    pid = os.fork()
+    pid = os.posix_spawnp(executable, cmd, env)
     if pid == 0: # in the child
         try:
             if env is None:

--Elad
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3SJ53WLPMMHT7AJUBGBS3ANEMAEAVWAW/


[Python-Dev] Re: Hang with parallel make

2020-02-26 Thread Elad Lahav
A simple example that reproduces the hang (please keep in mind that I have very 
little experience writing Python code...):

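import os
from concurrent.futures import ThreadPoolExecutor

def new_process(arg):
    # Fork from a worker thread, then exec a trivial program in the child.
    pid = os.fork()
    if pid == 0:
        # os.execv takes the path and an argv list; it does not return on success.
        os.execv("/bin/true", ["/bin/true"])
    else:
        # Parent: wait for the child to exit.
        pid, status = os.waitpid(pid, 0)

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(new_process, None)
               for i in range(0, 4)]
    for fut in futures:
        fut.result()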

--Elad
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZOJNVZJIRJJF6RTEWFJ4HG2KZXYY6CLV/


[Python-Dev] Re: Hang with parallel make

2020-02-26 Thread Elad Lahav
I believe that the problem is in logging/__init__.py, which registers an 
atfork() handler for re-initializing its lock. However, as part of this process 
it attempts to acquire the module lock, which has not been reinitialized and so 
still reflects the parent's state of the lock.
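
A minimal sketch of the pattern in question, assuming a POSIX system and Python 3.7 or later (for os.register_at_fork). The names _lock, _reinit_lock_in_child and child_can_take_lock are hypothetical stand-ins and this is not the code in logging/__init__.py. The point it tries to show is that an after-fork handler in the child can safely replace an inherited lock, while acquiring one may block.

```python
import os
import threading

# Hypothetical module-level lock, standing in for a lock like the one in logging.
_lock = threading.Lock()

def _reinit_lock_in_child():
    # Replace rather than acquire: the inherited lock may be held by a thread
    # that does not exist in the child, so acquiring it here could block forever.
    global _lock
    _lock = threading.Lock()

os.register_at_fork(after_in_child=_reinit_lock_in_child)

def child_can_take_lock():
    """Fork while another thread holds _lock; return True if the child got the lock."""
    held = threading.Event()
    release = threading.Event()

    def holder():
        with _lock:
            held.set()
            release.wait()

    t = threading.Thread(target=holder)
    t.start()
    held.wait()                  # the lock is now held by the helper thread
    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread survives; the handler above has
        # already swapped in a fresh lock, so this does not block.
        ok = _lock.acquire(blocking=False)
        os._exit(0 if ok else 1)
    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

If the handler instead tried to acquire a second, not yet reinitialized lock before doing its work, the child could end up waiting on state copied from the parent, which matches the hang described here.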

--Elad
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XIWDXXWKFDDSEZ7B3IQZZYEI2GO4G774/


[Python-Dev] Re: Hang with parallel make

2020-02-26 Thread Elad Lahav
Done.

Thanks,
--Elad
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5YBM6LDLV7HSYG44ZA5CPFJBVXDQRRYD/