[Python-Dev] Re: Hang with parallel make
More information:

The hang happens when building extensions, using the setup.py script. The script determines that the build is parallel (build_ext.py/build_extensions) and creates a thread pool. Each thread then executes a compilation job by fork()ing a compiler process. I don't see how it works on any system that keeps the semaphore state in user mode.

--Elad
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DGEMKEX37HEIN7MBNSBJH5P2VQVKUEI7/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Hang with parallel make
Sorry, should have posted the backtrace from the beginning. It goes deeper than this, but the important part is in the child after fork():

#0 SyncSemWait () at /builds/workspace/710-SDP/build_x86_64/lib/c/kercalls/x86_64/SyncSemWait.S:37
#1 0x004bfa174ac6 in PyThread_acquire_lock_timed (lock=lock@entry=0x24f0cd9430, microseconds=microseconds@entry=-100, intr_flag=intr_flag@entry=1) at Python/thread_pthread.h:459
#2 0x004bfa1c77bd in acquire_timed (lock=0x24f0cd9430, timeout=-10) at ./Modules/_threadmodule.c:63
#3 0x004bfa1c78e7 in rlock_acquire (self=0x24f13027e0, args=, kwds=) at ./Modules/_threadmodule.c:308
#4 0x004bfa204631 in method_vectorcall_VARARGS_KEYWORDS (func=0x24f0d61ef0, args=0x24f131c660, nargsf=, kwnames=) at Objects/descrobject.c:332
#5 0x004bfa06eff6 in _PyObject_Vectorcall (kwnames=, nargsf=, args=, callable=) at ./Include/cpython/abstract.h:123
#6 call_function (kwnames=0x0, oparg=1, pp_stack=, tstate=0x24f0d8ee00) at Python/ceval.c:4987
#7 _PyEval_EvalFrameDefault (f=, throwflag=) at Python/ceval.c:3486
#8 0x004bfa06c95b in function_code_fastcall (co=, args=, nargs=0, globals=) at Objects/call.c:283
#9 0x004bfa06ed29 in _PyObject_Vectorcall (kwnames=, nargsf=, args=, callable=) at ./Include/cpython/abstract.h:127
#10 call_function (kwnames=0x0, oparg=, pp_stack=, tstate=0x24f0d8ee00) at Python/ceval.c:4987
#11 _PyEval_EvalFrameDefault (f=, throwflag=) at Python/ceval.c:3500
#12 0x004bfa06c95b in function_code_fastcall (co=, args=, nargs=1, globals=) at Objects/call.c:283
#13 0x004bfa06ed29 in _PyObject_Vectorcall (kwnames=, nargsf=, args=, callable=) at ./Include/cpython/abstract.h:127
#14 call_function (kwnames=0x0, oparg=, pp_stack=, tstate=0x24f0d8ee00) at Python/ceval.c:4987
#15 _PyEval_EvalFrameDefault (f=, throwflag=) at Python/ceval.c:3500
#16 0x004bfa06c95b in function_code_fastcall (co=, args=, nargs=1, globals=) at Objects/call.c:283
#17 0x004bfa06eff6 in _PyObject_Vectorcall (kwnames=, nargsf=, args=, callable=) at ./Include/cpython/abstract.h:123
#18 call_function (kwnames=0x0, oparg=1, pp_stack=, tstate=0x24f0d8ee00) at Python/ceval.c:4987
#19 _PyEval_EvalFrameDefault (f=, throwflag=) at Python/ceval.c:3486
#20 0x004bfa06c95b in function_code_fastcall (co=, args=, nargs=0, globals=) at Objects/call.c:283
#21 0x004bfa08626a in _PyObject_FastCallDict (callable=0x24f152fd30, args=, nargsf=, kwargs=) at Objects/call.c:96
#22 0x004bfa189584 in run_at_forkers (lst=, reverse=) at ./Modules/posixmodule.c:435
#23 0x004bfa194bfb in run_at_forkers (reverse=0, lst=) at ./Modules/posixmodule.c:420
#24 PyOS_AfterFork_Child () at ./Modules/posixmodule.c:474
#25 0x004bfa194d08 in os_fork_impl (module=) at ./Modules/posixmodule.c:6082
#26 os_fork (module=, _unused_ignored=) at ./Modules/clinic/posixmodule.c.h:2709

--Elad
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GCCEBGMUDK2UAMCDSW4VRWCUAHGAPLCN/
[Python-Dev] Re: Hang with parallel make
It's actually not clear to me from the core file I took which lock it is, as rlock_acquire() is called through a function pointer from method_vectorcall_VARARGS_KEYWORDS() (I posted the backtrace separately).

My suspicion is that it doesn't fail on macOS because macOS may keep all of the semaphore's state in the kernel, which means that the state is not necessarily inherited on fork(). QNX keeps the count in user mode, in a similar fashion to the way some state is kept in user mode for fast mutexes.

I'll see if I can come up with a simple scenario. In the meantime I am trying to switch to posix_spawn() to see if it fixes the problem (it will also be much faster on QNX, as you don't need to create a duplicate address space just to tear it down with exec()).

--Elad
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XHHRNVTWADJTMHODQUVRH5QF3TLTIT4N/
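The inheritance issue described above can be illustrated with a small sketch. It assumes a platform where a lock's state lives in process memory and is therefore copied by fork() (CPython on Linux behaves this way, as far as I know); the function name and the timeout are made up for the illustration. A second thread holds a lock while the main thread forks, and the child then tries to take the same lock:

```python
import os
import threading


def child_sees_inherited_lock_as_free(timeout=0.5):
    """Fork while another thread holds a lock.

    Returns True if the child managed to acquire the lock, False otherwise.
    """
    lock = threading.Lock()
    held = threading.Event()
    done = threading.Event()

    def holder():
        # Take the lock and keep it until the parent tells us to let go.
        with lock:
            held.set()
            done.wait()

    worker = threading.Thread(target=holder)
    worker.start()
    held.wait()  # from here on the worker thread owns the lock

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so no thread is left
        # that could release a lock copied in the "held" state.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent: collect the child's verdict, then release the worker.
    _, status = os.waitpid(pid, 0)
    done.set()
    worker.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("child acquired the lock:", child_sees_inherited_lock_as_free())
```

The timeout only keeps the demonstration from hanging; an acquire without a timeout in the child would block in the same way as the backtrace shows.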
[Python-Dev] Re: Hang with parallel make
A change to posix_spawnp() fixes the problem for me:

diff --git a/Lib/distutils/spawn.py b/Lib/distutils/spawn.py
index ceb94945dc..cb69de4242 100644
--- a/Lib/distutils/spawn.py
+++ b/Lib/distutils/spawn.py
@@ -90,7 +90,7 @@ def _spawn_posix(cmd, search_path=1, verbose=0, dry_run=0):
         return
     executable = cmd[0]
     exec_fn = search_path and os.execvp or os.execv
-    env = None
+    env = os.environ
     if sys.platform == 'darwin':
         global _cfg_target, _cfg_target_split
         if _cfg_target is None:
@@ -112,7 +112,7 @@ def _spawn_posix(cmd, search_path=1, verbose=0, dry_run=0):
             env = dict(os.environ,
                        MACOSX_DEPLOYMENT_TARGET=cur_target)
             exec_fn = search_path and os.execvpe or os.execve
-    pid = os.fork()
+    pid = os.posix_spawnp(executable, cmd, env)
     if pid == 0: # in the child
         try:
             if env is None:

--Elad
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/3SJ53WLPMMHT7AJUBGBS3ANEMAEAVWAW/
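For readers who want to try the same approach outside distutils, here is a minimal standalone sketch of spawning a command with os.posix_spawnp() and waiting for it. The helper name spawn_and_wait and its return convention are assumptions made for this illustration and are not part of distutils or of the patch above:

```python
import os
import sys


def spawn_and_wait(cmd, env=None):
    """Run cmd (a list of strings) with posix_spawnp and return its exit code.

    A negative return value means the child was killed by that signal number.
    """
    if env is None:
        env = os.environ
    # posix_spawnp searches PATH for cmd[0] and returns the child's pid.
    # No Python code runs in the child, so no at-fork handlers are involved.
    pid = os.posix_spawnp(cmd[0], cmd, env)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    code = spawn_and_wait([sys.executable, "-c", "import sys; sys.exit(3)"])
    print("child exit code:", code)
```

os.posix_spawnp() is available from Python 3.8 on platforms that provide it, so code that has to run elsewhere would still need a fork()/exec() fallback.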
[Python-Dev] Re: Hang with parallel make
A simple example that reproduces the hang (please keep in mind that I have very little experience writing Python code...):

import os
from concurrent.futures import ThreadPoolExecutor

def new_process(arg):
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image with /bin/true.
        os.execv("/bin/true", ["/bin/true"])
    else:
        # Parent: wait for the child to exit.
        pid, status = os.waitpid(pid, 0)

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(new_process, None) for i in range(0, 4)]
    for fut in futures:
        fut.result()

--Elad
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZOJNVZJIRJJF6RTEWFJ4HG2KZXYY6CLV/
[Python-Dev] Re: Hang with parallel make
I believe that the problem is in logging/__init__.py, which registers an atfork() handler for re-initializing its lock. However, as part of this process it attempts to acquire the module lock, which has not been reinitialized and so still reflects the parent's state of the lock.

--Elad
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XIWDXXWKFDDSEZ7B3IQZZYEI2GO4G774/
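To make the distinction concrete, here is a sketch of an at-fork handler that replaces a lock in the child instead of acquiring the inherited one. This is not the code in logging/__init__.py; _module_lock, the handler, and the demo function are hypothetical names used only to show the pattern with os.register_at_fork():

```python
import os
import threading

# Hypothetical module-level lock, standing in for a lock such as logging's.
_module_lock = threading.RLock()


def _after_fork_in_child():
    # Replace the inherited lock with a fresh one rather than acquiring it:
    # the inherited object may still carry the parent's "held" state.
    global _module_lock
    _module_lock = threading.RLock()


os.register_at_fork(after_in_child=_after_fork_in_child)


def child_acquires_after_reinit():
    """Fork while another thread holds the lock; True if the child can take it."""
    held = threading.Event()
    done = threading.Event()

    def holder():
        with _module_lock:
            held.set()
            done.wait()

    worker = threading.Thread(target=holder)
    worker.start()
    held.wait()  # the worker thread now owns the parent's lock

    pid = os.fork()
    if pid == 0:
        # Child: the handler above has already swapped in a new lock.
        ok = _module_lock.acquire(timeout=5)
        os._exit(0 if ok else 1)

    _, status = os.waitpid(pid, 0)
    done.set()
    worker.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("child acquired the lock:", child_acquires_after_reinit())
```

The point of the sketch is the ordering: a handler that first acquires an inherited lock in order to reinitialize another one can block in the child, while one that only creates new lock objects cannot.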
[Python-Dev] Re: Hang with parallel make
Done.

Thanks,
--Elad
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5YBM6LDLV7HSYG44ZA5CPFJBVXDQRRYD/