[Python-Dev] Re: Pass the Python thread state to internal C functions
On Wed., 13 Nov. 2019, 8:06 am Victor Stinner wrote:

> Hi,
>
> Are you ok to modify internal C functions to pass explicitly tstate?

I'll join the chorus of +1's.

With the work you've already done to clearly separate the public APIs from the internal ones, it's now much clearer which functions should be accepting an explicit thread state, and which ones should be looking it up implicitly.

Cheers,
Nick.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Q4IPXMQIM5YRLZLHADUGSUT4ZLXQ6MYY/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (2019-11-08 - 2019-11-15)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    7148 ( +8)
  closed 43418 (+59)
  total  50566 (+67)

Open issues with patches: 2820


Issues opened (39)
==================

#38750: Solve IPv4 categorisation issues with the ipaddress module
https://bugs.python.org/issue38750  opened by Wicken

#38753: AsyncMock not cited as new in 3.8
https://bugs.python.org/issue38753  opened by John Belmonte

#38755: Long unicode string causes SyntaxError: Non-UTF-8 code startin
https://bugs.python.org/issue38755  opened by Andrew Ushakov

#38760: Document for urllib.error.HTTPError.headers Should Specify Wha
https://bugs.python.org/issue38760  opened by Han You

#38762: Logging displays wrong "processName" if "sys.modules" is clear
https://bugs.python.org/issue38762  opened by Delgan

#38765: `ast.AST._attributes` is used by `ast.dump()` but not document
https://bugs.python.org/issue38765  opened by pekka.klarck

#38768: Support lldb enhancement in MacOS
https://bugs.python.org/issue38768  opened by Manjusaka

#38770: Pickle handle self references in classes
https://bugs.python.org/issue38770  opened by Saim Raza

#38776: rlock_count<0
https://bugs.python.org/issue38776  opened by toywei

#38778: Document that os.fork is not allowed in subinterpreters
https://bugs.python.org/issue38778  opened by pconnell

#38780: SysLogHandler crash atexit
https://bugs.python.org/issue38780  opened by jaraco

#38782: Convert importlib.abc to use typing.Protocol
https://bugs.python.org/issue38782  opened by brett.cannon

#38783: the window size is bigger than the specific size when create a
https://bugs.python.org/issue38783  opened by johnliao

#38784: ip_network does not clear/update the broadcast_address cache w
https://bugs.python.org/issue38784  opened by 992jo

#38787: PEP 573: Module State Access from C Extension Methods
https://bugs.python.org/issue38787  opened by Dormouse759

#38788: Inconsistent documentation of tell/seek on textiobase/textiowr
https://bugs.python.org/issue38788  opened by qha

#38789: difflib lacks a way to check if results are empty
https://bugs.python.org/issue38789  opened by simon_

#38790: test_fcntl failing on macOS CI
https://bugs.python.org/issue38790  opened by steve.dower

#38792: IDLE calltips may not properly close on KeyboardInterrupt
https://bugs.python.org/issue38792  opened by ZackerySpytz

#38793: pathlib.Path.resolve(strict=False) strips final path component
https://bugs.python.org/issue38793  opened by lutecki

#38794: Setup: support linking openssl statically
https://bugs.python.org/issue38794  opened by Lukas.Vacek

#38795: test_asyncio.test_subprocess.test_terminate() timed out on AMD
https://bugs.python.org/issue38795  opened by vstinner

#38796: test_multiprocessing_forkserver: test_mymanager_context() fail
https://bugs.python.org/issue38796  opened by vstinner

#38797: x86-64 High Sierra 3.x buildbot worker fails to build Python:
https://bugs.python.org/issue38797  opened by vstinner

#38798: test_asyncio.test_sendfile.test_sendfile_ssl_pre_and_post_data
https://bugs.python.org/issue38798  opened by vstinner

#38799: race condition in multiprocessing.Pool with maxtasksperchild=1
https://bugs.python.org/issue38799  opened by steve.lori...@gmail.com

#38802: Clearer wording of os.WNOHANG documentation to avoid misinterp
https://bugs.python.org/issue38802  opened by bbmmy

#38803: test_wait3 and test_wait4 leaked references on x86 Gentoo Refl
https://bugs.python.org/issue38803  opened by vstinner

#38804: Regular Expression Denial of Service in http.cookiejar
https://bugs.python.org/issue38804  opened by bc

#38805: locale.getlocale() returns a non RFC1766 language code
https://bugs.python.org/issue38805  opened by markgrandi

#38806: "pdb.Pdb(skip=…).set_trace()" should always stop on calling
https://bugs.python.org/issue38806  opened by blueyed

#38807: Better exception message in os.path.join
https://bugs.python.org/issue38807  opened by tomasfarias

#38809: On Windows, build scripts should prefer using python.exe from
https://bugs.python.org/issue38809  opened by taleinat

#38810: SSL connect() raises SSLError "[SSL] EC lib (_ssl.c:728)"
https://bugs.python.org/issue38810  opened by andymaier

#38811: Pathlib crashes when os module is missing 'link' method
https://bugs.python.org/issue38811  opened by tohojo

#38812: Comparing datetime.time objects incorrect for TZ aware and una
https://bugs.python.org/issue38812  opened by epicadv

#38813: math.modf() change integer returned part as integer instead of
https://bugs.python.org/issue38813  opened by aikimark1955

#38814: Python3.7.5 crashes on OSX with my django project
https://bugs.python.org/issue38814  opened by karolyi

#38815: test_ssl: test_min_max_version() fails on AMD64 FreeBSD Shared
https://bugs.python.org/issue38815  opened by vstinner
[Python-Dev] Re: Pass the Python thread state to internal C functions
Victor Stinner wrote on 12.11.19 at 23:03:
> Are you ok to modify internal C functions to pass explicitly tstate?

FWIW, I started doing the same internally in Cython a while back, because like others, I also considered it wasteful to look it up all over the place, often multiple times inside of one function (usually related to try-finally and exception handling).

I think it similarly makes sense inside of CPython. I would also find it reasonable to make it part of a new C-API.

Stefan

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OZMEP27S6Q4OQ4CMCFPSRPM4FGUI2ZHQ/
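[Editorial sketch, not part of the original message.] The cost Stefan describes, repeated lookups inside one function, can be illustrated with a small self-contained C example. Everything here is hypothetical: `ThreadState`, `get_current_tstate()` and the `enter_*`/`leave_*` helpers are stand-ins invented for illustration, not CPython or Cython definitions, and the counter only shows how many lookups happen, not how expensive a real lookup is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState; NOT the real CPython struct. */
typedef struct {
    int recursion_depth;
} ThreadState;

static ThreadState main_tstate = {0};
static int lookup_count = 0;  /* number of implicit lookups performed */

/* Hypothetical stand-in for an implicit "current thread state" lookup. */
static ThreadState *get_current_tstate(void) {
    lookup_count++;
    return &main_tstate;
}

/* Implicit style: each helper looks the thread state up again. */
static int enter_implicit(void) {
    ThreadState *ts = get_current_tstate();
    return ++ts->recursion_depth;
}

static void leave_implicit(void) {
    ThreadState *ts = get_current_tstate();
    ts->recursion_depth--;
}

static int call_implicit(void) {
    int depth = enter_implicit();  /* lookup 1 */
    leave_implicit();              /* lookup 2 */
    return depth;
}

/* Explicit style: the caller looks the state up once and passes it down. */
static int enter_explicit(ThreadState *ts) {
    assert(ts != NULL);
    return ++ts->recursion_depth;
}

static void leave_explicit(ThreadState *ts) {
    assert(ts != NULL);
    ts->recursion_depth--;
}

static int call_explicit(ThreadState *ts) {
    int depth = enter_explicit(ts);
    leave_explicit(ts);
    return depth;
}
```

In this sketch `call_implicit()` performs two lookups per call, while `call_explicit(ts)` performs none beyond the single lookup its caller did. Whether that difference is measurable in CPython is exactly the open question raised later in the thread.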
[Python-Dev] Re: Pass the Python thread state to internal C functions
On Tue, Nov 12, 2019 at 3:11 PM Victor Stinner wrote:
> Are you ok to modify internal C functions to pass explicitly tstate?

I'm also in favor (strongly)! (no surprises there)

The only concern I've heard is that on some platforms there is a measurable overhead once you hit a threshold of a specific small number of parameters. Adding this extra parameter will put some functions over that threshold. I don't have any more information than that.

> There are different reasons to pass explicitly tstate, but my main motivation is to rework Python code base to move away from implicit global states to states passed explicitly, to implement the PEP 554 "Multiple Interpreters in the Stdlib". In short, the final goal is to run multiple isolated Python interpreters in the same process: run pure Python code on multiple CPUs in parallel with a single process (whereas multiprocessing runs multiple processes).

FTR, PEP 554 is explicitly independent of efforts to stop sharing the GIL between interpreters. I argue there that it is a good idea regardless. The existing functionality the PEP exposes, though, clearly benefits from better isolation between interpreters (including not sharing the GIL). :)

On Thu, Nov 14, 2019 at 4:12 AM Victor Stinner wrote:
> Another approach would be to pass a "PyContext*" pointer which contains tstate, but also additional fields. But I chose to start with a direct "PyThreadState* tstate" to avoid one indirection to every tstate access. Currently, tstate seems to be enough for the current code base.

FWIW, I favor this approach as well. As long as it is an opaque type, a PyContext allows us to be more flexible in adapting to the future. For now it could even be a simple alias for PyThreadState. Regardless, I'm not convinced that using a PyContext will have a real impact on runtime performance.

Also, we already use "context" in a number of ways in Python. So "PyContext" might not be the best name. It probably needs to be a name without "context" in it or one with a concrete clue (e.g. "PyRuntimeContext").

Anyway, thanks for driving this discussion, Victor!

-eric

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/C7EQXGL3RCOLQNBCK7CVRDT52FWJFAVT/
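[Editorial sketch, not part of the original message.] Eric's suggestion that a context type "could even be a simple alias for PyThreadState" might look roughly like the following in C. All names (`RuntimeContext`, `ThreadState`, `InterpState`, the `context_*` accessors) are hypothetical and chosen only for illustration; none of them is a real CPython definition. In a single file the alias is not truly opaque; the assumption is that callers would only ever touch it through the accessors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real CPython structs. */
typedef struct InterpState {
    int id;
} InterpState;

typedef struct ThreadState {
    InterpState *interp;
    int recursion_depth;
} ThreadState;

/* For now the context is only another name for the thread state.
 * If callers use nothing but the accessors below, the definition can
 * later grow into a separate struct without touching call sites. */
typedef ThreadState RuntimeContext;

/* Accessor: with the alias this is a plain pointer pass-through,
 * so there is no extra indirection on tstate access. */
static ThreadState *context_tstate(RuntimeContext *ctx) {
    assert(ctx != NULL);
    return ctx;
}

static InterpState *context_interp(RuntimeContext *ctx) {
    return context_tstate(ctx)->interp;
}

/* An internal function written against the context, not the tstate. */
static int enter_call(RuntimeContext *ctx) {
    return ++context_tstate(ctx)->recursion_depth;
}
```

The design trade-off is the one discussed in the thread: the alias costs nothing today, while a future non-alias definition would add the one indirection Victor wanted to avoid.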
[Python-Dev] Re: Pass the Python thread state to internal C functions
As you know, I'm skeptical that PEP 554 will produce benefits that are worth the effort, but let's assume for the moment that it is, and we're all 100% committed to moving all globals into the threadstate. Even given that, the motivation for this change seems a bit unclear to me. I guess the possible goals are:

- Get rid of the "ambient" threadstate entirely
- Make accessing the threadstate faster

For the first goal, I don't think this is possible, or desirable. Obviously if we remove the GIL somehow then at a minimum we'll need to make the global threadstate a thread-local. But I think we'll always have to keep it around as a thread-local, at least, because there are situations where you simply cannot pass in the threadstate as an argument. One example comes up when doing FFI: there are C libraries that take callbacks, and will run them later in some arbitrary thread. When wrapping these in Python, we need a way to bundle up a Python function into a C function that can be called from any thread. So, ctypes and cffi and cython all have ways to do this bundling, and they all start with some delicate dance to figure out whether or not the current thread holds the GIL, acquiring the GIL if not, then checking whether or not this thread has a Python threadstate assigned, creating it if not, etc. This is completely dependent on having the threadstate available in ambient context. If threadstates were always passed as arguments, then it would become impossible to wrap these C libraries.

So we can't do that. That said, it's fine: even if we do remove the GIL, we still won't have a *single OS thread* executing code from two different interpreters at the same time! So storing the threadstate in a thread-local is fine, and we can keep the ability to grab the threadstate at any moment, regardless of whether it was passed as an argument.

But that means the only reason for passing the threadstate around as an argument is if it's faster than looking it up.
And AFAICT, no-one in this thread actually knows if that's true? You mentioned that there's an "atomic operation" there currently, but I think on x86 at least _Py_atomic_load_relaxed is literally a no-op. Larry did some experiments with the old pthreads thread-local storage API, but no-one seems to have done any measurements on the new, much-faster thread-local storage API, and no-one's done any measurements of the cost of passing around threadstates explicitly. For all we know, passing the threadstate around is actually slower than looking it up every time. And we don't even know yet whether the threadstate even will move into thread-local storage.

It seems a bit weird to start doing massive internal refactoring before measuring those things.

-n

On Tue, Nov 12, 2019 at 2:03 PM Victor Stinner wrote:
>
> Hi,
>
> Are you ok to modify internal C functions to pass explicitly tstate?
>
> --
>
> I started to modify internal C functions to pass explicitly "tstate" when calling C functions: the Python thread state (PyThreadState). Example of C code (after my changes):
>
>     if (_Py_EnterRecursiveCall(tstate, " while calling a Python object")) {
>         return NULL;
>     }
>     PyObject *result = (*call)(callable, args, kwargs);
>     _Py_LeaveRecursiveCall(tstate);
>     return _Py_CheckFunctionResult(tstate, callable, result, NULL);
>
> In Python 3.8, the tstate is implicit:
>
>     if (Py_EnterRecursiveCall(" while calling a Python object")) {
>         return NULL;
>     }
>     PyObject *result = (*call)(callable, args, kwargs);
>     Py_LeaveRecursiveCall();
>     return _Py_CheckFunctionResult(callable, result, NULL);
>
> There are different reasons to pass explicitly tstate, but my main motivation is to rework Python code base to move away from implicit global states to states passed explicitly, to implement the PEP 554 "Multiple Interpreters in the Stdlib". In short, the final goal is to run multiple isolated Python interpreters in the same process: run pure Python code on multiple CPUs in parallel with a single process (whereas multiprocessing runs multiple processes).
>
> Currently, subinterpreters are a hack: they still share a lot of things, the code base is not ready to implement isolated interpreters with one "GIL" (interpreter lock) per interpreter, and to run multiple interpreters in parallel. Many _PyRuntimeState fields (the global _PyRuntime variable) should be moved to PyInterpreterState (or maybe PyThreadState): per interpreter.
>
> Another simpler but more annoying example are Py_None and Py_True singletons which are globals. We cannot share these singletons between interpreters because updating their reference counter would be a performance bottleneck. If we put a "superglobal-GIL" to ensure that Py_None reference counter remains consistent, it would basically "serialize" all threads, rather than running them in parallel.
>
> The idea of passing tstate t
[Python-Dev] Re: Pass the Python thread state to internal C functions
On Sat., 16 Nov. 2019, 7:29 am Eric Snow wrote:

> On Thu, Nov 14, 2019 at 4:12 AM Victor Stinner wrote:
> > Another approach would be to pass a "PyContext*" pointer which contains tstate, but also additional fields. But I chose to start with a direct "PyThreadState* tstate" to avoid one indirection to every tstate access. Currently, tstate seems to be enough for the current code base.
>
> FWIW, I favor this approach as well. As long as it is an opaque type, a PyContext allows us to be more flexible in adapting to the future. For now it could even be a simple alias for PyThreadState. Regardless, I'm not convinced that using a PyContext will have a real impact on runtime performance.
>
> Also, we already use "context" in a number of ways in Python. So "PyContext" might not be the best name. It probably needs to be a name without "context" in it or one with a concrete clue (e.g. "PyRuntimeContext").

I think we should just stick with "PyThreadState", as that makes it clear that in normal circumstances, it means "the Python State for the currently running Thread".

If a function accepting this parameter needs to call back into Python code, or invokes a function pointer that might call back into the public C API, it's going to need to enforce that assumption by switching the active thread state if necessary.

You can already navigate from the thread state to the interpreter state and runtime state, so it should cover everything that we need.

Cheers,
Nick.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/2XQQCEYCYKUFEJMSMO324NC3IOBKEOQ4/
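[Editorial sketch, not part of the original message.] Nick's two points, navigating from the thread state to the interpreter and runtime state, and switching the active thread state before calling back into code that relies on the implicit one, can be sketched as below. The structs and functions (`ThreadState`, `InterpState`, `RuntimeState`, `swap_active`, `call_with_tstate`) are hypothetical stand-ins for illustration only. `swap_active` loosely mirrors the role `PyThreadState_Swap` plays in the public C API, but this is a simplified single-threaded model with no GIL handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real CPython structs. */
typedef struct RuntimeState {
    int initialized;
} RuntimeState;

typedef struct InterpState {
    RuntimeState *runtime;
    int id;
} InterpState;

typedef struct ThreadState {
    InterpState *interp;
} ThreadState;

/* Stand-in for the ambient "currently active" thread state. */
static ThreadState *active_tstate = NULL;

/* Install a new active thread state and return the previous one. */
static ThreadState *swap_active(ThreadState *next) {
    ThreadState *previous = active_tstate;
    active_tstate = next;
    return previous;
}

/* Navigation: thread state -> interpreter state -> runtime state. */
static RuntimeState *runtime_of(ThreadState *ts) {
    assert(ts != NULL && ts->interp != NULL);
    return ts->interp->runtime;
}

/* A callback that relies on the implicit state, as code going
 * through a public, implicit-state API would. */
static int current_interp_id(void) {
    assert(active_tstate != NULL);
    return active_tstate->interp->id;
}

typedef int (*callback_fn)(void);

/* Takes tstate explicitly, but makes it the active state around the
 * callback so the callback's implicit lookup agrees with it. */
static int call_with_tstate(ThreadState *ts, callback_fn cb) {
    ThreadState *saved = swap_active(ts);
    int result = cb();
    swap_active(saved);  /* restore whatever was active before */
    return result;
}
```

The point of the wrapper is that an explicit `tstate` parameter and the ambient state can disagree, so a function that hands control to implicit-state code has to reconcile the two first.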