[Python-Dev] Re: Pass the Python thread state to internal C functions
On Thu, Nov 14, 2019 at 04:55, Larry Hastings wrote:
> I'm pretty sure you understand the sentence "Pulling it out of TLS was too
> slow". At the time CPython used the POSIX APIs for accessing thread local
> storage, and I didn't know about and therefore did not try this "__thread"
> GCC extension. I do remember trying some other API that was purported to be
> faster--maybe a GCC library function for faster TLS access?--but I didn't get
> that to work either before I gave up on it out of frustration.

I asked for confirmation, since I was surprised. But when I looked at
the assembly with my friend, we played with __thread, not with
pthread_getspecific().

So thanks for confirming that "getting tstate" can be a performance
bottleneck: that's a very good reason to pass it explicitly.

> I also took the opportunity to pass my "reference count manager" data as a
> separate parameter, which again was per-thread and again was a major win at
> the time.

Another approach would be to pass a "PyContext*" pointer which contains
tstate, but also additional fields. But I chose to start with a direct
"PyThreadState* tstate" to avoid adding one indirection to every tstate
access. Currently, tstate seems to be enough for the current code base.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/SHPPBIERUHCAH5UFW6WAVOQ2Z2NEKAH3/
Code of Conduct: http://python.org/psf/codeofconduct/
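To make the "one indirection" trade-off above concrete, here is a minimal sketch. The `ThreadState` and `Context` structs and both function names are hypothetical stand-ins invented for illustration; they are not CPython's real `PyThreadState` layout, and no "PyContext" type is claimed to exist in CPython.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState; NOT CPython's real layout. */
typedef struct {
    int recursion_depth;
} ThreadState;

/* Hypothetical "PyContext"-style wrapper: tstate plus extra per-thread data. */
typedef struct {
    ThreadState *tstate;
    void *refcount_manager;   /* placeholder for additional per-thread fields */
} Context;

/* Direct form: the field is one pointer dereference away. */
int depth_direct(ThreadState *tstate)
{
    assert(tstate != NULL);
    return tstate->recursion_depth;
}

/* Wrapped form: ctx->tstate has to be loaded before the field can be read. */
int depth_via_context(Context *ctx)
{
    assert(ctx != NULL && ctx->tstate != NULL);
    return ctx->tstate->recursion_depth;
}
```

Both functions return the same value; the wrapped form simply performs one extra pointer load per access, which is the cost being avoided by passing tstate directly.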
[Python-Dev] Re: Pass the Python thread state to internal C functions
On Wed, 13 Nov 2019 14:52:32 +0100
Victor Stinner wrote:
> #define _PyRuntimeState_GetThreadState(runtime) \
>     ((PyThreadState*)_Py_atomic_load_relaxed(&(runtime)->gilstate.tstate_current))
> #define _PyThreadState_GET() _PyRuntimeState_GetThreadState(&_PyRuntime)
>
> _PyThreadState_GET() uses "_Py_atomic_load_relaxed". I'm not used to
> C99 atomic conventions. The "memory_order_relaxed" documentation says:
>
> "Relaxed operation: there are no synchronization or ordering
> constraints imposed on other reads or writes, only this operation's
> atomicity is guaranteed (see Relaxed ordering below)"
>
> Note: I'm not even sure why Python currently uses an atomic operation.

Is it protected by a lock? If not, you need to use an atomic.
Since it's theoretically possible to read the current thread state
without the GIL held (though not very useful), then an atomic is
required.

Regards

Antoine.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7VL3QKACQLDL3QCWKCTUHUCIERFNE6R7/
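For readers unfamiliar with relaxed atomics, the following is a rough sketch of what a relaxed load of a "current thread state" pointer looks like with the standard C11 `<stdatomic.h>` interface. It is not CPython's `_Py_atomic_load_relaxed` implementation; the `ThreadState` and `RuntimeState` structs and the two function names are assumptions made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT CPython's real structures. */
typedef struct { int id; } ThreadState;

typedef struct {
    _Atomic(ThreadState *) tstate_current;
} RuntimeState;

/* Relaxed load: the pointer is read atomically (no torn read), but no
 * ordering is imposed on surrounding reads and writes. */
ThreadState *runtime_get_tstate(RuntimeState *runtime)
{
    assert(runtime != NULL);
    return atomic_load_explicit(&runtime->tstate_current,
                                memory_order_relaxed);
}

/* Relaxed exchange: install a new current thread state, return the old one. */
ThreadState *runtime_swap_tstate(RuntimeState *runtime, ThreadState *newts)
{
    assert(runtime != NULL);
    return atomic_exchange_explicit(&runtime->tstate_current, newts,
                                    memory_order_relaxed);
}
```

The atomic only guarantees that a reader without the lock sees either the old or the new pointer, never a partially written one; it does not by itself make the pointed-to state safe to use from another thread.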
[Python-Dev] Re: Pass the Python thread state to internal C functions
On Thu, Nov 14, 2019, at 07:43, Antoine Pitrou wrote:
> On Wed, 13 Nov 2019 14:52:32 +0100
> Victor Stinner wrote:
> > #define _PyRuntimeState_GetThreadState(runtime) \
> >     ((PyThreadState*)_Py_atomic_load_relaxed(&(runtime)->gilstate.tstate_current))
> > #define _PyThreadState_GET() _PyRuntimeState_GetThreadState(&_PyRuntime)
> >
> > _PyThreadState_GET() uses "_Py_atomic_load_relaxed". I'm not used to
> > C99 atomic conventions. The "memory_order_relaxed" documentation says:
> >
> > "Relaxed operation: there are no synchronization or ordering
> > constraints imposed on other reads or writes, only this operation's
> > atomicity is guaranteed (see Relaxed ordering below)"
> >
> > Note: I'm not even sure why Python currently uses an atomic operation.
>
> Is it protected by a lock? If not, you need to use an atomic.
> Since it's theoretically possible to read the current thread state
> without the GIL held (though not very useful), then an atomic is
> required.

It sounds like you are saying _PyRuntimeState_GetThreadState has two
duties, then: "get this thread's thread state" (from the GIL holder; how
do other threads get their own thread state?), and "get the GIL-holding
thread's thread state" (from a thread that does not hold the GIL). The
former shouldn't need atomic/locking overhead (unless the thread state
can be written from other threads), even if the latter does.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4BNRSO47Z54MRR3ZS32W6DXYRVZ7U53W/
[Python-Dev] Re: Pass the Python thread state to internal C functions
On 13Nov2019 1954, Larry Hastings wrote:
> On 11/13/19 5:52 AM, Victor Stinner wrote:
> > On Wed, Nov 13, 2019 at 14:28, Larry Hastings wrote:
> > > I did exactly that in the Gilectomy prototype. Pulling it out of
> > > TLS was too slow,
> >
> > What do you mean? Getting tstate from a TLS was a performance
> > bottleneck by itself? Reading a TLS variable seems to be quite
> > efficient.
>
> I'm pretty sure you understand the sentence "Pulling it out of TLS was
> too slow". At the time CPython used the POSIX APIs for accessing thread
> local storage, and I didn't know about and therefore did not try this
> "__thread" GCC extension. I do remember trying some other API that was
> purported to be faster--maybe a GCC library function for faster TLS
> access?--but I didn't get that to work either before I gave up on it
> out of frustration.
>
> Also, I dimly recall that I moved several things from globals into the
> ThreadState structure, and probably added one or two of my own. So
> nearly every function call was referencing ThreadState at one point or
> another. Passing it as a parameter was a definite win over calling the
> POSIX TLS APIs.

Passing it as a parameter is also a huge win for embedders, as it gets
very complicated to merge locking/threading models when the host
application has its own requirements.

Overall, I'm very supportive of passing context through parameters
rather than implicitly through TLS.

(Though we've got a long way to go before it'll be possible for
embedders to not be held hostage by CPython's threading model... one
step at a time! :) )

Cheers,
Steve
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TLMDK7JZQIUWQUUKFHOPNEFQCJKFL5JM/
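As an illustration of the two styles compared in this message, here is a small sketch contrasting an implicit per-call thread-local lookup with passing the state explicitly. It uses the C11 `_Thread_local` storage class (the standardized counterpart of the `__thread` GCC extension mentioned in the thread) rather than the POSIX `pthread_getspecific()` calls, purely to keep the example free of extra link flags. All names (`ThreadState`, `tstate_bind`, `bump_implicit`, `bump_explicit`, `run_explicit`) are hypothetical and do not correspond to CPython functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-thread state structure. */
typedef struct { long calls; } ThreadState;

/* Per-thread pointer to the state bound to the running thread. */
static _Thread_local ThreadState *current_tstate = NULL;

/* Bind a state to the calling thread. */
void tstate_bind(ThreadState *ts)
{
    current_tstate = ts;
}

/* Implicit style: every call performs its own thread-local lookup. */
void bump_implicit(void)
{
    ThreadState *ts = current_tstate;
    assert(ts != NULL);
    ts->calls++;
}

/* Explicit style: the caller supplies the state, so no lookup is needed. */
void bump_explicit(ThreadState *ts)
{
    assert(ts != NULL);
    ts->calls++;
}

/* Look the state up once at the entry point, then pass it down. */
long run_explicit(int n)
{
    ThreadState *ts = current_tstate;   /* the only lookup in this call tree */
    assert(ts != NULL);
    for (int i = 0; i < n; i++) {
        bump_explicit(ts);
    }
    return ts->calls;
}
```

With the explicit style, an embedder could in principle hand in whichever state object it manages itself instead of relying on the binding stored in thread-local storage; whether that is practical depends on the rest of the runtime.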