[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-18 Thread Mark Shannon




On 17/03/2020 7:00 pm, Steve Dower wrote:

On 17Mar2020 1803, Chris Angelico wrote:

On Wed, Mar 18, 2020 at 3:50 AM Mark Shannon  wrote:

The accessibility of a thread-local variable is a strict superset of
that of a function-local variable.

Therefore storing the thread state in a thread-local variable is at
least as capable as passing thread-state as a parameter.



And by that logic, globals are even more capable. I don't understand
your point. Isn't the purpose of the tstate parameters to avoid the
problem of being unable to have multiple tstates within the same OS
thread? I think I've missed something here.


You haven't. Separating the Python thread from the "physical" thread is 
indeed the point.


That seems like a strawman argument; maybe I'm misunderstanding Steve.
It seems to me that Steve is implying that using thread-local variables 
somehow prevents that separation. It does not.


The point I'm making is that adding `tstate` parameters everywhere is 
unnecessary. Using a thread local variable is much less intrusive and is 
at least as capable.


Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CLRCMGUC5LO3JOFGB2QG6DCJRTVVC3A4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-18 Thread Anders Munch
Chris Angelico [mailto:ros...@gmail.com]:
> And by that logic, globals are even more capable. I don't understand your
> point. Isn't the purpose of the tstate parameters to avoid the problem of
> being unable to have multiple tstates within the same OS thread? I think I've
> missed something here.

The point is that because threads can't preempt themselves, this:

/*1*/
{
Py_something(other_tstate);
}

and this:

/*2*/
{
   PyInterpreterState* old_tstate = tstate;
   tstate = other_state;
   Py_something();
   tstate = old_tstate;
}

is effectively equivalent, provided tstate is thread-local.  Both will work
equally well from the hypothetical C callback that wants to use a different
subinterpreter.

That wouldn't be true if tstate was process-wide, because that would be a race
condition, some other thread might change tstate.  But if tstate is
thread-local, there's no race condition.

Obviously /*1*/ is cleaner code than /*2*/, but /*2*/ is perfectly sound all 
the same.

regards, Anders
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NAD6DEOU2EYLBKOK3QACSP3MUMJ4NTOH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-18 Thread Antoine Pitrou
On Wed, 18 Mar 2020 13:35:16 +
Anders Munch  wrote:
> Chris Angelico [mailto:ros...@gmail.com]:
> > And by that logic, globals are even more capable. I don't understand your
> > point. Isn't the purpose of the tstate parameters to avoid the problem of
> > being unable to have multiple tstates within the same OS thread? I think 
> > I've
> > missed something here.  
> 
> The point is that because threads can't preempt themselves, this:
> 
> /*1*/
> {
> Py_something(other_tstate);
> }
> 
> and this:
> 
> /*2*/
> {
>PyInterpreterState* old_tstate = tstate;
>tstate = other_state;
>Py_something();
>tstate = old_tstate;
> }
> 
> is effectively equivalent, provided tstate is thread-local.  Both will work
> equally well from the hypothetical C callback that wants to use a different
> subinterpreter.

This example is mixing up the notion of interpreter state and thread
state.

Here is a more realistic example:

struct UserData {
  PyInterpreterState* interp;
};

// Callback given to a third-party C library
void my_c_callback(void* opaque) {
  struct UserData* user_data = (struct UserData*) opaque;
  PyGILState_STATE gstate;
  gstate = PyInterpreter_GILStateEnsure(user_data->interp);

  // ...

  PyInterpreter_GILStateEnsure(user_data->interp, gstate);
}


Of course this implies the existence of a PyInterpreter_GILState API
that currently doesn't exist.  In this model, the "thread-local" thread
state is a *per-interpreter* thread-local.  A *process-wide*
thread-local thread state cannot work if we want -- in some distant
future -- to be able to run several subinterpreters concurrently.


In any case, the amount of disagreement and/or misunderstanding in this
discussion is a strong hint that it needs a PEP to hash things out and
explain them clearly, IMHO.

Regards

Antoine.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EKZJYO75QBDKZA4A2RR43PKDYEJRD7HC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-18 Thread Anders Munch
Antoine Pitrou [mailto:solip...@pitrou.net]:
> This example is mixing up the notion of interpreter state and thread state.

Sorry about that, I was making a more general point and not paying so much
attention to the specific names.

The general point is this: A thread-local variable can work as an implied
parameter without creating synchronisation problems.  You do not have to worry
about painting yourself into an architectural corner by using thread-local
variables, because anything you can do with an explicit parameter, you could
also have done with an implied one in thread-local memory.

Correctness is not at stake.  Either approach will work.

regards, Anders
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CUZQ4NYBPY25M5ISJ5JT66XMW7KWP4DJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-18 Thread Barry Scott



> On 17 Mar 2020, at 16:43, Mark Shannon  wrote:
> 
> 
> 
> On 17/03/2020 3:38 pm, Steve Dower wrote:
>> On 17Mar2020 1447, Mark Shannon wrote:
>>> On 16/03/2020 3:04 pm, Victor Stinner wrote:
 In short, the answer is yes.
>>> 
>>> I said "no" then and gave reasons. AFAICT no one has faulted my reasoning.
>> I said "yes" then and was also not faulted.
> 
> I'll do that now then ;)
> 
> The accessibility of a thread-local variable is a strict superset of that of 
> a function-local variable.
> 
> Therefore storing the thread state in a thread-local variable is at least as 
> capable as passing thread-state as a parameter.
> 
>>> Let me reiterate why using a thread-local variable is better than passing 
>>> the thread state down the C stack.
>>> 
>>> 1. Using a thread-local variable for the thread state requires much smaller 
>>> changes to the code base.
>> Using thread-local variables enforces a threading model on the host 
>> application, rather than working with the existing threading model. So 
>> anyone embedding is forced into *significantly* more code as a result.
> 
> Putting a value in a function-local variable enforces stronger restrictions 
> than putting it in a thread-local variable.
> 
> I am proposing that we *don't* change the API. How does that make more work 
> for anyone using the API?

Are you saying that all interpreters can use the same thread-local-variable for 
tstate?

Or do you need N thread-local-variables for N interpreters?

Barry


> 
>> We can (and should) maintain a public-facing API that uses TLS to pass the 
>> current thread state around - we have compatibility constraints. But we can 
>> also add private versions that take the thread state (once you've started 
>> trying/struggling to really embed CPython, you'll happily take a dependency 
>> on "private" APIs).
> 
> Again, I am requesting that we *don't* change the API.
> Not changing the API maintains backwards compatibility better than changing 
> it, surely.
> 
>> If the only available API requires TLS, then you're likely to see the caller 
>> wrap it all up in a function that updates TLS before calling. Or 
>> alternatively, introduce dedicated threads for running Python snippets on, 
>> and all the (dead)locking that results (yes, I've done both).
> 
> All the platforms that we support have thread-local storage.
> If a platform doesn't have threads at all, then thread-local just degenerates 
> to a global.
> 
>> Our goal as core CPython developers should be to sacrifice our own effort to 
>> reduce the effort needed by our users, not to do things that make our own 
>> lives easier but harm them.
> 
> Indeed. We might want to speed Python up a bit as well :)
> 
>>> 2. Using a thread-local variable is less error prone. When passing tstate 
>>> as a parameter, what happens if the tstate argument is from a different 
>>> thread or is NULL? Are you adding checks for those cases?
>>> What are the performance implications of adding those checks?
>> Undefined behaviour is totally acceptable here. We can assert in debug 
>> builds - developers who make use of this can test with debug builds.
> 
> I'm not sure what your point is about undefined behaviour.
> 
>>> 3. Using a thread-local variable is likely to be a little bit faster. 
>>> Passing an argument down the stack increases register pressure and spills.
>>> Accessing a thread-local is slower at the point of access, but the cost is 
>>> incurred only when it is needed, so is cheaper overall.
>> Compilers can optimise parameters/locals in ways that are far more efficient 
>> than they can do for anything stored outside the call stack. Especially for 
>> internal calls. Going through public/exported functions is a little more 
>> restricted in terms of optimisations, but if we identify an issue here then 
>> we can work on that then.
> 
> Please skip the patronizing "how compilers work" stuff.
> I know how register allocators work.
> 
>> [OTHER POST]
>>> Just to be clear, this is what I mean by a thread local variable:
>>> https://godbolt.org/z/dpSo-Q
>> Showing what one particular compiler generates for one particular situation 
>> is terrible information (I won't bother calling it "evidence").
> 
> The particular situation is the use of a thread-local variable, which is the 
> point under discussion.
> 
> Here's the links for clang and MSVC:
> 
> https://godbolt.org/z/YnbbqD
> https://www.godbolt.ms/z/9nQEqf
> 
 One motivation is to ease the implementation of subinterpreters (PEP
 554). But PEP 554 describes more than public API than the
 implementation.
>>> 
>>> I don't see how this eases the implementation of subinterpreters.
>>> Surely it makes it harder by causing merge conflicts.
>> That's a very selfish point-of-view :)
> 
> Why? Merge conflicts are a problem for everyone.
> 
>> It eases it because many more operations need to know the current Python 
>> "thread" in order to access things that used to be globals, such as 
>> PyTypeObject instance

[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-18 Thread Kyle Stanley
Mark Shannon wrote:
> The point I'm making is that adding `tstate` parameters everywhere is
> unnecessary. Using a thread local variable is much less intrusive and is
> at least as capable.

Objectively speaking, what would be the specific difference(s) in behavior
and performance between using a thread-local variable vs passing a tstate
parameter? A more in-depth "pros vs cons" analysis of each approach might
help to make the argument points more clear, as it seems some of the
parties involved in the discussion are talking past each other to some
degree, or at least aren't completely on the same page.

It might also help to find a specific C-API function to benchmark, to show
how substantial the supposed performance differences could be. Intuitively,
it seems like passing around an extra parameter would add some penalty, but
it's not clear to me as to how much that would *realistically* add in
performance cost compared to accessing the threadstate through a
thread-local variable. It seems like it depends on exactly how often
`_ThreadState_GET()` would have to be called, but that's not at all obvious
considering the scope of the changes.

> Using a thread local variable is much less intrusive and is
> at least as capable.

I'm not sure that I'm clear on how the tstate parameter would be
additionally intrusive, considering that the changes only affect the
internal parts of the C-API. I would agree if it were to be arbitrarily
added to public API since it would break backwards compatibility, but that
doesn't seem to be the case here. From the perspective of being intrusive,
it seems like it's adding some code churn and a potential merge conflict to
resolve locally. That seems like an inconvenience to us rather than
something that would be intrusive to users.

Also, regarding the thread-local variable example demonstrated w/ godbolt
(thanks for providing those :-) ), it seems like it would be very clear to
access the *current* threadstate that was created by the *main*
interpreter, but how would you go about resolving the following?:

1) A different threadstate from another interpreter and OS thread
2) A different threadstate created from the same OS thread, but from a
different interpreter (as Antoine mentioned earlier)

>From my understanding, those scenarios seem to be a significant concern in
the context of subinterpreters. If we could also see example(s) which
address those scenarios with a thread-local variable instead of a tstate
parameter, I think it would allow for more objective comparison between
them.

Regards,
Kyle Stanley

On Wed, Mar 18, 2020 at 6:36 AM Mark Shannon  wrote:

>
>
> On 17/03/2020 7:00 pm, Steve Dower wrote:
> > On 17Mar2020 1803, Chris Angelico wrote:
> >> On Wed, Mar 18, 2020 at 3:50 AM Mark Shannon  wrote:
> >>> The accessibility of a thread-local variable is a strict superset of
> >>> that of a function-local variable.
> >>>
> >>> Therefore storing the thread state in a thread-local variable is at
> >>> least as capable as passing thread-state as a parameter.
> >>>
> >>
> >> And by that logic, globals are even more capable. I don't understand
> >> your point. Isn't the purpose of the tstate parameters to avoid the
> >> problem of being unable to have multiple tstates within the same OS
> >> thread? I think I've missed something here.
> >
> > You haven't. Separating the Python thread from the "physical" thread is
> > indeed the point.
>
> That seems like a strawman argument; maybe I'm misunderstanding Steve.
> It seems to me that Steve is implying that using thread-local variables
> somehow prevents that separation. It does not.
>
> The point I'm making is that adding `tstate` parameters everywhere is
> unnecessary. Using a thread local variable is much less intrusive and is
> at least as capable.
>
> Cheers,
> Mark.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/CLRCMGUC5LO3JOFGB2QG6DCJRTVVC3A4/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LI7BD63VNAUTNC4JQ7VU7UZLH7F4LEOC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proliferation of tstate arguments.

2020-03-18 Thread Kyle Stanley
Antoine Pitrou wrote:
> In any case, the amount of disagreement and/or misunderstanding in this
> discussion is a strong hint that it needs a PEP to hash things out and
> explain them clearly, IMHO.

Agreed; a PEP (even if it's just informational) would go a long way in
helping to clear up some misunderstandings. Perhaps something along the
lines of "Supporting concurrent subinterpreters in the C-API" that provides
both a high-level overview and low-level information on some of the
different structures involved and any needed changes [1], such as to
PyThreadState, PyInterpreterState, PyGILState_STATE, etc. Optimally, it
would explain some changes made and needed changes to their public and
private APIs, as well as to the struct members.

I'd be happy to help with something like this (particularly in reviewing
and providing feedback on any unclear parts), but my own current
understanding is likely not strong enough to lead the efforts. In fact, I
very likely am in the camp of having some significant misunderstandings
about the low-level implementation details. My experience with
subinterpreters [2] is mostly limited to attempting to help with some
tricky bugs. I'd call it an "area of interest", but it's very far from an
"area of expertise". ;-)

In the meantime for anyone looking for a basic overview on thread state and
interpreter state, there's some general information documented in
https://docs.python.org/3.9/c-api/init.html#high-level-api.

---

[1] - By "needed changes", I am specifically referring to any changes to
the C-API or internals that were made or are still needed to support
concurrent subinterpreters.

[2] - I have much more experience in the existing areas of concurrency in
the standard library, such as with asyncio and concurrent.futures. Although
it's relevant, it's very different in terms to implementation details.
Also, I'm substantially more interested in the Python parts of the stdlib
rather than the C internals or extension modules, but I certainly have an
interest in learning more.

Regards,
Kyle Stanley

On Wed, Mar 18, 2020 at 11:53 AM Antoine Pitrou  wrote:

> On Wed, 18 Mar 2020 13:35:16 +
> Anders Munch  wrote:
> > Chris Angelico [mailto:ros...@gmail.com]:
> > > And by that logic, globals are even more capable. I don't understand
> your
> > > point. Isn't the purpose of the tstate parameters to avoid the problem
> of
> > > being unable to have multiple tstates within the same OS thread? I
> think I've
> > > missed something here.
> >
> > The point is that because threads can't preempt themselves, this:
> >
> > /*1*/
> > {
> > Py_something(other_tstate);
> > }
> >
> > and this:
> >
> > /*2*/
> > {
> >PyInterpreterState* old_tstate = tstate;
> >tstate = other_state;
> >Py_something();
> >tstate = old_tstate;
> > }
> >
> > is effectively equivalent, provided tstate is thread-local.  Both will
> work
> > equally well from the hypothetical C callback that wants to use a
> different
> > subinterpreter.
>
> This example is mixing up the notion of interpreter state and thread
> state.
>
> Here is a more realistic example:
>
> struct UserData {
>   PyInterpreterState* interp;
> };
>
> // Callback given to a third-party C library
> void my_c_callback(void* opaque) {
>   struct UserData* user_data = (struct UserData*) opaque;
>   PyGILState_STATE gstate;
>   gstate = PyInterpreter_GILStateEnsure(user_data->interp);
>
>   // ...
>
>   PyInterpreter_GILStateEnsure(user_data->interp, gstate);
> }
>
>
> Of course this implies the existence of a PyInterpreter_GILState API
> that currently doesn't exist.  In this model, the "thread-local" thread
> state is a *per-interpreter* thread-local.  A *process-wide*
> thread-local thread state cannot work if we want -- in some distant
> future -- to be able to run several subinterpreters concurrently.
>
>
> In any case, the amount of disagreement and/or misunderstanding in this
> discussion is a strong hint that it needs a PEP to hash things out and
> explain them clearly, IMHO.
>
> Regards
>
> Antoine.
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/EKZJYO75QBDKZA4A2RR43PKDYEJRD7HC/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/T4EC2LXRCGTIN222HNILLVKUSAOZ3BGA/
Code of Conduct: http://python.org/psf/codeofconduct/