On Tue, Oct 17, 2017 at 9:40 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 18 October 2017 at 05:55, Yury Selivanov <yselivanov...@gmail.com>
> wrote:
>
>> I actually like what you did in
>> https://github.com/gvanrossum/pep550/blob/master/simpler.py, it seems
>> reasonable. The only thing that I'd change is to remove "set_ctx"
>> from the public API and add "Context.run(callable)". This makes the
>> API more flexible to potential future changes and amendments.
>>
>
> Yep, with that tweak, I like Guido's suggested API as well.
>

I've added the suggested Context.run() method.

> Attempting to explain why I think we want "Context.run(callable)" rather
> than "context_vars.set_ctx()" by drawing an analogy to thread local storage:
>
> 1. In C, the compiler & CPU work together to ensure you can't access
> another thread's thread locals.
>

But why is that so important? I wouldn't recommend doing it, but it might
be handy for a debugger to be able to inspect a thread's thread-locals. As
it is, it seems a debugger can only access thread-locals for the thread in
which the debugger itself runs. It has better access to the real locals on
the thread's stack of frames!

> 2. In Python's thread locals API, we do the same thing: you can only get
> access to the running thread's thread locals, not anyone else's
>

But there's no real benefit in this. In C, I could imagine a compiler
optimizing access to thread-locals, but in Python that's moot.

> At the Python API layer, we don't expose the ability to switch explicitly
> to another thread state while remaining within the current function.
> Instead, we only offer two options: starting a new thread, and waiting for
> a thread to finish execution. The lifecycle of the thread local storage is
> then intrinsically linked to the lifecycle of the thread it belongs to.
>

To me this feels more a side-effect of the implementation (perhaps
inherited from C's implementation) than an intentional design.

To be clear, I think it's totally fine for *clients* of the ContextVar API
-- e.g. numpy or decimal -- to assume that their context doesn't change
arbitrarily while they're happily executing in a single frame or calling
stuff they trust not to change the context. (IOW all changes to a
particular ContextVar would be through that ContextVar object, not through
behind-the-scenes manipulation of the thread's current context).

But for *frameworks* (e.g. asyncio or Twisted) I find it simpler to think
about the context in terms of `set_ctx` and `get_ctx`, and I worry that
*hiding* these might block off certain API design patterns that some
framework might want to use -- who knows, maybe Nathaniel (who is fond of
`with`
<http://trio.readthedocs.io/en/latest/reference-core.html#a-simple-timeout-example>)
might come up with a context manager to run a block of code in a different
context (perhaps cloned from the current one).

> That intrinsic link makes various aspects of thread local storage easier
> to reason about, since the active thread state can't change in the middle
> of a running function - even if the current thread gets suspended by the
> OS, resuming the function also implies resuming the original thread.
>

I don't feel reasoning would be much impaired. When reasoning about code we
make assumptions that are theoretically unsafe all the time (e.g. "nobody
will move the clock back").

> Including a "contextvars.set_ctx" API would be akin to making
> PyThreadState_Swap a public Python-level API, rather than only exposing
> _thread.start_new_thread the way we do now.
>

It's different for threads, because they are the bedrock of execution, and
nobody is interested in implementing their own threading framework that
doesn't build on this same bedrock.
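
To make the context-manager idea above concrete, here is a toy sketch of what a framework could build on a get_ctx/set_ctx pair. Everything in it is an assumption made for illustration: get_ctx, set_ctx and running_in are hypothetical stand-ins, and a module-level dict plays the role of the thread's current context. It is not the simpler.py implementation.

```python
from contextlib import contextmanager

# Hypothetical stand-ins for a get_ctx()/set_ctx() pair. A module-level
# variable models "the thread's current context"; this is a toy model only.
_current_ctx = {}


def get_ctx():
    """Return the currently active (toy) context."""
    return _current_ctx


def set_ctx(ctx):
    """Make `ctx` the active (toy) context."""
    global _current_ctx
    _current_ctx = ctx


@contextmanager
def running_in(ctx):
    """Run the body of a `with` block in `ctx`, then restore the old context."""
    saved = get_ctx()
    set_ctx(ctx)
    try:
        yield ctx
    finally:
        # Always switch back, even if the block raised.
        set_ctx(saved)


outer = get_ctx()
clone = dict(outer, precision=10)  # cloned from the current one, plus a change

with running_in(clone):
    inside = get_ctx()  # the clone is active inside the block

after = get_ctx()  # the original context is active again
```

The try/finally is the part a bare set_ctx caller would have to remember on their own; wrapping it once in a context manager is one way a framework could offer the switch without exposing the "forgot to switch back" failure mode.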

> One reason we *don't* do that is because it would make thread locals much
> harder to reason about - every function call could have an implicit side
> effect of changing the active thread state, which would mean the thread
> locals at the start of the function could differ from those at the end of
> the function, even if the function itself didn't do anything to change them.
>

Hm. Threads are still hard to reason about, because for everything *but*
thread-locals there is always the possibility that it's being mutated by
another thread... So I don't think we should get our knickers twisted over
thread-local variables.

> Only offering Context.run(callable) provides a similar "the only changes
> to the execution context will be those this function, or a function it
> called, explicitly initiates" protection for context variables, and Guido's
> requested API simplifications make this aspect even easier to reason about:
> after any given function call, you can be certain of being back in the
> context you started in, because we wouldn't expose any Python level API
> that allowed an execution context switch to persist beyond the frame that
> initiated it.
>

And as long as you're not calling something that's a specific framework's
API for messing with the context, that's a fine assumption. I just don't
see the need to try to "enforce" this by hiding the underlying API.
(Especially since I presume that at the C API level it will still be
possible -- else how would Context.run() itself be implemented?)

> ====
>
> The above is my main rationale for preferring contextvars.Context.run() to
> contextvars.set_ctx(), but it's not the only reason I prefer it.
>
> At a more abstract design philosophy level, I think the distinction
> between symmetric and asymmetric coroutines is relevant here [2]:
>
> * in symmetric coroutines, there's a single operation that says "switch to
> running this other coroutine"
> * in asymmetric coroutines, there are separate operations for starting or
> resuming a coroutine and for suspending the currently running one
>
> Python's native coroutines are asymmetric - we don't provide a "switch to
> this coroutine" primitive, we instead provide an API for starting or
> resuming a coroutine (via cr.__next__(), cr.send() & cr.throw()), and an
> API for suspending one (via await).
>
> The contextvars.set_ctx() API would be suitable for symmetric coroutines,
> as there's no implied notion of parent context/child context, just a notion
> of switching which context is active.
>
> The Context.run() API aligns better with asymmetric coroutines, as there's
> a clear distinction between the parent frame (the one initiating the
> context switch) and the child frame (the one running in the designated
> context).
>

Sure. But a *framework* might build something different.

> As a practical matter, Context.run also composes nicely (in combination
> with functools.partial) for use with any existing API based on submitting
> functions for delayed execution, or execution in another thread or process:
>
> - sched
> - concurrent.futures
> - arbitrary callback APIs
> - method based protocols (including iteration)
>
> By contrast, "contextvars.set_ctx" would need various wrappers to handle
> correctly reverting the context change, and would hence be prone to
> "changed the active context without changing it back" bugs (which can be
> especially fun when you're dealing with a shared pool of worker threads or
> processes).
>

So let's have both.

> Cheers,
> Nick.
>
> [1] Technically C extensions can play games with this via
> PyThreadState_Swap, but I'm not going to worry about that here
> [2] https://stackoverflow.com/questions/41891989/what-is-the-difference-between-asymmetric-and-symmetric-coroutines
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
>

--
--Guido van Rossum (python.org/~guido)
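
For readers who want to see the "Context.run composes with functools.partial" point from the thread in action, here is a small sketch. It assumes the contextvars module as it later shipped in the Python 3.7 standard library, which may differ in detail from the simpler.py draft discussed above. The names request_id, handler and mutate_and_read are made up for the example.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Hypothetical context variable used only for this illustration.
request_id = contextvars.ContextVar("request_id", default="unset")


def handler(tag):
    # Reads whatever value is current in the context it runs under.
    return f"{tag}:{request_id.get()}"


def mutate_and_read():
    # This change is made inside whichever context run() activated.
    request_id.set("changed-inside")
    return request_id.get()


request_id.set("req-1")
snapshot = contextvars.copy_context()  # captures request_id == "req-1"
request_id.set("req-2")                # later change, not seen by snapshot

with ThreadPoolExecutor(max_workers=1) as pool:
    # partial() packs "run handler('job') inside snapshot" into one callable
    # that any submit-a-function API can accept.
    in_snapshot = pool.submit(partial(snapshot.run, handler, "job")).result()

# Changes made inside run() stay in the context they were made in.
inner_value = contextvars.copy_context().run(mutate_and_read)
outer_value = request_id.get()
```

In this sketch the caller never has to switch back by hand: once run() returns, the calling code is in the context it started in, which is the property Nick's argument relies on.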