[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

Nathaniel Smith Mon, 19 Apr 2021 18:54:56 -0700

On Mon, Apr 5, 2021 at 9:48 AM Irit Katriel <iritkatr...@googlemail.com> wrote:
> On Mon, Apr 5, 2021 at 11:01 AM Nathaniel Smith <n...@pobox.com> wrote:
>> - I'm uncomfortable with how in some contexts we treat EG's as placeholders 
>> for the contained exceptions, and other places we treat them like a single 
>> first-class exceptions. (Witness all the feedback about "why not just catch 
>> the ExceptionGroup and handle it by hand?", and imagine writing the docs 
>> explaining all the situations where that is or isn't a good idea and the 
>> pitfalls involved...) If we could somehow pick one and stick to it then I 
>> think that would make it easier for users to grasp. (My gut feeling is that 
>> making them pure containers is better, which to me is why it makes sense for 
>> them to be @final and why I keep hoping we can figure out some better way 
>> for plain 'except' and EGs to interact.)
>
>
> I'm not sure what you mean by "a placeholder". An exception group is an 
> exception, and except* is a new syntax that helps you manipulate exception 
> groups.  I don't think the confusion you mention was due to the fact that 
> except can catch EGs, but rather due to the fact that the simplest examples 
> don't show you why except is not good enough, and why we need except*.
>
> Suppose we did as you suggest and made exception group a container which is 
> not exception. Now suppose that an exception is raised in an except* block. 
> What is the context of this exception? Do we change the requirement that an 
> exception's context is always an exception?
>
> How do you raise an exception group if it's not an Exception? Do we go back 
> to allowing random objects being raised?


Ah, right. I'm talking about the conceptual/semantic level, not the
implementation. For the implementation, backcompat certainly means we
need some single object that can represent the whole group of
exceptions, so that it can be returned by sys.exc_info(), C code can
access it through the tstate, etc. But the more important question is
how we explain this thing to users, and make it easy for them to do
the right thing. Is the EG the exception they're trying to catch or
otherwise work with? or is the EG just a detail that most users should
ignore because they're focused on the leaf exceptions inside?

For the concurrency use case (asyncio/trio/etc.), it's all about the
leaf exceptions. ExceptionGroup([SomeError()]) and SomeError() both
mean exactly the same thing, and treating them differently is
basically always a bug. The ExceptionGroup object doesn't tell you
anything about what went wrong, like normal exceptions do. It just
tells you that there was some frame in between where the exception was
raised and where it was caught where some other exception *could* have
happened, but didn't. You *can* give 'except ExceptionGroup' a
meaning, but that meaning doesn't make sense -- it's like saying "I
want to catch any exceptions that was raised by code that *could* have
raised a different exception instead" or "I want to catch all
exceptions whose traceback contains an entry that's a certain subclass
of TracebackType". Similarly, it doesn't make sense to attach error
strings to an EG, or define custom subclasses, etc.. Ideally, plain
'except' would (somehow) handle 'EG([SomeError()])' and 'SomeError()'
in exactly the same way; and if it doesn't, then using plain 'except'
in concurrent code is going to usually be a bug.

For other use cases, it does make sense to think of EG as a regular
exception. Like, if Hypothesis wants to report that it ran some tests
and there were failures, then modelling that as a single
HypothesisError seems like a nice API. You never need 'except*'
semantics to catch part of a HypothesisError. 'except HypothesisError'
is totally sensible. The only real value you get from the PEP is that
if you *don't* catch the HypothesisError, then the default traceback
machinery can now automatically include info about the individual test
failures.

If we were starting from scratch, I don't think this would be a big
conflict. I think the PEP would focus 100% on the concurrency case.
That's the one that's really difficult to solve without language
support. And, if you solve that, then there's a very natural way to
handle the Hypothesis case too: do 'raise HypothesisError from
EG(individual failures)'. That makes it explicit that HypothesisError
is a single exception that should be handled as a unit, while still
letting you introspect the individual exceptions and including them in
the default traceback output. (In fact Trio uses this trick right now,
in it's TCP connection code [1]. We report a single
OSError("connection failed"), but then include all the details about
each individual failure in its __cause__.)

The actual problem is that we're not starting from scratch; we're
trying to retrofit the concurrency-style semantics onto a system
that's currently completely focused around the Hypothesis-style
semantics. So the PEP ends up with a kind of mish-mash of the two
approaches -- 'except*' gives you the semantics that concurrency use
cases want, 'except' gives you the semantics that Hypothesis-style use
cases want, and users are forced to understand all these subtleties
and figure out which one is appropriate on a case-by-case basis. It's
not very "one obvious way to do it".

And again -- I'm *not* saying the PEP sucks or anything like that.
Backcompat is a real problem, and the PEP might already be the best it
can be given the tradeoffs. I'm just explaining why I want to make
sure we consider all the alternatives.

[1] 
https://github.com/python-trio/trio/blob/cb06242a78d850baaba4057f7dbd22d2a21561f7/trio/_highlevel_open_tcp_stream.py#L361-L367

>> - If a function wants to start using concurrency internally, then now *all* 
>> its exceptions have to get wrapped in EGs and callers have to change *all* 
>> their exception handling code to use except* or similar. You would think 
>> this was an internal implementation detail that the caller shouldn't have to 
>> care about, but instead it forces a major change on the function's public 
>> API. And this is because regular 'except' can't do anything useful with EGs.
>
> I don't understand this point. If you are using concurrency internally and 
> don't want to raise EGs externally, then surely you will catch EGs, select 
> one of the exceptions to raise and throw away all the others like in the old 
> days when there weren't EGs (which is presumably what your caller is 
> expecting).  In other words, if you make drastic changes in a function's 
> implementation, it is your responsibility to ensure that the old API is 
> preserved.

The problem is that most of the time, even if you're using concurrency
internally so multiple things *could* go wrong at once, only one thing
actually *does* go wrong. So it's unfortunate if some existing code is
prepared for a specific exception that it knows can be raised, that
exact exception is raised... and the existing code fails to catch it
because now it's wrapped in an EG.

>> == Most important comment ==
>>
>> Flat ExceptionGroups: there were two basic design approaches we discussed 
>> last year, which I'll call "flat" vs "nested". The current PEP uses the 
>> nested design, where ExceptionGroups form a tree, and traceback information 
>> is distributed in pieces over this tree. This is the source of a lot of the 
>> complexity in the current PEP: for example, it's why EG's don't have one 
>> obvious iteration semantics, and it's why once an exception is wrapped in an 
>> EG, it can never be unwrapped again (because it would lose traceback 
>> information).
>>
>> The idea of the "flat" design is to instead store all the traceback info 
>> directly on the leaf exceptions, so the EG itself can be just a pure 
>> container holding a list of exceptions, that's it, with no nesting. The 
>> downside is that it requires changes to the interpreter's code for updating 
>> __traceback__ attributes, which is currently hard-coded to only update one 
>> __traceback__ at a time.
>>
>> For a third-party library like Trio, changing the interpreter is obviously 
>> impossible, so we never considered it seriously. But in a PEP, changing the 
>> interpreter is possible. And now I'm worried that we ruled out a better 
>> solution early on for reasons that no longer apply. The more I think about 
>> it, the more I suspect that flat EGs would end up being substantially 
>> simpler all around? So I think we should at least think through what that 
>> would look like (and Irit, I'd love your thoughts here now that you're the 
>> expert on the CPython details!), and document an explicit decision one way 
>> or another. (Maybe we should do a call or something to go over the details? 
>> I'm trying to keep this email from ballooning out of control...)
>
> A Flat EG is in essence a "denormalized EG" - the tracebacks don't share 
> parts, each leaf exception has its own linked list of frames as its traceback.

Yes! "Denormalized" is a perfect word for what I'm talking about :-).

> It is easy enough to write a denormalize() function in traceback.py that 
> constructs this from the current EG structure, if you need it (use the 
> leaf_generator from the PEP). I'm not sure I see why we should trouble the 
> interpreter with this.

In the current design, once an exception is wrapped in an EG, then it
can never be unwrapped, because its traceback information is spread
across the individual exception + the EG tree around it. This is
confusing to users ("how do I check errno?"), and makes the design
more complicated (the need for topology-preserving .split(), the
inability to define a sensible EG.__iter__, ...). The advantage of
making the denormalized form the native form is that now the leaf
exceptions would be self-contained objects like they are now, so you
don't need EG nesting at all, and users can write intuitive code like:

except OSError as *excs:
    remainder = [exc for exc in excs if exc.errno != ...]
    if remainder:
        raise ExceptionGroup(remainder)

> For display purposes, it is probably nicer to look at a normalized traceback 
> where common parts are not repeated.

Yeah, I agree; display code would want to re-normalize before
printing. But now it's only the display code that needs to care about
figuring out shared parts of the traceback, rather than something that
has to be maintained as an invariant everywhere.

>> - In my original proposal, EGs didn't just hold a list of exceptions, but 
>> also a list of "origins" for each exception. The idea being that if, say, 
>> you attempt to connect to a host with an IPv4 address and an IPv6 address, 
>> and they raised two different OSErrors that got bundled together into one 
>> EG, then it would be nice to know which OSError came from which attempt. Or 
>> in asyncio/trio it would be nice if tracebacks could show which task each 
>> exception came from. It seems like this got dropped at some point?
>
> I don't remember seeing this, so it wasn't deliberately dropped.   It's not 
> that we copied MultiError, or your sketches for MultiError2, and then 
> modified them into what is now in the PEP. We learned from your experience, 
> particularly the MutliError2 document which spells out the aspects of 
> MultiError that didn't work well. But we designed ExceptionGroup pretty much 
> from first principles (and then redesigned it several times as we learned our 
> own lessons) . So you shouldn't assume that anything we didn't take from 
> MultiError or the MultiError2 sketch was deliberately dropped. We may have 
> just not paid attention to all of the details.

Huh! Then it's interesting that after all that you ended up with
almost exactly the same design as MultiError v2 :-).

>>   On further consideration, I think this might be better handled as a 
>> special kind of traceback entry that we can attach to each exception, that 
>> just holds some arbitrary text that's inserted into the traceback at the 
>> appropriate place? But either way, I think it would be good to be able to 
>> attach this kind of information to tracebacks somehow.
>
> It sounds like you want some way to enrich exceptions. This should be 
> optional (we wouldn't want EG to require additional metadata for exceptions) 
> so yeah, I agree it should sit on the leaf exception and not on the group. In 
> that sense it's orthogonal to this PEP.

Well, the extra metadata would specifically be at "join" points in the
traceback, which are a thing that this PEP is creating :-). And it's
useful for every user of EGs, since by definition, an EG is
multiplexing exceptions from multiple sources, so it's nice to tell
the user which sources those are.

That said, you're right, if we want to handle this by defining a new
kind of traceback entry that code like Trio/asyncio/hypothesis can
manually attach to exceptions, then that could be written as a
separate-but-complementary PEP. In my original design, instead of
defining a new kind of traceback entry, I was storing this on the EG
itself, so that's why I was thinking about it needing to be part of
this PEP.

>> - Recording pre-empted exceptions: This is another type of metadata that 
>> would be useful to print along with the traceback. It's non-obvious and a 
>> bit hard to explain, but multiple trio users have complained about this, so 
>> I assume it will bite asyncio users too as soon as TaskGroups are added. The 
>> situation is, you have a parent task P and two child tasks C1 and C2:
>>
>>         P
>>        / \
>>       C1  C2
>>
>>   C1 terminates with an unhandled exception E1, so in order to continue 
>> unwinding, the nursery/taskgroup in P cancels C2. But, C2 was itself in the 
>> middle of unwinding another, different exception E2 (so e.g. the 
>> cancellation arrived during a `finally` block). E2 gets replaced with a 
>> `Cancelled` exception whose __context__=E2, and that exception unwinds out 
>> of C2 and the nursery/taskgroup in P catches the `Cancelled` and discards 
>> it, then re-raises E1 so it can continue unwinding.
>>
>>   The problem here is that E2 gets "lost" -- there's no record of it in the 
>> final output. Basically E1 replaced it. And that can be bad: for example, if 
>> the two children are interacting with each other, then E2 might be the 
>> actual error that broke the program, and E1 is some exception complaining 
>> that the connection to C2 was lost. If you have two exceptions that are 
>> triggered from the same underlying event, it's a race which one survives.
>>
>>   This is conceptually similar to the way an exception in an 'except' block 
>> used to cause exceptions to be lost, so we added __context__ to avoid that. 
>> And just like for __context__, it would be nice if we could attach some info 
>> to E1 recording that E2 had happened and then got preempted. But I don't see 
>> how we can reuse __context__ itself for this, because it's a somewhat 
>> different relationship: __context__ means that an exception happened in the 
>> handler for another exception, while in this case you might have multiple 
>> preempted exceptions, and they're associated with particular points in the 
>> stack trace where the preemption occurred.
>>
>>   This is a complex issue and maybe we should call it out-of-scope for the 
>> first version of ExceptionGroups. But I mention it because it's a second 
>> place where adding some extra annotations to the traceback info would be 
>> useful, and maybe we can keep it simple by adding some minimal hooks in the 
>> core traceback machinery and let libraries like trio/asyncio handle the 
>> complicated parts?
>
> Isn't this something that Trio/asyncio should handle, as in: if you catch  
> Cancelled that has a __context__ then you need to put that __context__ into 
> the exception group you are raising so that it's not lost?

It's something that Trio/asyncio would handle manually, yeah, but
Trio/asyncio need somewhere to put the metadata, that the traceback
printing code will look at :-). Something like a __preempted__ field
on exceptions, or another kind of extended traceback entry ("when
passing through this frame, this exception preempted another
exception").

-n

--
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZDQPOIP5GS4NGREWUGXWX7ZCE4NKQYSD/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

Reply via email to