[Python-Dev] Re: Critique of PEP 657 -- Include Fine Grained Error Locations in Tracebacks

Nathaniel Smith Mon, 17 May 2021 17:40:54 -0700

On Mon, May 17, 2021 at 6:18 AM Mark Shannon <m...@hotpy.org> wrote:
> 2. Repeated binary operations on the same line.
>
> A single location can also be clearer when all the code is on one line.
>
> i1 + i2 + s1
>
> PEP 657:
>
> i1 + i2 + s1
> ^^^^^^^^^^^^
>
> Using a single location:
>
> i1 + i2 + s1
>          ^


It's true this case is a bit confusing with the whole operation span
highlighted, but I'm not sure the single location version is much better. I
feel like a Really Good UI would highlight the two operands in different
colors or something, or at least separately underline the two items whose
types are incompatible:

TypeError: unsupported operand type(s) for +: 'int' + 'str':
i1 + i2 + s1
^^^^^^^   ~~
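
A rough sketch of how a formatter could produce that two-part underline once
it has the AST node for the failing operation. This is only an illustration:
the helper name underline_operands is made up, and it assumes a single-line,
ASCII-only expression (col_offset counts UTF-8 bytes) on Python 3.8+, where
end_col_offset is available.

```python
import ast

def underline_operands(source: str) -> str:
    """Return the source line plus a marker line with '^' under the
    left operand and '~' under the right operand of a binary operation.
    Hypothetical helper; assumes one ASCII line of source."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.BinOp):
        raise ValueError("expected a binary operation")
    marks = [" "] * len(source)
    for operand, char in ((node.left, "^"), (node.right, "~")):
        # Mark every column covered by this operand.
        for col in range(operand.col_offset, operand.end_col_offset):
            marks[col] = char
    return source + "\n" + "".join(marks).rstrip()

print(underline_operands("i1 + i2 + s1"))
```

Because + is left-associative, the outermost operation has i1 + i2 as its
left operand and s1 as its right operand, which is what gets underlined.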

More generally, these error messages are the kind of thing where the UI can
always be tweaked to improve further, and those tweaks can make good use of
any rich source information that's available.

So, here's another option to consider:

- When parsing, assign each AST node a unique, deterministic id (e.g.
sequentially across the AST from top-to-bottom, left-to-right).
- For each bytecode offset, store the corresponding AST node id in an
lnotab-like table.
- When displaying a traceback, we already need to go find and read the
original .py file to print source code at all. Re-parse it, and use the ids
to find the original AST node, in context with full structure. Let the
traceback formatter do whatever clever stuff it wants with this info.
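
The steps above could be sketched roughly like this with the stdlib ast
module. It's a toy under stated assumptions: number_nodes is a hypothetical
helper, the "stored id" is just a local variable standing in for an entry in
the lnotab-like table, and a pre-order walk is only one possible
deterministic numbering.

```python
import ast

def number_nodes(tree: ast.AST) -> dict:
    """Map sequential ids to nodes using a deterministic pre-order walk
    (parents before children, children in field order)."""
    nodes = {}

    def visit(node):
        nodes[len(nodes)] = node
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return nodes

source = "total = i1 + i2 + s1\n"

# "Compile time": number the nodes and remember the id of the operation
# that a given bytecode offset came from (here, the outermost BinOp).
compile_nodes = number_nodes(ast.parse(source))
stored_id = next(i for i, n in compile_nodes.items() if isinstance(n, ast.BinOp))

# "Traceback time": re-parse the same source and look the node up by id.
found = number_nodes(ast.parse(source))[stored_id]
print(ast.get_source_segment(source, found))        # whole operation
print(ast.get_source_segment(source, found.left))   # left operand
print(ast.get_source_segment(source, found.right))  # right operand
```

Since the formatter gets back a real node with its children, it can pick out
the operands, the operator, or any enclosing context it wants to show.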

Of course if the .py and .pyc files don't match, this might produce
gibberish. We already have that problem with showing source lines, but it
might be even more confusing if we get some random unrelated AST node. This
could be avoided by storing some kind of hash in the code object, so that
we can validate the .py file we find hasn't changed (sha512 if we're
feeling fancy, crc32 if we want to save space, either way is probably fine).
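
A minimal sketch of that validation step, assuming the digest is computed
over the raw source bytes at compile time and kept alongside the code object.
The function names and the "fancy" flag are made up for illustration; only
hashlib.sha512 and zlib.crc32 are real APIs.

```python
import hashlib
import zlib

def source_digest(source_bytes: bytes, fancy: bool = True) -> bytes:
    """Digest of the source: sha512 if fancy, else a 4-byte crc32."""
    if fancy:
        return hashlib.sha512(source_bytes).digest()
    return zlib.crc32(source_bytes).to_bytes(4, "big")

def source_matches(source_bytes: bytes, stored: bytes, fancy: bool = True) -> bool:
    """True if the .py contents we found still match the stored digest."""
    return source_digest(source_bytes, fancy) == stored

original = b"x = i1 + i2 + s1\n"
stored = source_digest(original)          # recorded at compile time

edited = b"x = i1 + s1\n"                 # file changed after compiling
print(source_matches(original, stored))   # safe to use AST node ids
print(source_matches(edited, stored))     # fall back to line numbers only
```

If the check fails, the formatter can skip the AST lookup and fall back to
the plain line-number display.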

This would make traceback printing more expensive, but only if you want the
fancy features, and traceback printing is already expensive (it does file
I/O!). Usually by the time you're rendering a traceback it's more important
to optimize for human time than CPU time. It would take less memory than
PEP 657, and the same as Mark's proposal (both only track one extra integer
per bytecode offset). And it would allow for arbitrarily rich traceback
display.

(I guess in theory you could make this even cheaper by using it to replace
lnotab, instead of extending it. But I think keeping lnotab around is a
good idea, as a fallback for cases where you can't find the original source
but still want some hint at location information.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BUXFOSAEBXLIHH432PKBCXOGXUAHQIVP/
Code of Conduct: http://python.org/psf/codeofconduct/
