On 8/10/2019 7:46 PM, Glenn Linderman wrote:
Because of the "invalid escape sequence" and "raw string" discussion,
when looking at the documentation, I also noticed the following
description for f-strings:
Escape sequences are decoded like in ordinary string literals (except
when a literal is also marked as a raw string). After decoding, the
grammar for the contents of the string is:
followed by lots of stuff, followed by
Backslashes are not allowed in format expressions and will raise an
error:
f"newline: {ord('\n')}" # raises SyntaxError
What I don't understand is how, if f-strings are processed AS
DESCRIBED, the \n is ever seen by the format expression.
If I recall correctly, the mentioned decoding is happening on the string
literal parts of the f-strings (above, the "newline: " part), not the
expression parts (inside the {}). But it's been a while and I don't
recall all of the details.
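
A small sketch of that distinction. The workaround shown (binding the escape to a name first) is not part of the original message, and `newline`, `source` and `backslash_accepted` are hypothetical names used only for illustration. The compile() check merely records whether the running interpreter accepts the backslash form: interpreters contemporary with this thread reject it, while Python 3.12 and later (PEP 701) accept it.

```python
import sys

# Escapes in the literal part of an f-string are decoded as usual.
literal_part = f"tab:\t{1 + 1}"

# Workaround (an assumption, not from the thread): keep the backslash
# out of the {} expression by binding the escape to a name first.
newline = "\n"
via_name = f"newline: {ord(newline)}"

# The form from the documentation, kept as source text so this script
# itself compiles on every Python 3 version.
source = r'''f"newline: {ord('\n')}"'''
try:
    compile(source, "<demo>", "eval")
    backslash_accepted = True
except SyntaxError:
    backslash_accepted = False

print(repr(literal_part))
print(via_name)
print("backslash in expression accepted:", backslash_accepted)
```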
The description is that they are first decoded like ordinary strings,
and then parsed for the internal grammar containing {} expressions to
be expanded. If that were true, the \n in the above example would
already be a newline character, and the parsing of the format
expression would not see the backslash. And if it were true, that
would actually be far more useful for this situation.
So given that it is not true, why not? And why go to the extra work of
prohibiting \ in the format expressions?
It's a future-proofing thing. See the discussion at
https://mail.python.org/archives/list/python-dev@python.org/thread/EVXD72IYUN2APF2443OMADKA5WJTOKHD/
It has pointers to other parts of the discussion.
At some point, I'm planning on switching the parsing of f-strings from
the custom parser (see Python/ast.c, FstringParser_ConcatFstring()) to
having the python parser itself parse the f-strings. This will be
similar to PEP 536, which doesn't have much detail, but does describe
some of the motivations.
PEP 498, of course, has an apparently more accurate description:
the {} parsing actually happens before the escape processing.
Perhaps this avoids making multiple passes over the string to do the
work, as the literal pieces and format expression pieces have to be
separate in the generated code, but that is just my speculation: I'd
like to know the real reason.
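
One way to observe that ordering from Python code, as a rough sketch rather than a description of the parser internals: a brace written as an escape sequence (\x7b is '{') does not start a format expression, which is what one would expect if the braces are located before the escapes are decoded. The variable names are hypothetical.

```python
# \x7b decodes to '{', but it does not open a format expression:
# only a literal '{' in the source does. The trailing '}}' is the
# usual doubled-brace spelling of a literal '}'.
escaped_brace = f"\x7b1+1}}"

# A literal brace in the source does open an expression.
real_brace = f"{1+1}"

print(escaped_brace)
print(real_brace)
```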
Should the documentation be fixed to make the description more
accurate? If so, I'd be glad to open an issue.
Sure. I'm always in favor of accuracy. The f-string documentation was a
last-minute rush job that could have used a lot more editing, and more
eyes are always welcome.
But it will take a fair amount of research to understand it well enough
to document it in more detail.
The PEP further contains the inaccurate statement:
Like all raw strings in Python, no escape processing is done for raw
f-strings:
not mentioning the actual escape processing that is done for raw
strings, regarding \" and \'.
It should probably just say it uses the same rules as raw strings.
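
To illustrate what "the same rules as raw strings" means in practice, here is a minimal sketch (names such as `quoted` and `raw_f` are hypothetical). In a raw string a backslash still keeps the following quote from terminating the literal, but the backslash itself stays in the result; raw f-strings leave \n undecoded in the same way.

```python
# In a raw string, \" does not end the literal, and the backslash is kept.
quoted = r"\""

# A raw f-string keeps \n as two characters; a plain f-string decodes it.
raw_f = rf"\n{1 + 1}"
cooked_f = f"\n{1 + 1}"

# Because of the \" rule, r"\" on its own is an unterminated literal.
try:
    compile('r"\\"', "<demo>", "eval")
    lone_backslash_ok = True
except SyntaxError:
    lone_backslash_ok = False

print(len(quoted), len(raw_f), len(cooked_f), lone_backslash_ok)
```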
Eric