[issue22221] ast.literal_eval confused by coding declarations

2014-08-17 Thread Jorgen Schäfer

New submission from Jorgen Schäfer:

The ast module seems to get confused by certain strings that contain coding 
declarations.

>>> import ast

>>> s = u'"""\\\n# -*- coding: utf-8 -*-\n"""'
>>> print s
"""\
# -*- coding: utf-8 -*-
"""
>>> ast.literal_eval(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/forcer/Programs/Python/python2.7/lib/python2.7/ast.py", line 49, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/home/forcer/Programs/Python/python2.7/lib/python2.7/ast.py", line 37, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 0
SyntaxError: encoding declaration in Unicode string
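
One plausible reading of the failure, assuming the 2.7 tokenizer follows
PEP 263 and inspects only the first two physical lines of the source without
checking whether they sit inside a string literal: the second line of s
matches the declaration pattern. The sketch below mimics that check with the
regular expression given in PEP 263. find_coding_line is a hypothetical
helper for illustration, not part of the ast module.

```python
import re

# Pattern given in PEP 263 for recognising an encoding declaration.
CODING_RE = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')


def find_coding_line(source):
    """Return (line_number, encoding) if one of the first two physical
    lines looks like a coding declaration, else None.

    Hypothetical helper: it deliberately ignores whether the line sits
    inside a string literal, which is the behaviour assumed here.
    """
    for lineno, line in enumerate(source.split('\n')[:2], start=1):
        match = CODING_RE.match(line)
        if match:
            return lineno, match.group(1)
    return None


s = u'"""\\\n# -*- coding: utf-8 -*-\n"""'
print(find_coding_line(s))             # declaration-like text on line 2
print(find_coding_line(u'\n\n' + s))   # pushed past line 2, nothing found
```

If that assumption holds, it would also be consistent with the error
disappearing once the literal no longer starts on the first line.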

--
components: Library (Lib)
messages: 225464
nosy: jorgenschaefer
priority: normal
severity: normal
status: open
title: ast.literal_eval confused by coding declarations
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue1>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22221] ast.literal_eval confused by coding declarations

2014-08-22 Thread Jorgen Schäfer

Jorgen Schäfer added the comment:

I do not understand how your comments apply to this bug. There is no
comment anywhere in the input. There is a single string literal whose
contents look like a comment. The expression parses without a syntax error
if you add a few newlines in front. Could you clarify your objection?
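
The point about leading newlines can be checked with a sketch like the
following. try_literal_eval is a hypothetical helper written for this
comparison. On an affected 2.7 build the first call is expected to report the
SyntaxError from the original report, while on an interpreter without the
problem both calls should return the contents of the string.

```python
import ast


def try_literal_eval(source):
    """Evaluate source with ast.literal_eval and return either the value
    or the SyntaxError message (hypothetical helper for this comparison)."""
    try:
        return ast.literal_eval(source)
    except SyntaxError as exc:
        return 'SyntaxError: %s' % exc


s = u'"""\\\n# -*- coding: utf-8 -*-\n"""'
padded = u'\n\n' + s   # same literal, now starting on line 3

print(repr(try_literal_eval(s)))
print(repr(try_literal_eval(padded)))
```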

On Aug 22, 2014 9:59 PM, "Terry J. Reedy" wrote:

>
> Terry J. Reedy added the comment:
>
> This issue is about the SyntaxError message for eval functions, not the
> ast module per se. My first response is that the reported message is not a
> bug and that this issue should be closed as 'not a bug'.
>
> (General reason) Trying to eval an expression preceded by a comment on its
> own line or followed by a comment works.
>
> >>> eval("#before\n'string'#after")
> 'string'
>
> Trying to eval a bare comment *is* a syntax error.
>
> >>> eval("#comment\n")
> ...
> SyntaxError: unexpected EOF while parsing
>
> So the issue as presented is the special-case message.  However, messages
> are not part of the language specification and improving them is
> often/usually/always? treated as an enhancement.  Changing them will break
> code and tests that depend on the exact wording. 2.7 does not get
> enhancements.
>
> (Specific reason) In 2.x, the input to (literal-)eval is either latin-1
> encoded bytes or unicode. 'Latin-1' input could potentially consist of an
> encoding declaration on one line followed on the next line by a literal
> string encoded as indicated.
>
> >>> le("# -*- coding: utf-8 -*-\n'string'")
> 'string'
>
> Unicode input, the subject of this issue, is encoded to latin-1, which
> means that any literal string in the expression has to be latin-1 encoded.
> Therefore, a latin-1 encoding declaration is redundant and anything else is
> either redundant (if the original unicode only contains characters that
> encode the same in latin-1, as in the example above) or wrong, with hard to
> predict behavior.  Someone thought it worthwhile to add the special case
> check.  I think it should be left as is.
>
> Jorgen, please either close this or explain why you think not, in light of
> the above.
>
> --
> nosy: +terry.reedy