[issue38551] lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes

2019-10-21 Thread Peter Ludemann


New submission from Peter Ludemann:

As far as I can tell, the lib2to3/Grammar.txt file in the Python 3.8 release is 
the same as that of the Python 3.7 release, which means it doesn't have the 
"walrus" operator (":=") or the "/" positional-only parameter syntax.
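
A small illustration of the two Python 3.8 constructs in question, checked here with the standard ast module rather than lib2to3. The function name `clamp` and the sample source are made up for this sketch, and it assumes Python 3.8 or later:

```python
import ast

# Hypothetical sample using both 3.8 additions:
# "/" marks positional-only parameters, ":=" is the walrus operator.
SOURCE = '''
def clamp(value, /, low=0, high=10):
    if (n := min(max(value, low), high)) != value:
        return n
    return value
'''

tree = ast.parse(SOURCE)
func = tree.body[0]

# Parameters declared before "/" are stored separately on the arguments node.
posonly = [a.arg for a in func.args.posonlyargs]

# Each ":=" shows up as an ast.NamedExpr node.
walrus_targets = [
    node.target.id for node in ast.walk(tree) if isinstance(node, ast.NamedExpr)
]

print("positional-only:", posonly)
print("walrus targets:", walrus_targets)
```

A grammar file that predates 3.8 has no rule for either construct, so a parser driven by it would be expected to reject this source.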

--
components: 2to3 (2.x to 3.x conversion tool)
messages: 355092
nosy: Peter Ludemann
priority: normal
severity: normal
status: open
title: lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes
type: behavior
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue38551>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36541] Make lib2to3 grammar more closely match Python

2019-10-21 Thread Peter Ludemann


Peter Ludemann added the comment:

Re: breakage due to changes in structure 
(https://bugs.python.org/issue36541#msg339669) ... this has already happened in 
the past (e.g., type annotations and async). 

It's probably a good idea to add some documentation that structure changes can 
be expected with each release of Python.

--
nosy: +Peter Ludemann

___
Python tracker 
<https://bugs.python.org/issue36541>
___



[issue38551] lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes

2019-10-21 Thread Peter Ludemann


Peter Ludemann added the comment:

Should I just close this? (I didn't find https://bugs.python.org/issue36541 
when I searched, possibly because I used "2to3" instead of "lib2to3" in my 
search.)

--

___
Python tracker 
<https://bugs.python.org/issue38551>
___



[issue36541] Make lib2to3 grammar more closely match Python

2019-10-23 Thread Peter Ludemann


Peter Ludemann added the comment:

Also the Grammar.txt diffs look about the same size as I've seen with other 
upgrades to lib2to3 when the Python grammar changed.

--

___
Python tracker 
<https://bugs.python.org/issue36541>
___



[issue38551] lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes

2019-10-23 Thread Peter Ludemann


Peter Ludemann added the comment:

issue36541 and its proposed PR seem to cover my needs.

--
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue38551>
___



[issue39154] "utf8-sig" missing from codecs (inconsistency)

2019-12-29 Thread Peter Ludemann


New submission from Peter Ludemann:

In general, 'utf8' and 'utf-8' are interchangeable in the codecs (and in many 
parts of the Python library). However, 'utf8-sig' is missing ... and it happens 
to also be generated by lib2to3.tokenize.detect_encoding.

>>> import codecs
>>> codecs.getincrementaldecoder('utf-8-sig')()
<encodings.utf_8_sig.IncrementalDecoder object at 0x...>
>>> codecs.getincrementaldecoder('utf8-sig')()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/codecs.py", line 987, in getincrementaldecoder
    decoder = lookup(encoding).incrementaldecoder
LookupError: unknown encoding: utf8-sig
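
The same check can be written as a script that does not abort on the unknown spelling. This is only a sketch: `try_decoder` is a hypothetical helper, and whether 'utf8-sig' resolves may depend on the Python version, so the loop reports the result instead of assuming it.

```python
import codecs

def try_decoder(name):
    """Return an incremental decoder for `name`, or None if codecs rejects the name."""
    try:
        return codecs.getincrementaldecoder(name)()
    except LookupError:
        return None

# The "-sig" codec consumes a leading UTF-8 byte order mark while decoding.
data = codecs.BOM_UTF8 + "x = 1\n".encode("utf-8")
decoder = try_decoder("utf-8-sig")
text = decoder.decode(data, final=True)
print(repr(text))

# Report which spellings the running interpreter accepts.
for spelling in ("utf-8-sig", "utf8-sig"):
    status = "known" if try_decoder(spelling) is not None else "unknown"
    print(spelling, status)
```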

--
components: Unicode
messages: 358994
nosy: Peter Ludemann, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: "utf8-sig" missing from codecs (inconsistency)
type: behavior
versions: Python 3.6, Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue39154>
___



[issue39154] "utf8" not always a synonym for "utf-8" in lib2to3

2019-12-29 Thread Peter Ludemann


Peter Ludemann added the comment:

lib2to3.tokenize should allow 'utf8' and 'utf-8' interchangeably, to be 
consistent with the rest of the Python library. (I looked through the library 
source: there seems to be no consistent preference, and many, but not all, 
checks for 'utf-8' also check for 'utf8'.) In particular, 
tokenize.detect_encoding should have code for both forms, as the encoding can 
be set by the user. Also, code should allow for 'UTF8' and 'UTF-8'.

See also https://bugs.python.org/issue39154

(This is probably a larger issue than just lib2to3, as a quick grep through 
/usr/lib/python3.7 showed; but I'm not sure how best to address that.)
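
One way to avoid comparing against each spelling separately is to compare canonical codec names instead of raw strings. A minimal sketch, assuming the codec registry is an acceptable source of truth; `canonical_encoding` and `same_encoding` are hypothetical helpers, not existing library functions:

```python
import codecs

def canonical_encoding(name):
    """Return the codec registry's canonical name for an encoding spelling."""
    return codecs.lookup(name).name

def same_encoding(a, b):
    """True if both spellings resolve to the same codec; False if either is unknown."""
    try:
        return canonical_encoding(a) == canonical_encoding(b)
    except LookupError:
        return False

for spelling in ("utf-8", "utf8", "UTF-8", "UTF8", "utf_8"):
    print(spelling, same_encoding(spelling, "utf-8"))
```

A check written as `same_encoding(enc, "utf-8")` treats all of these spellings alike, where `enc == "utf-8"` would not.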

--
components: +2to3 (2.x to 3.x conversion tool) -Unicode
title: "utf8-sig" missing from codecs (inconsistency) -> "utf8" not always a 
synonym for "utf-8" in lib2to3

___
Python tracker 
<https://bugs.python.org/issue39154>
___



[issue39155] "utf8-sig" missing from codecs (inconsistency)

2019-12-29 Thread Peter Ludemann


New submission from Peter Ludemann:

In general, 'utf8' and 'utf-8' are interchangeable in the codecs (and in many 
parts of the Python library). However, 'utf8-sig' is missing ... and it happens 
to also be generated by lib2to3.tokenize.detect_encoding.

>>> import codecs
>>> codecs.getincrementaldecoder('utf-8-sig')()
<encodings.utf_8_sig.IncrementalDecoder object at 0x...>
>>> codecs.getincrementaldecoder('utf8-sig')()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/codecs.py", line 987, in getincrementaldecoder
    decoder = lookup(encoding).incrementaldecoder
LookupError: unknown encoding: utf8-sig

--
components: Unicode
messages: 358996
nosy: Peter Ludemann, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: "utf8-sig" missing from codecs (inconsistency)
type: behavior
versions: Python 3.6, Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue39155>
___



[issue39154] "utf8" not always a synonym for "utf-8" in lib2to3

2019-12-29 Thread Peter Ludemann


Peter Ludemann added the comment:

(oops -- updated this bug instead of submitting a new one)
See also https://bugs.python.org/issue39155

--

___
Python tracker 
<https://bugs.python.org/issue39154>
___



[issue39154] "utf8" not always a synonym for "utf-8" in lib2to3

2019-12-29 Thread Peter Ludemann


Peter Ludemann added the comment:

To clarify and fix a typo ... lib2to3.pgen2.tokenize.detect_encoding checks for 
'utf-8' (and 'utf_8') but not 'utf8' in various places. Similarly for 'latin-1' 
and 'latin1'. (The codecs documentation page allows 'utf8' and 'latin1' as 
codecs.)

['UTF-8' is taken care of in _get_normal_name] 
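
For comparison, the sketch below runs the standard library's tokenize.detect_encoding (not lib2to3.pgen2.tokenize, which may return different strings) over a few spellings of the coding cookie. The helper `detected` and the sample byte strings are made up for illustration.

```python
import io
import tokenize

def detected(source_bytes):
    """Run the stdlib tokenize.detect_encoding on an in-memory byte string."""
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

samples = {
    "BOM, no cookie": b"\xef\xbb\xbfx = 1\n",
    "cookie utf-8": b"# -*- coding: utf-8 -*-\nx = 1\n",
    "cookie UTF-8": b"# -*- coding: UTF-8 -*-\nx = 1\n",
    "cookie utf8": b"# -*- coding: utf8 -*-\nx = 1\n",
    "cookie latin-1": b"# -*- coding: latin-1 -*-\nx = 1\n",
    "cookie latin1": b"# -*- coding: latin1 -*-\nx = 1\n",
}

# Print what the running interpreter reports for each spelling.
for label, data in samples.items():
    print(f"{label:16} -> {detected(data)}")
```

Comparing the rows for the hyphenated and unhyphenated cookies shows whether the two spellings are normalized to the same name.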

See also https://bugs.python.org/issue39155

--

___
Python tracker 
<https://bugs.python.org/issue39154>
___



[issue40360] Deprecate lib2to3 (and 2to3) for future removal

2020-04-27 Thread Peter Ludemann


Change by Peter Ludemann:


--
nosy: +Peter Ludemann

___
Python tracker 
<https://bugs.python.org/issue40360>
___



[issue40360] Deprecate lib2to3 (and 2to3) for future removal

2020-04-29 Thread Peter Ludemann


Peter Ludemann added the comment:

The documentation change gives two possible successors:

https://libcst.readthedocs.io/ (https://github.com/Instagram/LibCST)
https://parso.readthedocs.io/

And I've also seen this mentioned: https://github.com/pyga/awpa

Is it possible to settle on one of these as the successor to the lib2to3 
parser? It would be nice to avoid a 2nd deprecation in the future ...

--

___
Python tracker 
<https://bugs.python.org/issue40360>
___



[issue36541] Make lib2to3 grammar better match Python, support the := walrus

2020-12-04 Thread Peter Ludemann


Peter Ludemann added the comment:

I made a suggestion for augmenting ast.parse with some of lib2to3's features; 
but nobody seemed interested. 

RIP lib2to3. Like many pieces of software, it was used for far more than it 
was originally intended for.

https://mail.python.org/archives/list/python-id...@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/

--

___
Python tracker 
<https://bugs.python.org/issue36541>
___



[issue36541] Make lib2to3 grammar better match Python, support the := walrus

2020-12-06 Thread Peter Ludemann


Peter Ludemann added the comment:

Every piece of code that uses either lib2to3 or a parser derived from it 
(including parso and LibCST) will eventually not be able to upgrade the parser 
because PEG can handle grammars that LL(k) can't. That's why I proposed adding 
some functionality to ast.parse, to make the whitespace and token information 
easily available - this seems to be what @BTaskaya says is "easy" (maybe they 
mean it's easy using LibCST? It seems to be fiddly using ast.parse). The 
alternative is that all these projects (black, LibCST, yapf, etc.) will have to 
roll their own solutions, which doesn't seem a very productive use of people's 
time and makes version upgrades slow.

If people are interested in using ast.parse extensions as a replacement for 
lib2to3, I suggest discussing at 
https://mail.python.org/archives/list/python-id...@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/

--

___
Python tracker 
<https://bugs.python.org/issue36541>
___



[issue40360] Deprecate lib2to3 (and 2to3) for future removal

2020-07-06 Thread Peter Ludemann


Peter Ludemann added the comment:

Looking at the suggested successor tools (redbaron, libCST, parso, awpa) ... 
all of them appear to use some variant of pgen2. But at some point Python will 
be using a PEG approach (PEP 617), and therefore the pgen2 approach apparently 
won't work.

For a number of projects, it's important to have a parse tree that contains all 
the "whitespace" information (indent, dedent, comment, newline, etc.). As far as 
I can tell, the new PEG parser won't provide that, and it seems that none of 
the successor tools will be able to handle future versions of Python syntax.

So, three questions:
1. Am I right that all proposed replacements (redbaron, libCST, parso, awpa) 
use some variation of LL(1) and therefore will have trouble in the future?
2. Are there any plans (either part of the core development or as a project) 
for one of these replacements that is PEG-based? (Or a new project?)
3. Is Lib/ast.py going to continue being supported? (I infer that it will, with 
the change from LL(1) to PEG being mostly transparent - 
https://mail.python.org/archives/list/python-...@python.org/thread/HOZ2RI3FXUEMAT4XAX4UHFN4PKG5J5GR/#4D3B2NM2JMV2UKIT6EV5Q2A6XK2HXDEH
 )

If Lib/ast.py continues to be supported, I think I can see a way of providing 
functionality similar to lib2to3 (in terms of an AST-ish thing with 
"whitespace" from the source, sufficient for tools such as yapf, black, 
pykythe, pytype, mypy, etc.) as a kind of wrapper to ast.py. 
I suppose I should discuss this idea on python-dev? Is there an ongoing 
discussion? (I couldn't find any, but I might have been using the wrong search 
terms.)
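
A rough sketch of what such a wrapper could start from, using only the standard ast and tokenize modules. This is an illustration of the general idea, not the proposal's design; the sample source and variable names are made up, and it assumes Python 3.8 or later for ast.get_source_segment.

```python
import ast
import io
import tokenize

SOURCE = "x = 1  # set x\n\ndef f(a):\n    # body comment\n    return a + x\n"

tree = ast.parse(SOURCE)

# The AST has no comment nodes, so collect comments from the token stream,
# keeping their (line, column) start positions.
comments = [
    (tok.start, tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(SOURCE).readline)
    if tok.type == tokenize.COMMENT
]

# AST nodes carry start and end positions, so the exact source text of a
# node can be recovered and lined up against the token positions.
return_stmt = tree.body[1].body[0]
segment = ast.get_source_segment(SOURCE, return_stmt)

print(comments)
print(repr(segment))
```

Pairing each comment with the nearest node by position is the fiddly part that a shared wrapper would spare each tool from reimplementing.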

--

___
Python tracker 
<https://bugs.python.org/issue40360>
___



[issue40360] Deprecate lib2to3 (and 2to3) for future removal

2020-07-08 Thread Peter Ludemann


Peter Ludemann added the comment:

I've written up a proposal for adding "whitespace" handling to the ast module:
https://mail.python.org/archives/list/python-id...@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/

I don't think it's a "summer-of-code-sized project", mainly because I already 
have various bits of code that handle the fiddly byte/str offset conversions.
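
To show why the offsets are fiddly: the ast module reports col_offset as a UTF-8 byte offset, while Python strings are indexed by code point. The sketch below is an assumed, minimal version of such a conversion (`byte_to_char_col` is a hypothetical helper, not the code referred to above).

```python
import ast

def byte_to_char_col(line, byte_col):
    """Convert a UTF-8 byte offset within `line` to a str (code point) offset."""
    return len(line.encode("utf-8")[:byte_col].decode("utf-8"))

# "é" occupies two bytes in UTF-8, so byte and str offsets diverge after it.
SOURCE = 's = "é"; t = 1\n'
tree = ast.parse(SOURCE)
second = tree.body[1]  # the statement "t = 1"

byte_col = second.col_offset
char_col = byte_to_char_col(SOURCE, byte_col)

print("byte offset:", byte_col)
print("str offset:", char_col)
print("text at str offset:", SOURCE[char_col:char_col + 5])
```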

--

___
Python tracker 
<https://bugs.python.org/issue40360>
___



[issue40360] Deprecate lib2to3 (and 2to3) for future removal

2020-07-08 Thread Peter Ludemann


Peter Ludemann added the comment:

Yes, I'm thinking of doing this as a wrapper, in such a way that it could be 
incorporated into Lib/ast.py eventually. (Also, any lib2to3-ish capabilities 
would probably not be suitable for inclusion in the stdlib, at least not 
initially ... but I have no plans to work on something to replace lib2to3's 
fixers.)

--

___
Python tracker 
<https://bugs.python.org/issue40360>
___