[issue38551] lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes
New submission from Peter Ludemann: As far as I can tell, the lib2to3/Grammar.txt file in the Python 3.8 release is the same as that of the Python 3.7 release, so it lacks the "walrus" operator (:=) and the "/" positional-only parameter syntax. -- components: 2to3 (2.x to 3.x conversion tool) messages: 355092 nosy: Peter Ludemann priority: normal severity: normal status: open title: lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes type: behavior versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue38551> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
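For reference, the two Python 3.8 constructs named in this report look roughly like the following. This is a minimal sketch, assuming Python 3.8 or later; the function name `clamp` and the sample values are made up for illustration, and `ast.parse` is used only to show that the built-in parser accepts the syntax that the lib2to3 grammar was reported to lack.

```python
import ast

# Sample source using both Python 3.8 grammar additions.
SOURCE = '''
def clamp(value, lo, hi, /):
    # Parameters before "/" are positional-only (PEP 570).
    return max(lo, min(value, hi))

data = [1, 2, 3, 4]
# ":=" binds n as part of the condition (PEP 572).
if (n := len(data)) > 3:
    result = clamp(n, 0, 3)
'''

tree = ast.parse(SOURCE)

# The walrus operator shows up as an ast.NamedExpr node.
has_walrus = any(isinstance(node, ast.NamedExpr) for node in ast.walk(tree))

# Positional-only parameters are listed separately in the arguments node.
func = tree.body[0]
positional_only = [arg.arg for arg in func.args.posonlyargs]

# Run the sample to confirm it is valid, executable code.
namespace = {}
exec(compile(tree, "<example>", "exec"), namespace)
```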
[issue36541] Make lib2to3 grammar more closely match Python
Peter Ludemann added the comment: Re: breakage due to changes in structure (https://bugs.python.org/issue36541#msg339669) ... this has already happened in the past (e.g., type annotations and async). It's probably a good idea to add some documentation that structure changes can be expected with each release of Python. -- nosy: +Peter Ludemann ___ Python tracker <https://bugs.python.org/issue36541> ___
[issue38551] lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes
Peter Ludemann added the comment: Should I just close this? (I didn't find https://bugs.python.org/issue36541 when I searched, possibly because I used "2to3" instead of "lib2to3" in my search.) -- ___ Python tracker <https://bugs.python.org/issue38551> ___
[issue36541] Make lib2to3 grammar more closely match Python
Peter Ludemann added the comment: Also, the Grammar.txt diffs look about the same size as those I've seen with other upgrades to lib2to3 when the Python grammar changed. -- ___ Python tracker <https://bugs.python.org/issue36541> ___
[issue38551] lib2to3 Grammar.txt doesn't have Python 3.8 grammar changes
Peter Ludemann added the comment: issue36541 and its proposed PR seem to cover my needs. -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue38551> ___
[issue39154] "utf8-sig" missing from codecs (inconsistency)
New submission from Peter Ludemann: In general, 'utf8' and 'utf-8' are interchangeable in the codecs (and in many parts of the Python library). However, 'utf8-sig' is missing ... and it happens to also be generated by lib2to3.tokenize.detect_encoding.

>>> import codecs
>>> codecs.getincrementaldecoder('utf-8-sig')()
>>> codecs.getincrementaldecoder('utf8-sig')()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/codecs.py", line 987, in getincrementaldecoder
    decoder = lookup(encoding).incrementaldecoder
LookupError: unknown encoding: utf8-sig

-- components: Unicode messages: 358994 nosy: Peter Ludemann, ezio.melotti, vstinner priority: normal severity: normal status: open title: "utf8-sig" missing from codecs (inconsistency) type: behavior versions: Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue39154> ___
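One way a caller might work around the missing alias in the meantime is to map the unsupported spelling to the supported one before calling `codecs.lookup`. This is only a sketch: `EXTRA_ALIASES` and `lookup_tolerant` are hypothetical names introduced here and are not part of the `codecs` module.

```python
import codecs

# Hypothetical alias table for illustration; NOT part of the codecs module.
EXTRA_ALIASES = {
    "utf8-sig": "utf-8-sig",
    "utf8_sig": "utf-8-sig",
}


def lookup_tolerant(name):
    """Look up a codec, first mapping the extra spellings listed above."""
    key = name.lower()
    return codecs.lookup(EXTRA_ALIASES.get(key, key))


# The spelling reported as failing now resolves to the BOM-aware codec.
info = lookup_tolerant("utf8-sig")
decoder = info.incrementaldecoder()
text = decoder.decode(b"\xef\xbb\xbfhi", final=True)
```

Names that `codecs.lookup` already accepts, such as 'utf8', pass through unchanged.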
[issue39154] "utf8" not always a synonym for "utf-8" in lib2to3
Peter Ludemann added the comment: lib2to3.tokenize should allow 'utf8' and 'utf-8' interchangeably, to be consistent with the rest of the Python library. (I looked through the library source: there seems to be no consistent preference, and many, but not all, checks for 'utf-8' also check for 'utf8'.) In particular, tokenize.detect_encoding should have code for both forms, as the encoding can be set by the user. Also, code should allow for 'UTF8' and 'UTF-8'. See also https://bugs.python.org/issue39154 (This is probably a larger issue than just lib2to3, as a quick grep through /usr/lib/python3.7 showed; but I'm not sure how best to address that.) -- components: +2to3 (2.x to 3.x conversion tool) -Unicode title: "utf8-sig" missing from codecs (inconsistency) -> "utf8" not always a synonym for "utf-8" in lib2to3 ___ Python tracker <https://bugs.python.org/issue39154> ___
[issue39155] "utf8-sig" missing from codecs (inconsistency)
New submission from Peter Ludemann: In general, 'utf8' and 'utf-8' are interchangeable in the codecs (and in many parts of the Python library). However, 'utf8-sig' is missing ... and it happens to also be generated by lib2to3.tokenize.detect_encoding.

>>> import codecs
>>> codecs.getincrementaldecoder('utf-8-sig')()
>>> codecs.getincrementaldecoder('utf8-sig')()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/codecs.py", line 987, in getincrementaldecoder
    decoder = lookup(encoding).incrementaldecoder
LookupError: unknown encoding: utf8-sig

-- components: Unicode messages: 358996 nosy: Peter Ludemann, ezio.melotti, vstinner priority: normal severity: normal status: open title: "utf8-sig" missing from codecs (inconsistency) type: behavior versions: Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue39155> ___
[issue39154] "utf8" not always a synonym for "utf-8" in lib2to3
Peter Ludemann added the comment: (oops -- updated this bug instead of submitting a new one) See also https://bugs.python.org/issue39155 -- ___ Python tracker <https://bugs.python.org/issue39154> ___
[issue39154] "utf8" not always a synonym for "utf-8" in lib2to3
Peter Ludemann added the comment: To clarify and fix a typo ... lib2to3.pgen2.tokenize.detect_encoding checks for 'utf-8' (and 'utf_8') but not 'utf8' in various places. Similarly for 'latin-1' and 'latin1'. (The codecs documentation page allows 'utf8' and 'latin1' as codecs.) ['UTF-8' is taken care of in _get_normal_name] See also https://bugs.python.org/issue39155 -- ___ Python tracker <https://bugs.python.org/issue39154> ___
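To make the request concrete, here is a rough sketch of a more lenient name normalization that treats 'utf8' like 'utf-8' and 'latin1' like 'latin-1'. The function `normalize_encoding_name` is hypothetical: it is loosely modeled on the idea of `_get_normal_name` mentioned above and is not the code in lib2to3 or in the stdlib tokenize module.

```python
def normalize_encoding_name(orig_enc):
    """Hypothetical, lenient normalization of a declared source encoding.

    Lowercases the name and treats "_" like "-", then maps the common
    UTF-8 and Latin-1 spellings (with or without a hyphen) to one
    canonical name. Anything else is returned unchanged.
    """
    # Only the first 12 characters matter for the spellings handled here.
    enc = orig_enc[:12].lower().replace("_", "-")

    if enc in ("utf8", "utf-8") or enc.startswith(("utf8-", "utf-8-")):
        return "utf-8"

    latin = ("latin1", "latin-1", "iso-8859-1", "iso-latin-1")
    if enc in latin or enc.startswith(tuple(name + "-" for name in latin)):
        return "iso-8859-1"

    return orig_enc


# Spellings with and without the hyphen normalize to the same name.
examples = {
    name: normalize_encoding_name(name)
    for name in ("utf8", "UTF-8", "utf_8", "latin1", "Latin-1", "ascii")
}
```

In this sketch a name such as 'utf-8-sig' also collapses to 'utf-8', so BOM handling would have to be tracked separately by the caller.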
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Change by Peter Ludemann: -- nosy: +Peter Ludemann ___ Python tracker <https://bugs.python.org/issue40360> ___
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Peter Ludemann added the comment: The documentation change gives two possible successors:

https://libcst.readthedocs.io/ (https://github.com/Instagram/LibCST)
https://parso.readthedocs.io/

And I've also seen this mentioned:

https://github.com/pyga/awpa

Is it possible to settle on one of these as the successor to the lib2to3 parser? It would be nice to avoid a 2nd deprecation in the future ...
-- ___ Python tracker <https://bugs.python.org/issue40360> ___
[issue36541] Make lib2to3 grammar better match Python, support the := walrus
Peter Ludemann added the comment: I made a suggestion for augmenting ast.parse with some of lib2to3's features, but nobody seemed interested. RIP lib2to3. Like many pieces of software, it was used for far more than it was originally intended for. https://mail.python.org/archives/list/python-id...@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/ -- ___ Python tracker <https://bugs.python.org/issue36541> ___
[issue36541] Make lib2to3 grammar better match Python, support the := walrus
Peter Ludemann added the comment: Every piece of code that uses either lib2to3 or a parser derived from it (including parso and LibCST) will eventually not be able to upgrade the parser, because PEG can handle grammars that LL(k) can't. That's why I proposed adding some functionality to ast.parse, to make the whitespace and token information easily available. This seems to be what @BTaskaya says is "easy" (maybe they mean it's easy using LibCST? It seems to be fiddly using ast.parse). The alternative is that all these projects (black, LibCST, yapf, etc.) will have to roll their own solutions, which doesn't seem like a very productive use of people's time and makes version upgrades slow. If people are interested in using ast.parse extensions as a replacement for lib2to3, I suggest discussing at https://mail.python.org/archives/list/python-id...@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/ -- ___ Python tracker <https://bugs.python.org/issue36541> ___
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Peter Ludemann added the comment: Looking at the suggested successor tools (redbaron, libCST, parso, awpa) ... all of them appear to use some variant of pgen2. But at some point Python will be using a PEG approach (PEP 617), and therefore the pgen2 approach apparently won't work. For a number of projects, it's important to have a parse tree that contains all the "whitespace" information (indent, dedent, comment, newline, etc.). As far as I can tell, the new PEG parser won't provide that, and it seems that none of the successor tools will be able to handle future versions of Python syntax. So, three questions:

1. Am I right that all proposed replacements (redbaron, libCST, parso, awpa) use some variation of LL(1) and therefore will have trouble in the future?
2. Are there any plans (either part of the core development or as a project) for one of these replacements that is PEG-based? (Or a new project?)
3. Is Lib/ast.py going to continue being supported? (I infer that it will, with the change from LL(1) to PEG being mostly transparent - https://mail.python.org/archives/list/python-...@python.org/thread/HOZ2RI3FXUEMAT4XAX4UHFN4PKG5J5GR/#4D3B2NM2JMV2UKIT6EV5Q2A6XK2HXDEH )

If Lib/ast.py continues to be supported, I think I can see a way of providing functionality similar to lib2to3 (in terms of an AST-ish thing with "whitespace" from the source, sufficient for tools such as yapf, black, pykythe, pytype, mypy, etc.) as a kind of wrapper to ast.py. I suppose I should discuss this idea on python-dev? Is there an ongoing discussion? (I couldn't find any, but I might have been using the wrong search terms.)
-- ___ Python tracker <https://bugs.python.org/issue40360> ___
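As a rough illustration of the "wrapper around ast.py" idea, the stdlib already allows comments to be recovered separately with tokenize and matched to AST nodes by line number. This is a minimal sketch under that assumption, not the proposal itself; `statements_with_comments` and the sample source are hypothetical.

```python
import ast
import io
import tokenize


def statements_with_comments(source):
    """Pair each top-level statement with a comment on its first line.

    Returns (pairs, comments): pairs is a list of
    (line number, node type name, comment or None), and comments maps
    line numbers to the comment text found by the tokenizer.
    """
    # The AST drops comments, so collect them from the token stream.
    comments = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments[tok.start[0]] = tok.string

    tree = ast.parse(source)
    pairs = [
        (node.lineno, type(node).__name__, comments.get(node.lineno))
        for node in tree.body
    ]
    return pairs, comments


SOURCE = "x = 1  # first\n# standalone\ny = 2\n"
pairs, comments = statements_with_comments(SOURCE)
```

The standalone comment on line 2 appears only in `comments`, since no AST node starts there; attaching such comments to the right node is the kind of detail a real wrapper would need to settle.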
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Peter Ludemann added the comment: I've written up a proposal for adding "whitespace" handling to the ast module: https://mail.python.org/archives/list/python-id...@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/ I don't think it's a "summer-of-code-sized project", mainly because I already have various bits of code that handle the fiddly byte/str offset conversions. -- ___ Python tracker <https://bugs.python.org/issue40360> ___
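The byte/str offset conversion is fiddly because the ast module reports `col_offset` as a UTF-8 byte offset, while Python strings are indexed by code point. A small sketch of one such conversion follows; `byte_col_to_str_col` is a hypothetical helper written for this example, not the author's code.

```python
import ast


def byte_col_to_str_col(line, byte_col):
    """Convert a UTF-8 byte offset within a line to a str index."""
    return len(line.encode("utf-8")[:byte_col].decode("utf-8"))


# The "é" takes two bytes in UTF-8, so byte and str offsets diverge after it.
SOURCE = 's = "héllo"; x = 1\n'
tree = ast.parse(SOURCE)

second_stmt = tree.body[1]           # the assignment "x = 1"
byte_col = second_stmt.col_offset    # offset counted in UTF-8 bytes
str_col = byte_col_to_str_col(SOURCE.splitlines()[0], byte_col)
```

Indexing the source line with `str_col` rather than `byte_col` lands on the `x` that starts the second statement.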
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Peter Ludemann added the comment: Yes, I'm thinking of doing this as a wrapper, in such a way that it could be incorporated into Lib/ast.py eventually. (Also, any lib2to3-ish capabilities would probably not be suitable for inclusion in the stdlib, at least not initially ... but I have no plans to work on something to replace lib2to3's fixers.) -- ___ Python tracker <https://bugs.python.org/issue40360> ___