[issue30377] Unnecessary complexity in tokenize.py around handling of comments and newlines
New submission from Albert-Jan Nijburg: While porting tokenize.py to JavaScript I stumbled upon this. The bit of code that checks whether a line is a newline or a comment checks for a comment twice. The two cases can be split up, which makes the code a bit more readable. https://github.com/python/cpython/blob/master/Lib/tokenize.py#L560 It's not broken, it's just a bit more complex than it has to be. -- components: Library (Lib) messages: 293760 nosy: Albert-Jan Nijburg, meador.inge priority: normal severity: normal status: open title: Unnecessary complexity in tokenize.py around handling of comments and newlines type: enhancement versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7 ___ Python tracker <http://bugs.python.org/issue30377> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
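For context, the behaviour under discussion can be observed from the outside. The sketch below is not the code at the linked line; it only tokenizes a comment-only line and a blank line to show which token types come out. The helper name `token_names` is made up for this illustration.

```python
import io
import token
import tokenize


def token_names(source):
    """Return the token type names produced for a piece of source text."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# A comment-only line yields a COMMENT token followed by an NL token.
comment_line = token_names("# just a comment\n")

# A blank line yields an NL token without a COMMENT.
blank_line = token_names("\n")

print(comment_line)
print(blank_line)
```

Both cases end in the same branch of the tokenizer, which is why the comment check could appear twice there.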
[issue25324] Importing tokenize modifies token
Changes by Albert-Jan Nijburg: -- pull_requests: +1699 ___ Python tracker <http://bugs.python.org/issue25324> ___
[issue30377] Unnecessary complexity in tokenize.py around handling of comments and newlines
Albert-Jan Nijburg added the comment: Oh yes you're right! I've updated the code on github. Even cleaner this way :).
[issue25324] Importing tokenize modifies token
Albert-Jan Nijburg added the comment: > I prefer to add tokenize tokens directly in Lib/token.py, and then get > COMMENT, NL and ENCODING using tok_name.index(). That would make more sense from a breaking change perspective, but we would step on the toes of anyone adding `COMMENT`, `NL`, or `ENCODING` to `token.h` because `token.py` is generated from that. It would also make much more sense to have them as fields on `token` if they are in `tok_name` in `token`. -- nosy: +Albert-Jan Nijburg
[issue25324] Importing tokenize modifies token
Albert-Jan Nijburg added the comment: lib2to3 appears to have its own token.py as well, with NL and COMMENT but without ENCODING... Lib/lib2to3/pgen2/token.py
[issue30377] Unnecessary complexity in tokenize.py around handling of comments and newlines
Albert-Jan Nijburg added the comment: I did yesterday; it should be coming through today, right?
[issue30377] Unnecessary complexity in tokenize.py around handling of comments and newlines
Albert-Jan Nijburg added the comment: Still no CLA. I checked my username on the PDF and it's correct; I hope someone looks at it soon :)
[issue25324] Importing tokenize modifies token
Albert-Jan Nijburg added the comment: I've updated the PR and added the tokenize tokens to token.h and their names to tokenizer.c. This way they'll show up when you run token.py. The names will always be in tok_name and tokenize.py will use those. This doesn't break the public API, and importing tokenize.py no longer modifies token.py.
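A quick way to check the outcome described here is to compare the constants exposed by both modules. This is a minimal sketch that assumes Python 3.7 or later, where `COMMENT`, `NL` and `ENCODING` are defined in `token`; the dictionary name `shared` is only for illustration.

```python
import token
import tokenize

# For each tokenize-specific token, collect the value seen through both modules.
shared = {
    name: (getattr(token, name), getattr(tokenize, name))
    for name in ("COMMENT", "NL", "ENCODING")
}

for name, (token_value, tokenize_value) in shared.items():
    # The same integer is visible from both modules and maps back to its name.
    print(name, token_value, tokenize_value, token.tok_name[token_value])
```

If the two values in each pair match and `tok_name` already knows the names, importing `tokenize` had nothing left to add to `token`.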
[issue30455] Generate C code from token.py and not vice versa
Albert-Jan Nijburg added the comment: I think this covers all the changes from PR #1608. It looks a lot nicer too, building it every time from the Makefile. You may want to add to the docs that token.py is now the source of the tokens. -- ___ Python tracker <http://bugs.python.org/issue30455> ___
[issue25324] Importing tokenize modifies token
Albert-Jan Nijburg added the comment: Let me know if you want me to add/change anything about my PR :) I'm happy to do so.
[issue25324] Importing tokenize modifies token
Albert-Jan Nijburg added the comment: I've updated token.rst and Misc/NEWS. Let me know if the wording is correct.
[issue25324] Importing tokenize modifies token
Albert-Jan Nijburg added the comment: Aah! Oops I can fix later today.

On Thu, 1 Jun 2017 at 18:08, STINNER Victor wrote:
> STINNER Victor added the comment:
>
> We got a bug report from Coverity:
>
> *** CID 1411801: Incorrect expression (MISSING_COMMA)
> /Parser/tokenizer.c: 111 in ()
> 105     "OP",
> 106     "AWAIT",
> 107     "ASYNC",
> 108     "",
> 109     "COMMENT",
> 110     "NL",
> >>> CID 1411801: Incorrect expression (MISSING_COMMA)
> >>> In the initialization of "_PyParser_TokenNames", a suspicious concatenated string ""ENCODING"" is produced.
> 111     "ENCODING"
> 112     ""
> 113 };
> 114
> 115
> 116 /* Create and initialize a new tok_state structure */
>
> I missed this typo :-p
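The class of mistake Coverity flagged can be reproduced in a few lines. The sketch below uses Python rather than C, on the assumption that the analogy is close enough: both languages join adjacent string literals, so a missing comma silently drops an element. The list contents are copied from the quoted report and the variable names are made up.

```python
# Missing comma after "ENCODING": the two adjacent literals are joined
# into a single string, so the list ends up one element short.
buggy_names = [
    "COMMENT",
    "NL",
    "ENCODING"
    ""
]

# With the comma in place, every literal is its own element.
fixed_names = [
    "COMMENT",
    "NL",
    "ENCODING",
    ""
]

print(len(buggy_names), buggy_names[-1])
print(len(fixed_names), fixed_names[-1])
```

Neither language reports an error here, which is why this kind of typo tends to be caught by static analysis rather than by the compiler.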
[issue25324] Importing tokenize modifies token
Changes by Albert-Jan Nijburg: -- pull_requests: +1990
[issue7167] Smarter FTP passive mode
Albert-Jan Nijburg added the comment: I understand the standpoint that the server is configured incorrectly, and I see why this might not be the best solution to the problem. But not everyone owns the FTP server they're connecting to. The `EPSV` response often doesn't include the IP address, so many FTP clients now use `EPSV` to circumvent the problem of the IP address and the DNS address not matching up. Would it not be sensible to give users the option to use just `EPSV`, so people can connect to incorrectly configured FTP servers? That said, this can't be a massive issue for people, otherwise I'd expect this bug to be a bit more active. Curl, for example, defaults to EPSV and then falls back to PASV when it's not supported by the server. The FTP client in macOS also defaults to EPSV. I'm not suggesting we do that, but it would be nice if we could tell ftplib to use EPSV without the address being an IPv6 address. In our specific situation, we have an FTP server that has a public and a private endpoint on different IP addresses, and the FTP server is configured to use the public IP address, but if we want to access it internally we need to use the internal host and IP address. This causes `ftplib` not to work, because the IPs don't line up. We currently monkey patch ftplib to use `EPSV`; while it does state that you need to use EPSV when you connect with IPv6, it doesn't say you can't use it when you use IPv4. -- nosy: +Albert-Jan Nijburg ___ Python tracker <https://bugs.python.org/issue7167> ___
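One way the workaround mentioned above could look, as a subclass instead of a monkey patch. This is a sketch, not the patch actually in use: the class name `EPSVOnlyFTP` is hypothetical, and it relies on `ftplib.FTP.makepasv` being the hook that chooses between PASV and EPSV, and on `ftplib.parse229` for reading the reply.

```python
import ftplib


class EPSVOnlyFTP(ftplib.FTP):
    """Hypothetical FTP client that always negotiates passive mode with EPSV."""

    def makepasv(self):
        # The stock makepasv sends PASV on IPv4 connections and EPSV otherwise.
        # Here EPSV is sent regardless of the address family.
        response = self.sendcmd("EPSV")
        # parse229 takes only the port from the reply and reuses the host of
        # the control connection, so an address advertised by the server
        # never comes into play.
        return ftplib.parse229(response, self.sock.getpeername())
```

It would be used like a regular `ftplib.FTP` instance, for example `EPSVOnlyFTP("ftp.example.com")` with a placeholder host name. Unlike curl, this sketch has no fallback to PASV, so a server that does not support EPSV would make data transfers fail.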