On 28 September 2010 12:29, Michael Foord <fuzzy...@voidspace.org.uk> wrote:
> On 28/09/2010 12:19, Antoine Pitrou wrote:
>> On Mon, 27 Sep 2010 23:45:45 -0400
>> Steve Holden<st...@holdenweb.com> wrote:
>>
>>> On 9/27/2010 11:27 PM, Benjamin Peterson wrote:
>>>
>>>> 2010/9/27 Meador Inge<mead...@gmail.com>:
>>>>
>>>>> which, as seen in the trace, is because the 'detect_encoding' function in
>>>>> 'Lib/tokenize.py' searches for 'BOM_UTF8' (a 'bytes' object) in the string
>>>>> to tokenize 'first' (a 'str' object). It seems to me that strings should
>>>>> still be able to be tokenized, but maybe I am missing something.
>>>>> Is the implementation of 'detect_encoding' correct in how it attempts to
>>>>> determine an encoding or should I open an issue for this?
>>>>>
>>>> Tokenize only works on bytes. You can open a feature request if you
>>>> desire.
>>>>
>>> Working only on bytes does seem rather perverse.
>>>
>> I agree, the morality of bytes objects could have been better :)
>>
> The reason for working with bytes is that source data can only be
> correctly decoded to text once the encoding is known. The encoding is
> determined by reading the encoding cookie.
>
> I certainly wouldn't be opposed to an API that accepts a string as well
> though.
>

Ah, and to explain the design decision when tokenize was ported to py3k - the Python 2 APIs take the readline method of a file object (not a string).

http://docs.python.org/library/tokenize.html

For this to work correctly in Python 3 it *has* to be a file object open in binary read mode in order to decode the source code correctly.

A new API that takes a string would certainly be nice. The Python 2 API for tokenize is 'interesting'...
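To make the bytes requirement concrete, here is a rough sketch of how the Python 3 API can be driven from an in-memory source. The sample source and the helper name tokenize_str are made up for illustration (there is no such function in the standard library); only tokenize.detect_encoding, tokenize.tokenize and io.BytesIO are real.

```python
import io
import tokenize

# Sample source with an encoding cookie, stored as bytes the way it
# would be read from a file opened in binary mode.
source = "# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"
raw = source.encode("latin-1")

# detect_encoding() takes a readline callable that returns bytes and
# reads at most two lines, looking for a BOM or an encoding cookie.
encoding, first_lines = tokenize.detect_encoding(io.BytesIO(raw).readline)

# tokenize() does the same detection itself and yields an ENCODING
# token first, then decodes the remaining lines with that encoding.
tokens = list(tokenize.tokenize(io.BytesIO(raw).readline))


def tokenize_str(text):
    """Hypothetical helper: tokenize a str by encoding it as UTF-8 first.

    Assumes the text carries no conflicting encoding cookie.
    """
    return tokenize.tokenize(io.BytesIO(text.encode("utf-8")).readline)


for tok in tokenize_str("x = 1\n"):
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

The helper is only a workaround, not the string API discussed above: if the text contains a cookie naming some other encoding, the UTF-8 bytes would be decoded with that encoding instead, so non-ASCII characters may come out wrong or fail to decode.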
All the best,

Michael Foord

-- 
http://www.voidspace.org.uk
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com