On Fri, Jan 10, 2014 at 2:03 AM, Joao S. O. Bueno <jsbu...@python.org.br> wrote:
> On 9 January 2014 04:50, Lennart Regebro <rege...@gmail.com> wrote:
>> To be honest, you can define text as "A stream of bytes that are split
>> up in lines separated by a linefeed", and do some basic text
>> processing like that. Just very *basic*, but still. Replacing
>> characters. Extracting certain lines etc.
>
> That is, until you hit a character which has a byte with the same
> value of ASCII newline in the middle of a multi-byte character.
>
> So, this approach is broken to start with.
For a very specific definition of broken, yes: namely, that it will fail
with UTF-16 or EBCDIC. Under the above definition of "text files", such
files are not text files. :-)

//Lennart
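
To make the disagreement concrete, here is a minimal sketch, assuming
Python 3. The helper name `split_lines` and the sample string are
illustrative only and do not come from the thread. It splits raw bytes on
the byte 0x0A and compares the result for UTF-8 and UTF-16 input. The
character U+010A is chosen because its UTF-16-LE encoding contains the
byte 0x0A.

```python
# Sketch: treat "text" as bytes split on the ASCII linefeed byte (0x0A),
# without ever decoding. split_lines is a hypothetical helper.

def split_lines(data):
    """Split raw bytes on the byte 0x0A."""
    return data.split(b"\n")

# U+010A encodes as the bytes 0A 01 in UTF-16-LE, so it contains 0x0A.
text = "\u010a first\nsecond"
expected = text.split("\n")

# UTF-8: every byte of a multi-byte sequence is >= 0x80, so 0x0A can only
# ever be a real linefeed. Byte-level splitting agrees with text splitting.
utf8_lines = split_lines(text.encode("utf-8"))
utf8_ok = [line.decode("utf-8") for line in utf8_lines] == expected

# UTF-16-LE: 0x0A appears inside U+010A, and the real linefeed is the
# two-byte unit 0A 00, so the split cuts code units in half.
utf16_lines = split_lines(text.encode("utf-16-le"))
utf16_ok = len(utf16_lines) == len(expected)

print("UTF-8 split matches: ", utf8_ok)
print("UTF-16 split matches:", utf16_ok)
```

The same byte-level approach also breaks on EBCDIC code pages, where the
linefeed character is not encoded as the byte 0x0A at all.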