"Guido van Rossum" <[EMAIL PROTECTED]> wrote: > > Have you looked at Py3k at all, especially PEP 3116 (new I/O)?
No.

> Python *does* have its own I/O model. There are binary files and text
> files. For binary files, you write bytes and the semantic model is
> that of an array of bytes; byte indices are seek positions.

That is the same model as C and Unix. It is text files that we are
discussing.

> For text files, the contents is considered to be Unicode, encoded as
> bytes in a binary file. So text file always has an underlying binary
> file. Two translations take place, both of which have defaults varying
> by platform. One translation is encoding Unicode text into bytes upon
> output, and decoding bytes to Unicode text upon input. This can use
> any encoding supported by the encodings package.

The character code isn't the issue here, and is almost completely
irrelevant.

> The other translation deals with line endings. Upon input, any of
> \r\n, \r, or \n is translated to a single \n by default (this is the
> "universal newlines" algorithm from Python 2.x). This can be tweaked
> or disabled. Upon output, \n is translated into a platform specific
> string chosen from \r\n, \r, or \n. This can also be disabled or
> overridden. Note that \r, when written, is never treated specially; if
> you want special processing for \r on output, you can write your own
> translation layer.

Grrk. That's the problem. You don't get back what you have written,
for a start, which isn't nice. There are other issues, too.

> That's all. There is nothing unimplementable or confusing in these
> specifications.

Nothing unimplementable, I agree. Nothing confusing? Not in the
experience of the users I have dealt with.

> Python doesn't care about record I/O on legacy OSes; it does care
> about variability found in practice between popular OSes.

As a short-term solution, that is fine. But I have seen the wheel
turn a couple of times in 40 years, and expect it to continue after
I am safely 6' under ....

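The "you don't get back what you have written" point can be made
concrete. What follows is only a minimal sketch, assuming the io module
as it exists in current Python 3 (which grew out of PEP 3116; details
of the proposal as discussed here may have differed). The helper name
round_trip is hypothetical, and an in-memory BytesIO stands in for the
underlying binary file. The output newline is pinned to "\r\n" so the
result does not depend on the platform default.

```python
import io


def round_trip(text, write_newline, read_newline):
    """Write text through a text layer over a binary buffer, then read it back.

    Returns (bytes stored in the binary layer, text read back).
    """
    raw = io.BytesIO()
    writer = io.TextIOWrapper(raw, encoding="utf-8", newline=write_newline)
    writer.write(text)
    writer.flush()
    data = raw.getvalue()

    reader = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8",
                              newline=read_newline)
    return data, reader.read()


original = "one\rtwo\n"

# Output translates "\n" to "\r\n" and leaves the lone "\r" alone;
# universal-newlines input (newline=None) then folds both to "\n".
data, back = round_trip(original, "\r\n", None)
print(repr(data))   # b'one\rtwo\r\n'
print(repr(back))   # 'one\ntwo\n'
print(back == original)  # False

# With translation disabled on both sides (newline=""), the text survives.
_, exact = round_trip(original, "", "")
print(exact == original)  # True
```

The lone \r written in the first case comes back as \n, so the text
read is not the text written; disabling translation in both directions
restores the round trip at the cost of exposing platform line endings
to the program.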
> Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
> literals) or Unicode (in text literals). Again, no support for legacy
> systems that don't use ASCII or a superset.

That's not a problem. I don't see that changing in the foreseeable
future.

> Legacy OSes are called that for a reason.

Well, I remember when the text I/O model that C, Unix and Python use
WAS a feature of legacy OSs :-) Seriously.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679