Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread David Ripton
On 2009.04.30 18:21:03 +0200, "Martin v. Löwis" wrote: > Perhaps - the entire PEP is about Python 3 only. I don't know whether > PyGTK already works with 3.x. It does not. There is a bug in the Gnome tracker for it, and I believe some work has been done to start porting PyGObject, but it appears

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
>> If I pass a string with an embedded U+ to gtk, gtk will truncate >> the string, and stop rendering it at this character. This is worse than >> what it does for invalid UTF-8 sequences. Chances are fairly high that >> other C libraries will fail in the same way, in particular if they >> expec
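The truncation described above is the usual behaviour of C APIs that treat strings as NUL-terminated. A minimal sketch in Python, assuming the escape character under discussion is NUL (as the "NULL method" wording elsewhere in the thread suggests); ctypes is used here only as a stand-in for a C library such as GTK, and the payload is made up for illustration:

```python
import ctypes

# Bytes with an embedded NUL, as a NUL-based escape scheme might produce.
# The payload itself is hypothetical.
payload = b"abc\x00def"

# A C-style buffer holding the payload (ctypes appends one terminating NUL).
buf = ctypes.create_string_buffer(payload)

# .raw exposes the whole buffer; .value reads it the way C string
# functions do, stopping at the first NUL.
full = buf.raw
seen_by_c = buf.value

print(full)       # whole buffer, including the embedded NUL
print(seen_by_c)  # only the part before the first NUL
```

Anything after the first NUL is simply invisible to code that relies on NUL termination, which is the failure mode the message describes.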

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread glyph
On 04:07 pm, mar...@v.loewis.de wrote: Martin, if you're going to stick with the half-surrogate trick, would you mind adding a section to the PEP on "alternate encoding strategies", explaining why the NULL method was not selected? In the PEP process, it isn't my job to criticize competing pr

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread glyph
On 03:35 pm, mar...@v.loewis.de wrote: So, why do you prefer half surrogate coding to U+ quoting? If I pass a string with an embedded U+ to gtk, gtk will truncate the string, and stop rendering it at this character. This is worse than what it does for invalid UTF-8 sequences. Chances ar

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
> Martin, if you're going to stick with the half-surrogate trick, would > you mind adding a section to the PEP on "alternate encoding strategies", > explaining why the NULL method was not selected? In the PEP process, it isn't my job to criticize competing proposals. Instead, proponents of competi

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Michael Urman
On Thu, Apr 30, 2009 at 09:42, Thomas Breuel wrote: > So, I don't see any reason to prefer your half surrogate quoting to the Mono > U+-based quoting.  Both seem to achieve the same goal with respect to > round tripping file names, displaying them, etc., but Mono quoting actually > results in

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
> What's an analogous failure? Or, rather, why would a failure analogous > to the one I got when using System.IO.DirectoryInfo ever exist in > Python? > > > Mono.Unix uses an encoder and a decoder that know about special quoting > rules. System.IO uses a different encoder and decode

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Paul Moore
2009/4/30 Thomas Breuel: > The analogous phenomenon will exist in Python with PEP 383.  Let's say I > have a C library with wide character interfaces and I pass it a unicode > string from Python.(*) [...] > (*) There's actually a second, subtle issue.  PEP 383 intends utf-8b only to > be used for

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread glyph
On 02:42 pm, tmb...@gmail.com wrote: So, why do you prefer half surrogate coding to U+ quoting? I have also been eagerly waiting for an answer to this question. I am afraid I have lost it somewhere in the storm of this thread :). Martin, if you're going to stick with the half-surrogate

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Thomas Breuel
> > What's an analogous failure? Or, rather, why would a failure analogous > to the one I got when using System.IO.DirectoryInfo ever exist in > Python? Mono.Unix uses an encoder and a decoder that know about special quoting rules. System.IO uses a different encoder and decoder because it's a r

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread MRAB
Martin v. Löwis wrote: OK, so why not adopt the Mono solution in CPython? It seems to produce valid unicode strings, removing at least one issue with PEP 383. It also means that IronPython and CPython actually would be compatible. See my other message. The Mono solution may not be what you ex

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
> This has nothing to do with how Mono quotes. The reason for this is > that Mono quotes at all and that the Mono developers decided not to > change System.IO to understand UNIX quoting. > > If Mono used PEP 383 quoting, this would fail the same way. > > And analogous failures will exist with

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Thomas Breuel
> > > > "The upshot to all this is that Mono.Unix and Mono.Unix.Native can list, > > access, and open all files on your filesystem, regardless of encoding." > > I think this is misleading. With Mono 2.0.1, I get This has nothing to do with how Mono quotes. The reason for this is that Mono quotes

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
> OK, so why not adopt the Mono solution in CPython? It seems to produce > valid unicode strings, removing at least one issue with PEP 383. It > also means that IronPython and CPython actually would be compatible. See my other message. The Mono solution may not be what you expect it to be. Rega

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
>> Because in Python, we want to be able to access all files on disk. >> Neither Java nor Mono are capable of doing that. > > Java is not capable of doing that. Mono, as I keep pointing out, is. It > uses NULLs to escape invalid UNIX filenames. Please see: > > http://go-mono.com/docs/index.aspx

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Thomas Breuel
> > And then it goes on to say: "You won't be able to pass non-Unicode > filenames as command-line arguments."(*) Not only that, but you can't > reliably use such files with System.IO (whatever that is, but it > sounds pretty basic). This support is only available "within the > Mono.Unix and Mono

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
> Why didn't you point to that discussion from the PEP 383? And why > didn't you point to Kowalczyk's message on encodings in Mono, Java, etc. > from the PEP? Because I assumed that readers of the PEP would know (and I'm sure many of them do - this has been *really* discussed over and over aga

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread R. David Murray
On Thu, 30 Apr 2009 at 11:26, gl...@divmod.com wrote: On 08:25 am, mar...@v.loewis.de wrote: > Why did you choose an incompatible approach for PEP 383? Because in Python, we want to be able to access all files on disk. Neither Java nor Mono are capable of doing that. Java is not capable of do

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Thomas Breuel
> > Java is not capable of doing that. Mono, as I keep pointing out, is. It > uses NULLs to escape invalid UNIX filenames. Please see: > > http://go-mono.com/docs/index.aspx?link=T%3AMono.Unix.UnixEncoding > > "The upshot to all this is that Mono.Unix and Mono.Unix.Native can list, > access, and

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread glyph
On 08:25 am, mar...@v.loewis.de wrote: Why did you choose an incompatible approach for PEP 383? Because in Python, we want to be able to access all files on disk. Neither Java nor Mono are capable of doing that. Java is not capable of doing that. Mono, as I keep pointing out, is. It uses N

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Paul Moore
2009/4/30 "Martin v. Löwis": >> OK, so what's wrong with os.listdir() and similar functions returning a >> unicode string for strings that correctly encode/decode, and with byte >> strings for strings that are not valid unicode? > > See http://bugs.python.org/issue3187 > in particular msg71655 Ca

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Thomas Breuel
On Thu, Apr 30, 2009 at 12:32, "Martin v. Löwis" wrote: > > OK, so what's wrong with os.listdir() and similar functions returning a > > unicode string for strings that correctly encode/decode, and with byte > > strings for strings that are not valid unicode? > > See http://bugs.python.org/issue31

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
> OK, so what's wrong with os.listdir() and similar functions returning a > unicode string for strings that correctly encode/decode, and with byte > strings for strings that are not valid unicode? See http://bugs.python.org/issue3187 in particular msg71655 Regards, Martin __

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Thomas Breuel
> > > Since both have had to deal with this, have you looked at what they > > actually do before proposing PEP 383? What did you find? > > See > > http://mail.python.org/pipermail/python-3000/2007-September/010450.html > Thanks, that's very useful. > > Why did you choose an incompatible approac

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Martin v. Löwis
> CPython and IronPython are incompatible. And they will stay > incompatible if the PEP is adopted. > > They would become compatible if CPython adopted Mono and/or Java > semantics. Which one should it adopt? Mono semantics, or Java semantics? > Since both have had to deal with this, have you

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Glenn Linderman
On approximately 4/29/2009 10:17 PM, came the following characters from the keyboard of Martin v. Löwis: I don't understand the proposal and issues. I see a lot of people claiming that they do, and then spending all their time either talking past each other, or disagreeing. If everyone who claim

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-30 Thread Thomas Breuel
> > Yes. Now think about the implications. This means that adopting PEP > > 383 will make IronPython and Jython running on UNIX intrinsically > > incompatible with CPython running on UNIX, and there's no way to fix > that. > > *Not* adopting the PEP will also make CPython and IronPython > incompa

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Martin v. Löwis
Thomas Breuel wrote: > On Thu, Apr 30, 2009 at 05:40, Curt Hagenlocher > wrote: > > IronPython will inherit whatever behavior Mono has implemented. The > Microsoft CLR defines the native string type as UTF-16 and all of the > managed APIs for things like f

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Martin v. Löwis
Jeroen Ruigrok van der Werven wrote:
> -On [20090430 07:18], "Martin v. Löwis" (mar...@v.loewis.de) wrote:
>> Suppose I create a new directory, and run the following script
>> in 3.x:
>>
>> py> open("x","w").close()
>> py> open(b"\xff","w").close()
>> py> os.listdir(".")
>> ['x']
>
> That is actua

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Thomas Breuel
On Thu, Apr 30, 2009 at 05:40, Curt Hagenlocher wrote: > IronPython will inherit whatever behavior Mono has implemented. The > Microsoft CLR defines the native string type as UTF-16 and all of the > managed APIs for things like file names and environmental variables > operate on UTF-16 strings --

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Jeroen Ruigrok van der Werven
-On [20090430 07:18], "Martin v. Löwis" (mar...@v.loewis.de) wrote:
>Suppose I create a new directory, and run the following script
>in 3.x:
>
>py> open("x","w").close()
>py> open(b"\xff","w").close()
>py> os.listdir(".")
>['x']

That is actually a regression in 3.x:

Python 2.6.1 (r261:67515, Mar

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Martin v. Löwis
Curt Hagenlocher wrote: > On Wed, Apr 29, 2009 at 8:16 PM, Thomas Breuel wrote: >> Also, what are Jython and IronPython supposed to do on UNIX? Can they >> implement these semantics at all? > > IronPython will inherit whatever behavior Mono has implemented. The > Microsoft CLR defines the native

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Martin v. Löwis
> I don't understand the proposal and issues. I see a lot of people > claiming that they do, and then spending all their time either > talking past each other, or disagreeing. If everyone who claims they > understand the issues actually does, why is it so hard to reach a > consensus? Because t

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Steven D'Aprano
On Thu, 30 Apr 2009 01:16:20 pm Thomas Breuel wrote: > And that's why I think this proposal should be shelved for a while > until people have had more time to try to understand the issues and > also come up with alternative proposals.  Once this is adopted and > implemented in C-Python, Python is s

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Curt Hagenlocher
On Wed, Apr 29, 2009 at 8:16 PM, Thomas Breuel wrote: > > Also, what are Jython and IronPython supposed to do on UNIX?  Can they > implement these semantics at all? IronPython will inherit whatever behavior Mono has implemented. The Microsoft CLR defines the native string type as UTF-16 and all o

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Thomas Breuel
> > The whole purpose of PEP 383 is to send the exact same bytes that were > read from the OS back to the OS => violating (2) (for whatever the > apparent system file-encoding is, not limited to UTF-8), It's fine to read a file name from a file system and write the same file back as the same raw

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Martin v. Löwis
> The whole purpose of PEP 383 is to send the exact same bytes that were > read from the OS back to the OS => violating (2) (for whatever the > apparent system file-encoding is, not limited to UTF-8), and that has > overwhelmingly popular support. > > Note that this won't happen automatically, eit

[Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-29 Thread Stephen J. Turnbull
Thomas Breuel writes: > PEP 383 violated (2), and I think that's a bad thing. The whole purpose of PEP 383 is to send the exact same bytes that were read from the OS back to the OS => violating (2) (for whatever the apparent system file-encoding is, not limited to UTF-8), and that has overwhelmi
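The round trip described here can be sketched with the `surrogateescape` error handler, which is the name under which the PEP 383 scheme is available in Python 3.1 and later. The file name below is hypothetical, and UTF-8 is assumed as the file system encoding purely for illustration:

```python
# Raw file-name bytes that are not valid UTF-8 (hypothetical example name).
raw_name = b"caf\xe9.txt"

# Decoding: each undecodable byte 0xNN becomes the lone surrogate U+DC00 + 0xNN.
as_str = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns those surrogates back into the
# original bytes, so the OS gets back exactly what it handed out.
round_tripped = as_str.encode("utf-8", errors="surrogateescape")

print(ascii(as_str))
print(round_tripped == raw_name)
```

The resulting byte string is not valid UTF-8, which is precisely the property that point (2) objects to and that the message defends.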

Re: [Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-28 Thread Zooko O'Whielacronx
On Apr 28, 2009, at 13:01 PM, Thomas Breuel wrote: (2) Should the default UTF-8 encoder for file system operations be allowed to generate illegal byte sequences? I think that's a definite no; if I set the encoding for a device to UTF-8, I never want Python to try to write illegal UTF-8 stri

[Python-Dev] a suggestion ... Re: PEP 383 (again)

2009-04-28 Thread Thomas Breuel
I think we should break up this problem into several parts: (1) Should the default UTF-8 decoder fail if it gets an illegal byte sequence. It's probably OK for the default decoder to be lenient in some way (see below). (2) Should the default UTF-8 encoder for file system operations be allowed to
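Questions (1) and (2) can be made concrete with a short sketch. It assumes Python 3.1 or later, where the lenient behaviour under discussion is exposed as the `surrogateescape` error handler; the variable names are illustrative only:

```python
bad_bytes = b"\xff"  # never valid in UTF-8

# (1) The strict decoder rejects an illegal byte sequence.
try:
    bad_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

# A lenient decode via surrogateescape yields a lone surrogate instead.
escaped = bad_bytes.decode("utf-8", errors="surrogateescape")

# (2) The strict encoder refuses that surrogate, so it never emits
# illegal UTF-8 ...
try:
    escaped.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# ... whereas encoding with surrogateescape reproduces the illegal byte.
emitted = escaped.encode("utf-8", errors="surrogateescape")

print(strict_decode_failed, strict_encode_failed, emitted)
```

In this sketch only the encoder that explicitly opts in to `surrogateescape` can produce bytes that are not valid UTF-8; the strict codec keeps both guarantees.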