Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-15 Thread Ulrich Eckhardt
On Friday 12 December 2008, Adam Olsen wrote: > Only pages like this, which indicate the underlying API is an array of > WCHAR: > > http://blogs.msdn.com/michkap/archive/2005/05/11/416552.aspx Hmm, true. So even there, the encoding isn't known... > char * is just fine. You need only pass a lengt

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-13 Thread Steven D'Aprano
On Fri, 12 Dec 2008 06:33:28 pm Toshio Kuratomi wrote: > Also interesting, if you point your browser at: > http://toshio.fedorapeople.org/u/ > > You should see two other test files. They're both > (one-half)(enyei).html but one's encoded in utf-8 and the other in > latin-1. For what it's worth

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread André Malo
* Adam Olsen wrote: > On Fri, Dec 12, 2008 at 9:47 PM, André Malo wrote: > > * Adam Olsen wrote: > >> On Fri, Dec 12, 2008 at 2:11 AM, André Malo wrote: > >> > * Adam Olsen wrote: > >> >> UTF-8 in percent encodings is becoming a de facto standard. > >> >> Otherwise the browser has to display the

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Adam Olsen
On Fri, Dec 12, 2008 at 9:47 PM, André Malo wrote: > * Adam Olsen wrote: >> On Fri, Dec 12, 2008 at 2:11 AM, André Malo wrote: >> > * Adam Olsen wrote: >> >> UTF-8 in percent encodings is becoming a de facto standard. Otherwise >> >> the browser has to display the percent escapes in the address b

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread André Malo
* Adam Olsen wrote: > On Fri, Dec 12, 2008 at 2:11 AM, André Malo wrote: > > * Adam Olsen wrote: > >> UTF-8 in percent encodings is becoming a de facto standard. Otherwise > >> the browser has to display the percent escapes in the address bar, > >> rather than the intended text. > > > > Duh! The

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread glyph
On 02:23 pm, c...@hagenlocher.org wrote: On Fri, Dec 12, 2008 at 6:19 AM, Antoine Pitrou wrote: Curt Hagenlocher hagenlocher.org> writes: No, but it also has to interact with filesystems of possibly invalid or indeterminate encodings. What does java.io do? My point was that Python doesn't

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Lennart Regebro
On Fri, Dec 12, 2008 at 16:21, Scott Dial wrote: > See the following email for a summary of existing practice (as of 2004): > > http://www.mail-archive.com/unic...@unicode.org/msg27352.html Interesting. Quite a lot of them do just drop the undecodable filenames. The Java solution with replacing i

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Toshio Kuratomi
Adam Olsen wrote: > UTF-8 in percent encodings is becoming a de facto standard. Otherwise > the browser has to display the percent escapes in the address bar, > rather than the intended text. > > IOW, inconsistent behaviour is a bug, but translating into UTF-8 is not. ;) > > I think we should le

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Scott Dial
Curt Hagenlocher wrote: > On Fri, Dec 12, 2008 at 6:19 AM, Antoine Pitrou wrote: >> Curt Hagenlocher hagenlocher.org> writes: >>> No, but it also has to interact with filesystems of possibly invalid >>> or indeterminate encodings. What does java.io do? >> My point was that Python doesn't have to

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Curt Hagenlocher
On Fri, Dec 12, 2008 at 6:19 AM, Antoine Pitrou wrote: > Curt Hagenlocher hagenlocher.org> writes: >> >> No, but it also has to interact with filesystems of possibly invalid >> or indeterminate encodings. What does java.io do? > > My point was that Python doesn't have to interact with the Java I

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Antoine Pitrou
Curt Hagenlocher hagenlocher.org> writes: > > No, but it also has to interact with filesystems of possibly invalid > or indeterminate encodings. What does java.io do? My point was that Python doesn't have to interact with the Java IO libraries, while it has to interact with the Unix and Windows

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Curt Hagenlocher
On Fri, Dec 12, 2008 at 5:06 AM, Antoine Pitrou wrote: > > Curt Hagenlocher hagenlocher.org> writes: > > > There's this other obscure platform called "Java"... ;) > > Does it have a filesystem? No, but it also has to interact with filesystems of possibly invalid or indeterminate encodings. What

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Antoine Pitrou
Curt Hagenlocher hagenlocher.org> writes: > > > On Thu, Dec 11, 2008 at 10:19 PM, Adam Olsen gmail.com> wrote: > > > I doubt that UTF-16 is used very much (other than on windows). > > There's this other obscure platform called "Java"... ;) Does it have a filesystem?

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Adam Olsen
On Fri, Dec 12, 2008 at 2:11 AM, André Malo wrote: > * Adam Olsen wrote: > >> UTF-8 in percent encodings is becoming a de facto standard. Otherwise >> the browser has to display the percent escapes in the address bar, >> rather than the intended text. > > Duh! The address bar should contain the UR

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Ulrich Eckhardt
On Friday 12 December 2008, Stephen J. Turnbull wrote: > I gather that the BFDL's line on this thread of discussion is that > forcing programmers to think about encodings every time they call out > to the OS is unacceptable Exactly that is not necessary. for n in os.readdir('.'): f = open

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Adam Olsen
On Fri, Dec 12, 2008 at 1:31 AM, Ulrich Eckhardt wrote: > On Thursday 11 December 2008, Adam Olsen wrote: >> The simplest solution there is to have windows bytes APIs that return >> raw UTF-16 bytes (note that windows is NOT guaranteed to be valid >> unicode, despite being much more likely than

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread André Malo
* Adam Olsen wrote: > UTF-8 in percent encodings is becoming a de facto standard. Otherwise > the browser has to display the percent escapes in the address bar, > rather than the intended text. Duh! The address bar should contain the URL, which *is* the intended text. The escapes are there for

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Stephen J. Turnbull
Toshio Kuratomi writes: > Adam Olsen wrote: > > On Thu, Dec 11, 2008 at 6:55 PM, Stephen J. Turnbull > > wrote: > >> Unfortunately, even programmers experienced in I18N like Martin, and > >> those with intuition-that-has-the-force-of-law like Guido, > >> express deliberate disbelief on this

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Ulrich Eckhardt
On Thursday 11 December 2008, Adam Olsen wrote: > The simplest solution there is to have windows bytes APIs that return > raw UTF-16 bytes (note that windows is NOT guaranteed to be valid > unicode, despite being much more likely than on linux). Actually, I'm not aware of this case. I only know
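For readers unfamiliar with the case Adam Olsen alludes to: a WCHAR array can contain an unpaired surrogate, which is not valid UTF-16. The following is only a minimal Python sketch of that situation. The byte string is made up for illustration and is not taken from any real Windows API call; `surrogatepass` is a standard codec error handler, not something proposed in this thread.

```python
# Two 16-bit units: a lone high surrogate (0xD800) followed by "a" (0x0061).
# This is a well-formed WCHAR array but not well-formed UTF-16.
raw = b"\x00\xd8a\x00"

try:
    raw.decode("utf-16-le")          # strict decoding rejects the lone surrogate
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# "surrogatepass" lets the surrogate through as a code point, so the
# original 16-bit units can be recovered on encoding.
lenient = raw.decode("utf-16-le", errors="surrogatepass")
roundtrip = lenient.encode("utf-16-le", errors="surrogatepass")

print(strict_ok, ascii(lenient), roundtrip == raw)
```

Exposing the raw UTF-16 bytes, as suggested elsewhere in the thread, would sidestep the question of validity altogether, at the cost of making callers handle the encoding themselves.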

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Ulrich Eckhardt
On Thursday 11 December 2008, Steve Holden wrote: > Ulrich Eckhardt wrote: > > If readdir() returned Unicode text, people would start taking that for > > granted. If it returned bytes, just the same. Returning a completely > > unrelated type will give them enough hint that for this thing they have

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-12 Thread Adam Olsen
On Fri, Dec 12, 2008 at 12:33 AM, Toshio Kuratomi wrote: > Adam Olsen wrote: >> As a data point, firefox (when pointed at my home dir) DOES skip over >> garbage files. >> >> > That's not true. However, it looks like Firefox is actually broken. > Take a look at this screenshot: > firefox.png > >

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Toshio Kuratomi
Adam Olsen wrote: > As a data point, firefox (when pointed at my home dir) DOES skip over > garbage files. > > That's not true. However, it looks like Firefox is actually broken. Take a look at this screenshot: firefox.png That shows a directory with a folder that's not decodable in my utf-8

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Toshio Kuratomi
Adam Olsen wrote: > A half-broken setup is still a broken setup. Eventually you have to > tell people to stop screwing around and pick one encoding. > But it's not a broken setup. It's the way the world is because people share things with each other. > I doubt that UTF-16 is used very much (ot

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Adam Olsen
On Thu, Dec 11, 2008 at 11:25 PM, Curt Hagenlocher wrote: > On Thu, Dec 11, 2008 at 10:19 PM, Adam Olsen wrote: >> >> I doubt that UTF-16 is used very much (other than on windows). > > There's this other obscure platform called "Java"... ;) Sorry, I should have said "for interchange". :) (CPyth

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Adam Olsen
On Thu, Dec 11, 2008 at 10:22 PM, Adam Olsen wrote: > On Thu, Dec 11, 2008 at 6:55 PM, Stephen J. Turnbull > wrote: >> Unfortunately, even programmers experienced in I18N like Martin, and >> those with intuition-that-has-the-force-of-law like Guido, >> express deliberate disbelief on this point.

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Curt Hagenlocher
On Thu, Dec 11, 2008 at 10:19 PM, Adam Olsen wrote: > > I doubt that UTF-16 is used very much (other than on windows). > There's this other obscure platform called "Java"... ;) -- Curt Hagenlocher c...@hagenlocher.org

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Adam Olsen
On Thu, Dec 11, 2008 at 10:41 PM, Toshio Kuratomi wrote: > Adam Olsen wrote: >> On Thu, Dec 11, 2008 at 6:55 PM, Stephen J. Turnbull >> wrote: >>> Unfortunately, even programmers experienced in I18N like Martin, and >>> those with intuition-that-has-the-force-of-law like Guido, >>> express delib

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Toshio Kuratomi
Adam Olsen wrote: > On Thu, Dec 11, 2008 at 6:55 PM, Stephen J. Turnbull > wrote: >> Unfortunately, even programmers experienced in I18N like Martin, and >> those with intuition-that-has-the-force-of-law like Guido, >> express deliberate disbelief on this point. They say that filesystem >> names

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Adam Olsen
On Thu, Dec 11, 2008 at 6:55 PM, Stephen J. Turnbull wrote: > Unfortunately, even programmers experienced in I18N like Martin, and > those with intuition-that-has-the-force-of-law like Guido, > express deliberate disbelief on this point. They say that filesystem > names and environment variable v

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Stephen J. Turnbull
Steve Holden writes: > Ulrich Eckhardt writes: > > What I'd just like some feedback on is the approach to return a > > distinct type (neither a byte string nor a Unicode string) from > > readdir(). This is presumably unacceptable on the grounds that it will break existing code that does somet

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Adam Olsen
On Thu, Dec 11, 2008 at 6:41 AM, Ulrich Eckhardt wrote: > On Thursday 11 December 2008, Steve Holden wrote: >> re-present it to the filesystem to manipulate the file. What are we >> supposed to do with the "special type"? > > You receive from readdir() and pass it to stat(), simple as that. No > c

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Steve Holden
Ulrich Eckhardt wrote: > On Thursday 11 December 2008, Steve Holden wrote: >> Ulrich Eckhardt wrote: >>> What I'd just like some feedback on is the approach to return a distinct >>> type (neither a byte string nor a Unicode string) from readdir(). In >>> order to use this, a programmer will have to

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Isaac Morland
On Thu, 11 Dec 2008, Ulrich Eckhardt wrote: On Thursday 11 December 2008, Steve Holden wrote: Ulrich Eckhardt wrote: Seems to me this just threatens to add to the confusion. If you know what your filesystem produces, you can take the appropriate action to convert it into a type that makes sens

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Ulrich Eckhardt
On Thursday 11 December 2008, Steve Holden wrote: > Ulrich Eckhardt wrote: > > What I'd just like some feedback on is the approach to return a distinct > > type (neither a byte string nor a Unicode string) from readdir(). In > > order to use this, a programmer will have to convert it explicitly, >

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Steve Holden
Ulrich Eckhardt wrote: > On Wednesday 10 December 2008, Adam Olsen wrote: >> On Wed, Dec 10, 2008 at 3:39 AM, Ulrich Eckhardt >> >> <[EMAIL PROTECTED]> wrote: >>> On Tuesday 09 December 2008, Adam Olsen wrote: The only thing separating this from a bikeshed discussion is that a bikeshed ha

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-11 Thread Ulrich Eckhardt
On Wednesday 10 December 2008, Adam Olsen wrote: > On Wed, Dec 10, 2008 at 3:39 AM, Ulrich Eckhardt > > <[EMAIL PROTECTED]> wrote: > > On Tuesday 09 December 2008, Adam Olsen wrote: > >> The only thing separating this from a bikeshed discussion is that a > >> bikeshed has many equally good solution

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-10 Thread Adam Olsen
On Wed, Dec 10, 2008 at 3:39 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote: > On Tuesday 09 December 2008, Adam Olsen wrote: >> The only thing separating this from a bikeshed discussion is that a >> bikeshed has many equally good solutions, while we have no good >> solutions. Instead we're trying

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-10 Thread Ulrich Eckhardt
On Tuesday 09 December 2008, Adam Olsen wrote: > On Tue, Dec 9, 2008 at 11:31 AM, Ulrich Eckhardt > > <[EMAIL PROTECTED]> wrote: > > On Monday 08 December 2008, Adam Olsen wrote: > >> At this point someone suggests we have a type that can store an > >> arbitrary mix of unicode and bytes, so the und

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread Toshio Kuratomi
James Y Knight wrote: > On Dec 9, 2008, at 6:04 AM, Anders J. Munch wrote: >> The typical application will just obliviously use os.listdir(dir) and >> get the default elide-and-warn behaviour for un-decodable names. That >> rare special application > > I guess this is a new definition of rare spec

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread Adam Olsen
On Tue, Dec 9, 2008 at 11:31 AM, Ulrich Eckhardt <[EMAIL PROTECTED]> wrote: > On Monday 08 December 2008, Adam Olsen wrote: >> At this point someone suggests we have a type that can store an >> arbitrary mix of unicode and bytes, so the undecodable portions stay >> in their original form. :P > > We

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread Ulrich Eckhardt
On Monday 08 December 2008, Adam Olsen wrote: > At this point someone suggests we have a type that can store an > arbitrary mix of unicode and bytes, so the undecodable portions stay > in their original form. :P Well, not an arbitrary mix, but a type that just stores whatever comes from the syste

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread James Y Knight
On Dec 9, 2008, at 6:04 AM, Anders J. Munch wrote: The typical application will just obliviously use os.listdir(dir) and get the default elide-and-warn behaviour for un-decodable names. That rare special application I guess this is a new definition of rare special application: "an applicat

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread Anders J. Munch
M.-A. Lemburg wrote:
>
> Well, this is not too far away from just putting the whole decoding
> logic into the application directly:
>
> files = [filename.decode(filesystemencoding, errors='warnreplace')
>          for filename in os.listdir(dir)]
>
> (or os.listdirb() if that's where the discuss
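The `'warnreplace'` error handler named in the quoted snippet is not a built-in codec error handler. A rough sketch of how an application could register one itself with `codecs.register_error` follows. The handler body and the sample byte strings are assumptions made for illustration, not the behaviour anyone in the thread specified.

```python
import codecs
import warnings

def warnreplace(exc):
    """Hypothetical handler: warn, then substitute U+FFFD for the bad bytes."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    warnings.warn(f"replacing undecodable bytes {bad!r}")
    # Return the replacement text and the position at which to resume decoding.
    return ("\ufffd", exc.end)

codecs.register_error("warnreplace", warnreplace)

# Made-up byte filenames; the second is Latin-1, so it is not valid UTF-8.
names = [b"report.txt", b"caf\xe9.txt"]
decoded = [n.decode("utf-8", errors="warnreplace") for n in names]
print([ascii(d) for d in decoded])
```

The decoded name is fine for display, but it no longer identifies the file on disk, which is the objection raised against `replace` elsewhere in the thread.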

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread André Malo
* M.-A. Lemburg wrote:
> On 2008-12-09 09:41, Anders J. Munch wrote:
> > On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
> try:
>     files = os.listdir(somedir, errors = strict)
> except OSError as e:
>     log()
>     files = os.listdir(somedir)
> >
> > I

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread M.-A. Lemburg
On 2008-12-09 09:41, Anders J. Munch wrote:
> On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
    try:
        files = os.listdir(somedir, errors = strict)
    except OSError as e:
        log()
        files = os.listdir(somedir)
>
> Instead of a codecs error handler name, how

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread Nick Coghlan
Glenn Linderman wrote: > On approximately 12/8/2008 9:30 AM, came the following characters from > the keyboard of [EMAIL PROTECTED]: >> PS: I'd like to see a similar warning issued when an access attempt >> is made through os.environ to a variable that cannot be decoded. > > > And argv ? Seems l

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-09 Thread Anders J. Munch
On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
>>> try:
>>>     files = os.listdir(somedir, errors = strict)
>>> except OSError as e:
>>>     log()
>>>     files = os.listdir(somedir)

Instead of a codecs error handler name, how about a callback for converting bytes to str? os.listd
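The callback idea can be approximated in user code today, without changing `os.listdir` itself. The sketch below is one possible reading of it. `listdir_with` and `elide_and_warn` are hypothetical names, and the convention that a callback returns `None` to drop a name is an assumption, not something stated in the message.

```python
import os
import warnings

def listdir_with(path, convert):
    """List `path` as bytes, then let `convert` turn each name into str.

    Hypothetical helper: a callback returning None elides that entry.
    """
    raw_names = os.listdir(os.fsencode(path))   # bytes path in, bytes names out
    converted = (convert(name) for name in raw_names)
    return [name for name in converted if name is not None]

def elide_and_warn(name: bytes):
    """Example callback: decode as UTF-8, or warn and skip the entry."""
    try:
        return name.decode("utf-8")
    except UnicodeDecodeError:
        warnings.warn(f"skipping undecodable filename {name!r}")
        return None

print(listdir_with(os.curdir, elide_and_warn))
```

A stricter application could pass a callback that simply calls `name.decode("utf-8")` and lets the exception propagate, which is the point of making the policy a parameter.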

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Glenn Linderman
On approximately 12/8/2008 9:30 AM, came the following characters from the keyboard of [EMAIL PROTECTED]: If warnings were emitted, then files would not be silently ignored, yet the program could still be used. Yep, this is sounding useful. PS: I'd like to see a similar warning issued whe

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Terry Reedy
M.-A. Lemburg wrote:
On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
    try:
        files = os.listdir(somedir, errors = strict)
    except OSError as e:
        log()
        files = os.listdir(somedir)
> If that error parameter is the same as in unicode(value, errors), then this would be a

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Adam Olsen
On Mon, Dec 8, 2008 at 2:44 PM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > On 2008-12-08 22:32, Adam Olsen wrote: >> On Mon, Dec 8, 2008 at 2:01 PM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: >>> On 2008-12-08 21:45, Antoine Pitrou wrote: M.-A. Lemburg egenix.com> writes: > Such application

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 22:39, Victor Stinner wrote: >> ('strict', 'ignore', 'replace', 'xmlcharrefreplace') > > replace (or xmlcharrefreplace) is just useless because you will not be able > to open or rename the file... You just know that there is a strange file in > the directory. Right, but that's a

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Adam Olsen
On Mon, Dec 8, 2008 at 1:12 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote: > On Mon, Dec 8, 2008 at 12:07 PM, <[EMAIL PROTECTED]> wrote: >> But I'm happy with just issuing a warning by default. That would mean >> it doesn't fail silently, but neither does it crash. Seems like the >> best compro

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 22:32, Adam Olsen wrote: > On Mon, Dec 8, 2008 at 2:01 PM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: >> On 2008-12-08 21:45, Antoine Pitrou wrote: >>> M.-A. Lemburg egenix.com> writes: Such application specific error handlers could then also apply whatever fancy round-trip s

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Victor Stinner
> ('strict', 'ignore', 'replace', 'xmlcharrefreplace') replace (or xmlcharrefreplace) is just useless because you will not be able to open or rename the file... You just know that there is a strange file in the directory.
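The objection can be made concrete with a short Python sketch. The byte string is a made-up Latin-1 filename, and the UTF-8 filesystem encoding is an assumption. The `surrogateescape` handler shown for contrast was added to Python later (PEP 383) and is not part of this message.

```python
raw = b"caf\xe9.txt"   # hypothetical Latin-1 name on a UTF-8 system

# 'replace' yields something displayable, but the original byte is gone.
shown = raw.decode("utf-8", errors="replace")
lossy_roundtrip = shown.encode("utf-8")

# 'surrogateescape' smuggles the undecodable byte through as a lone
# surrogate, so encoding with the same handler restores the exact bytes.
kept = raw.decode("utf-8", errors="surrogateescape")
exact_roundtrip = kept.encode("utf-8", errors="surrogateescape")

print(ascii(shown), lossy_roundtrip == raw)   # replaced name no longer matches
print(ascii(kept), exact_roundtrip == raw)    # escaped name round-trips
```

With `replace`, the string handed back cannot be re-encoded to the bytes the filesystem expects, so `open()` or `rename()` on it would target a different name.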

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Toshio Kuratomi
Guido van Rossum wrote: > On Mon, Dec 8, 2008 at 12:07 PM, <[EMAIL PROTECTED]> wrote: >> On Mon, 8 Dec 2008 at 11:25, Guido van Rossum wrote: >> But I'm happy with just issuing a warning by default. That would mean >> it doesn't fail silently, but neither does it crash. Seems like the >> best co

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Adam Olsen
On Mon, Dec 8, 2008 at 2:01 PM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > On 2008-12-08 21:45, Antoine Pitrou wrote: >> M.-A. Lemburg egenix.com> writes: >>> Such application specific error handlers could then also apply >>> whatever fancy round-trip safe encoding of non-decodable bytes >>> to Un

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Antoine Pitrou
Adam Olsen gmail.com> writes: > > Except they're clearly NOT part of the unicode spec. This is always the same discussion going in circles. I know they're not part of the unicode spec, but practicality beats purity and if the said error handler comes with an appropriate warning in the official d

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Adam Olsen
On Mon, Dec 8, 2008 at 1:45 PM, Antoine Pitrou <[EMAIL PROTECTED]> wrote: > M.-A. Lemburg egenix.com> writes: >> >> Such application specific error handlers could then also apply >> whatever fancy round-trip safe encoding of non-decodable bytes >> to Unicode escapes, private code points, etc. as s

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Nick Coghlan
Terry Reedy wrote: > Nick Coghlan wrote: >> Terry Reedy wrote: >>> This to me is an argument for keeping the default the current behavior, >>> but not for rejecting flexibility. The computing world seems to be >>> messier than we would like and worse than I realized until this week. As >>> you say

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 21:45, Antoine Pitrou wrote: > M.-A. Lemburg egenix.com> writes: >> Such application specific error handlers could then also apply >> whatever fancy round-trip safe encoding of non-decodable bytes >> to Unicode escapes, private code points, etc. as seen fit by the >> application. >

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Antoine Pitrou
M.-A. Lemburg egenix.com> writes: > > Such application specific error handlers could then also apply > whatever fancy round-trip safe encoding of non-decodable bytes > to Unicode escapes, private code points, etc. as seen fit by the > application. I'd argue that such fancy round-trip safe error

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-08 19:26, Guido van Rossum wrote: > On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: >> Here is a possible use case: I want filenames as 3.0 strings and I >> anticipate no problems at present but, as you say above, something might >> happen years in the future. I a

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Guido van Rossum
On Mon, Dec 8, 2008 at 12:07 PM, <[EMAIL PROTECTED]> wrote: > On Mon, 8 Dec 2008 at 11:25, Guido van Rossum wrote: >> >> On Mon, Dec 8, 2008 at 10:34 AM, <[EMAIL PROTECTED]> wrote: >>> >>> I'm in favor of an option to control what happens. >>> >>> I just really really don't want the _default_ to

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread rdmurray
On Mon, 8 Dec 2008 at 11:25, Guido van Rossum wrote: On Mon, Dec 8, 2008 at 10:34 AM, <[EMAIL PROTECTED]> wrote: I'm in favor of an option to control what happens. I just really really don't want the _default_ to be "ignore". Defaulting to a warning is fine with me, as would be defaulting to

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Bugbee, Larry
> I'm perhaps biased here; most of my Python programs don't have user > interfaces, because they don't "talk" to people, they talk to other > programs. The binary APIs for the OS are essential. I use and > deeply appreciate all the string handling features in Python, > particularly its firm g

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Scott Dial
Guido van Rossum wrote: > On Mon, Dec 8, 2008 at 10:34 AM, <[EMAIL PROTECTED]> wrote: >> On Mon, 8 Dec 2008 at 13:16, Terry Reedy wrote: And the decoding problems don't pass silently either - they just get emitted as a warning by default instead of causing the application to cras

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Guido van Rossum
On Mon, Dec 8, 2008 at 10:34 AM, <[EMAIL PROTECTED]> wrote: > On Mon, 8 Dec 2008 at 13:16, Terry Reedy wrote: >>> >>> And the decoding problems don't pass silently either - they just get >>> emitted as a warning by default instead of causing the application to >>> crash. >> >> Do they get autom

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread rdmurray
On Mon, 8 Dec 2008 at 13:16, Terry Reedy wrote: And the decoding problems don't pass silently either - they just get emitted as a warning by default instead of causing the application to crash. Do they get automatically logged? In any case, the errors parameter has an in between option to

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Guido van Rossum
On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> >> On Sun, Dec 7, 2008 at 1:20 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: >>> >>> Toshio Kuratomi wrote: >>> - If this is true, a definition of os.listdir() that would better meet programm

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Terry Reedy
Nick Coghlan wrote: Terry Reedy wrote: This to me is an argument for keeping the default the current behavior, but not for rejecting flexibility. The computing world seems to be messier than we would like and worse than I realized until this week. As you say below, people need to better anticip

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Bill Janssen
Nick Coghlan <[EMAIL PROTECTED]> wrote: > - I think the binary and Unicode APIs should be available (and fully > functional) on all platforms (including Windows) so that app developers > don't create portability problems for themselves when they make the > decision as to which API to use +1 I'm

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread rdmurray
On Sun, 7 Dec 2008 at 13:33, Guido van Rossum wrote: My problem with raising exceptions *by default* when an undecodable name exists is that it may render an app completely useless in a situation where the developer is no longer around. This happened all I think Nick Coghlan's suggestion of emi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread M.-A. Lemburg
On 2008-12-06 01:48, Nick Coghlan wrote: > You can't display a non-decodable filename to the user, hence the user > will have no idea what they're working on. Non-filesystem related apps > have no business trying to deal with insane filenames. This is not entirely true: OSes, shells, and applicati

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Ulrich Eckhardt
On Sunday 07 December 2008, Guido van Rossum wrote: > My problem with raising exceptions *by default* when an undecodable > name exists is that it may render an app completely useless in a > situation where the developer is no longer around. This happened all > the time with the 2.x Unicode API, wh

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Nick Coghlan
Terry Reedy wrote: > This to me is an argument for keeping the default the current behavior, > but not for rejecting flexibility. The computing world seems to be > messier than we would like and worse than I realized until this week. As > you say below, people need to better anticipate the future,

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Glenn Linderman
On approximately 12/8/2008 12:57 AM, came the following characters from the keyboard of Stephen J. Turnbull: "Internal decoding" is (or should be) an oxymoron. Why would your software be passing around text in any format other than internal? So decoding will happen (a) on I/O, which is itself

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Ulrich Eckhardt
On Friday 05 December 2008, James Y Knight wrote: > On Dec 5, 2008, at 5:27 AM, Ulrich Eckhardt wrote: > > Using the byte variant is equally fubar, because e.g. on MS Windows > > it is not supported, except through a very lossy roundtrip through > > the locale's codepage, limiting your functionalit

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Stephen J. Turnbull
Glenn Linderman writes: > "significantly" seems to be the only word at question; it seems that > there are a fair number of validation checks that could be performed; > the numeric part of UTF-8 decoding is just a sequence of shifts, masks, > and ORs, so can be coded pretty tightly in C or

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-08 Thread Stephen J. Turnbull
Glenn Linderman writes: > On approximately 12/7/2008 8:13 PM, came the following characters from > I have no problem with having strict validation available. But > doesn't validation take significantly longer than decoding? I think you're thinking of XML, where validation can take significan

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 11:04 PM, Glenn Linderman <[EMAIL PROTECTED]> wrote: > On approximately 12/7/2008 9:11 PM, came the following characters from the > keyboard of Adam Olsen: >> On Sun, Dec 7, 2008 at 9:45 PM, Glenn Linderman <[EMAIL PROTECTED]> >> wrote: > > Once upon a time I did write an unv

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Glenn Linderman
On approximately 12/7/2008 9:11 PM, came the following characters from the keyboard of Adam Olsen: On Sun, Dec 7, 2008 at 9:45 PM, Glenn Linderman <[EMAIL PROTECTED]> wrote: On approximately 12/7/2008 8:13 PM, came the following characters from the keyboard of Stephen J. Turnbull: Glenn Linderm

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 9:45 PM, Glenn Linderman <[EMAIL PROTECTED]> wrote: > On approximately 12/7/2008 8:13 PM, came the following characters from the > keyboard of Stephen J. Turnbull: >> >> Glenn Linderman writes: >> >> > But if you are interested in checking for security issues, shouldn't >> y

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Glenn Linderman
On approximately 12/7/2008 8:13 PM, came the following characters from the keyboard of Stephen J. Turnbull: Glenn Linderman writes: > But if you are interested in checking for security issues, shouldn't you > _first_ decode into some canonical form, Yes. That's all that is being asked fo

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Stephen J. Turnbull
Glenn Linderman writes: > But if you are interested in checking for security issues, shouldn't you > _first_ decode into some canonical form, Yes. That's all that is being asked for: that Python do strict decoding to a canonical form by default. That's a lot to ask, as it turns out, but th

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Glenn Linderman
On approximately 12/7/2008 10:56 AM, came the following characters from the keyboard of Adam Olsen: You might receive a UTF-8 encoded file name from a malicious user, check if it contains something dangerous (like "../../../../../etc/password"), then decode it. If your decoder isn't compliant
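
The check-then-decode hazard described in this message can be illustrated with a minimal sketch. It assumes a hypothetical helper `is_safe_name` (not something from the thread) and relies only on the standard strict UTF-8 decoder in current CPython 3, which rejects overlong byte sequences. The idea is to decode strictly first and inspect the decoded text afterwards, so a name that hides ".." behind an overlong encoding never reaches the check in disguised form.

```python
def is_safe_name(raw: bytes) -> bool:
    """Hypothetical helper: decode strictly, then check the decoded text."""
    try:
        # errors="strict" is the default; overlong forms such as
        # b"\xc0\xae" (a non-shortest encoding of ".") raise here.
        name = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Only after decoding do we look for parent-directory components.
    return ".." not in name.split("/")


plain = is_safe_name(b"report.txt")
traversal = is_safe_name(b"../../etc/passwd")
overlong = is_safe_name(b"\xc0\xae\xc0\xae/etc/passwd")
```

A lenient decoder that mapped `b"\xc0\xae"` to "." would let the third name pass a byte-level check and then turn into ".." after decoding, which is why the order of operations matters here.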

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Terry Reedy
Guido van Rossum wrote: On Sun, Dec 7, 2008 at 1:20 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: Toshio Kuratomi wrote: - If this is true, a definition of os.listdir() that would better meet programmer expectation would be: "Give me all files in a directory with the output as str type". The de

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Greg Ewing
Nick Coghlan wrote: For binary wrappers around the Windows Unicode APIs, I was thinking specifically of using UTF-8, since that should be able to encode anything the Unicode APIs can handle. Why shouldn't the binary interface just expose the raw utf16 as bytes? -- Greg ___

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Nick Coghlan
Terry Reedy wrote: > Toshio Kuratomi wrote: > >> - If this is true, a definition of os.listdir() that would >> better meet programmer expectation would be: "Give me all files in a >> directory with the output as str type". The definition of >> os.listdir() would be "Give me all files in a direc

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Guido van Rossum
On Sun, Dec 7, 2008 at 1:20 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > Toshio Kuratomi wrote: > >> - If this is true, a definition of os.listdir() that would >> better meet programmer expectation would be: "Give me all files in a >> directory with the output as str type". The definition of >> o

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Terry Reedy
Toshio Kuratomi wrote: - If this is true, a definition of os.listdir() that would better meet programmer expectation would be: "Give me all files in a directory with the output as str type". The definition of os.listdir() would be "Give me all files in a directory with the output as bytes typ

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 11:18 AM, Michael Urman <[EMAIL PROTECTED]> wrote: > On Sun, Dec 7, 2008 at 11:35, Adam Olsen <[EMAIL PROTECTED]> wrote: http://bugs.python.org/issue3672 http://bugs.python.org/issue3297 >> >> No. Unicode *requires* them to be treated as errors. If you want to >>

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Michael Urman
On Sun, Dec 7, 2008 at 11:35, Adam Olsen <[EMAIL PROTECTED]> wrote: >>> http://bugs.python.org/issue3672 >>> http://bugs.python.org/issue3297 > > No. Unicode *requires* them to be treated as errors. If you want to > pass them through then you're creating a custom encoding... which you > might arg

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Toshio Kuratomi
[EMAIL PROTECTED] wrote: > > On 06:07 am, [EMAIL PROTECTED] wrote: >> Most apps aren't file managers or ftp clients but when they interact >> with files (for instance, a file selection dialog) they need to be able >> to show the user all the relevant files. So on an app-by-app basis the >> need f

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 2:35 AM, Hagen Fürstenau <[EMAIL PROTECTED]> wrote: >>> As far as I can see all Python Unicode strings can be encoded to UTF-8, >>> even things like lone surrogates because Python doesn't care about them. >>> So both the Unicode API and the binary API would be fail-safe on Wi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Hagen Fürstenau
>> As far as I can see all Python Unicode strings can be encoded to UTF-8, >> even things like lone surrogates because Python doesn't care about them. >> So both the Unicode API and the binary API would be fail-safe on Windows. > > Python is broken and needs to be fixed. > > http://bugs.python.or

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Adam Olsen
On Sun, Dec 7, 2008 at 2:07 AM, Hagen Fürstenau <[EMAIL PROTECTED]> wrote: >> If the Unicode APIs only have correct unicode, sure. If not you'll >> get errors translating to UTF-8 (and the byte APIs are supposed to >> pass bad names through unaltered.) Kinda ironic, no? > > As far as I can see al

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-07 Thread Hagen Fürstenau
> If the Unicode APIs only have correct unicode, sure. If not you'll > get errors translating to UTF-8 (and the byte APIs are supposed to > pass bad names through unaltered.) Kinda ironic, no? As far as I can see all Python Unicode strings can be encoded to UTF-8, even things like lone surrogate
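
For readers trying the lone-surrogate case today, here is a small sketch. It shows the behaviour of current CPython 3, which is an assumption relative to this thread: the message above describes the interpreter of the time as encoding lone surrogates without complaint, whereas the strict codec now refuses them unless the `surrogatepass` error handler is requested explicitly.

```python
lone = "\ud800"  # a lone high surrogate, not a valid Unicode scalar value

try:
    lone.encode("utf-8")  # strict by default
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Opting in explicitly produces the three-byte form of the surrogate.
passed = lone.encode("utf-8", "surrogatepass")
round_trip = passed.decode("utf-8", "surrogatepass")
```

The explicit error handler keeps the pass-through behaviour available while leaving the default codec strict.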

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread glyph
On 06:07 am, [EMAIL PROTECTED] wrote: Guido van Rossum wrote: On Sat, Dec 6, 2008 at 10:53 AM, <[EMAIL PROTECTED]> wrote: I find it interesting to note that the only users in this discussion who actually have these problems in real life all have this attitude. For file managers and simi

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Toshio Kuratomi
Guido van Rossum wrote: > On Sat, Dec 6, 2008 at 10:53 AM, <[EMAIL PROTECTED]> wrote: >> I find it interesting to note that the only users in this discussion who >> actually have these problems in real life all have this attitude. It is >> expected that in an imperfect world we will have imperfe

Re: [Python-Dev] Python-3.0, unicode, and os.environ

2008-12-06 Thread Adam Olsen
On Sat, Dec 6, 2008 at 6:51 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: > André Malo wrote: >>> While on Windows: >>> - underlying OS API uses Unicode >>> - Unicode API just passes values straight through >>> - binary API uses the system encoding to decode bytes names and values >>> to be passed to
