Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread David Hopwood
Martin v. Löwis wrote: > David Hopwood wrote: >>Michael Foord wrote: >>>David Hopwood wrote:[snip..] >>> >>we should, of course, continue to use the one we always used (for >>"ascii", there is no difference between the two). > >+1 > >This seems the most (only ?) logical so

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread M.-A. Lemburg
Armin Rigo wrote: > Hi, > > On Thu, Aug 03, 2006 at 07:53:11PM +0200, M.-A. Lemburg wrote: >>> I thought I'd heard (from Guido here or on the py3k list) that it was only >>> 1 < u'abc' that would raise an exception, and that 1 == u'abc' would still >>> evaluate to False. Did I misunderstand? >>

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
Armin Rigo wrote: > I also seem to remember that TypeErrors should only signal ordering > nonsense, not equality. In this case, I'm of the opinion that unicode > objects and completely-unrelated strings of random bytes should > successfully compare as unequal, but I'm not enough of a unicode us

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
David Hopwood wrote: > Michael Foord wrote: >> David Hopwood wrote:[snip..] >> > we should, of course, continue to use the one we always used (for > "ascii", there is no difference between the two). +1 This seems the most (only ?) logical solution. >>> No; always considerin

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Ron Adam
Michael Foord wrote: > David Hopwood wrote:[snip..] >> we should, of course, continue to use the one we always used (for "ascii", there is no difference between the two). >>> +1 >>> >>> This seems the most (only ?) logical solution. >>> >> No; always considering Unicod

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread David Hopwood
Michael Foord wrote: > David Hopwood wrote:[snip..] > we should, of course, continue to use the one we always used (for "ascii", there is no difference between the two). >>> >>> +1 >>> >>> This seems the most (only ?) logical solution. >> >> No; always considering Unicode and non-ASCII b

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Michael Foord
David Hopwood wrote:[snip..] > > >>> we should, of course, continue to use the one we always used (for >>> "ascii", there is no difference between the two). >>> >> +1 >> >> This seems the most (only ?) logical solution. >> > > No; always considering Unicode and non-ASCII byte strings

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Armin Rigo
Hi, On Thu, Aug 03, 2006 at 07:53:11PM +0200, M.-A. Lemburg wrote: > > I thought I'd heard (from Guido here or on the py3k list) that it was only > > 1 < u'abc' that would raise an exception, and that 1 == u'abc' would still > > evaluate to False. Did I misunderstand? > > Could be that I'm wron

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
David Hopwood wrote: > I disagree. Unicode strings should always be considered distinct from > non-ASCII byte strings. Implicitly encoding or decoding in order to > perform a comparison is a bad idea; it is expensive and will often do > the wrong thing. That's a pretty irrelevant position at thi

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread David Hopwood
Michael Foord wrote: > Martin v. Löwis wrote: > >>[snip..] >>Expanding this view to Unicode should mean that a unicode >>string U equals a byte string B if >>U.encode(system_encoding) == B or B.decode(system_encoding) == U, >>and that they don't equal otherwise (e.g. if the conversion >>fails with a

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Michael Foord
Martin v. Löwis wrote: > [snip..] > Expanding this view to Unicode should mean that a unicode > string U equals a byte string B if > U.encode(system_encoding) == B or B.decode(system_encoding) == U, > and that they don't equal otherwise (e.g. if the conversion > fails with a "not convertible" excepti
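
The rule quoted above can be sketched in Python 3 terms, with `str` standing in for the unicode string U and `bytes` for the byte string B. This is only an illustration under stated assumptions: the helper name `proposed_equal` is hypothetical, the default of "ascii" for the system encoding is an assumption, and nothing here is taken from an actual interpreter implementation.

```python
def proposed_equal(u: str, b: bytes, system_encoding: str = "ascii") -> bool:
    """Hypothetical helper: equality under the rule quoted above."""
    try:
        # U equals B if either conversion makes the two sides match.
        return u.encode(system_encoding) == b or b.decode(system_encoding) == u
    except UnicodeError:
        # A "not convertible" failure is treated as "not equal", not as an error.
        return False


ascii_case = proposed_equal("abc", b"abc")
non_ascii_case = proposed_equal("m\xe1s", b"m\xe1s")
latin1_case = proposed_equal("m\xe1s", b"m\xe1s", system_encoding="latin-1")
print(ascii_case, non_ascii_case, latin1_case)
```

Under this sketch the result for non-ASCII data depends entirely on which encoding is assumed, which is the point several replies in the thread argue about.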

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
M.-A. Lemburg wrote: >> There's no disputing that an exception should be raised >> if the string *must* be interpretable as characters in >> order to continue. But that's not true here if you allow >> for the interpretation that they're simply objects of >> different (duck) type and therefore une

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Ralf Schmitt
Christopher Armstrong wrote: > On 8/4/06, *Ralf Schmitt* wrote: > > > Maybe this is all just a matter of choosing the right > defaultencoding ? :) > > > > Doing this is amazingly stupid. I can't believe how often I hear this > suggesti

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Christopher Armstrong
On 8/4/06, Ralf Schmitt wrote: Jean-Paul Calderone wrote: >> I like the exception that 2.5 raises. I only wish it raised by default > when using 'ascii' and u'ascii' as keys in the same dictionary. ;) Oh, > and that str and unicode did not hash like they do. ;) No problem: >>> i

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Bob Ippolito
On Aug 3, 2006, at 9:34 PM, Josiah Carlson wrote: > > Bob Ippolito wrote: >> On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote: >> >>> M.-A. Lemburg wrote: >>> Perhaps we ought to add an exception to the dict lookup mechanism and continue to silence UnicodeErrors ?! >>> >

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread M.-A. Lemburg
Ralf Schmitt wrote: > M.-A. Lemburg wrote: >> Ralf Schmitt wrote: >>> Does python 2.4 catch any exception when comparing keys (which are not >>> basestrings) in dictionaries? >> Yes. It does so for all equality compares that need to be done >> as part of the hash collision algorithm (not only w/r

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread M.-A. Lemburg
Greg Ewing wrote: > M.-A. Lemburg wrote: > >> If a string >> is not ASCII and thus causes the exception, there's not a lot you >> can say, since you don't know the encoding of the string. > > That's one way of looking at it. > > Another is that any string containing chars > 127 is not > text at

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Michael Hudson
M.-A. Lemburg writes: > The point here is that a typical user won't expect any comparisons > to be made when dealing with dictionaries, simply because the fact > that you do need to make comparisons is an implementation detail. Of course looking things up in a dictionary inv

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Ralf Schmitt
M.-A. Lemburg wrote: > Ralf Schmitt wrote: >> Does python 2.4 catch any exception when comparing keys (which are not >> basestrings) in dictionaries? > > Yes. It does so for all equality compares that need to be done > as part of the hash collision algorithm (not only w/r to strings > and Unicode

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Greg Ewing
M.-A. Lemburg wrote: > If a string > is not ASCII and thus causes the exception, there's not a lot you > can say, since you don't know the encoding of the string. That's one way of looking at it. Another is that any string containing chars > 127 is not text at all, but binary data, in which case

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread M.-A. Lemburg
Ralf Schmitt wrote: > Does python 2.4 catch any exception when comparing keys (which are not > basestrings) in dictionaries? Yes. It does so for all equality compares that need to be done as part of the hash collision algorithm (not only w/r to strings and Unicode, but in general). This was chan
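
The role of equality comparisons in dictionary lookup can be observed directly. The following is a minimal sketch for a current Python 3 interpreter (an assumption; the thread itself concerns 2.4 and 2.5), using a hypothetical key class `CollidingKey` whose instances all share one hash value and whose `__eq__` always raises. It does not reproduce the 2.4 behaviour described in the quoted answer.

```python
class CollidingKey:
    """Hypothetical key type: every instance hashes alike, comparison fails."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # force a hash collision between any two instances

    def __eq__(self, other):
        raise ValueError("comparison failed")


d = {CollidingKey("a"): 1}  # first insert: empty table, no comparison needed
try:
    d[CollidingKey("b")] = 2  # same hash, so the dict must call __eq__
    outcome = "inserted"
except ValueError:
    outcome = "raised"
print(outcome, len(d))
```

On Python 3 the exception raised inside `__eq__` reaches the caller instead of being silenced, so the second key is never stored.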

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Ralf Schmitt
Jean-Paul Calderone wrote: > > I like the exception that 2.5 raises. I only wish it raised by default > when using 'ascii' and u'ascii' as keys in the same dictionary. ;) Oh, > and that str and unicode did not hash like they do. ;) No problem: >>> import sys >>> reload(sys) >>> sys.setdef

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread Giovanni Bajo
M.-A. Lemburg wrote: >> At the very least, in the context of a dictionary >> lookup. > > The point here is that a typical user won't expect any comparisons > to be made when dealing with dictionaries, simply because the fact > that you do need to make comparisons is an implemen

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-04 Thread M.-A. Lemburg
Delaney, Timothy (Tim) wrote: > M.-A. Lemburg wrote: > >> Perhaps we ought to add an exception to the dict lookup mechanism >> and continue to silence UnicodeErrors ?! > > I'd definitely consider a UnicodeError to be an indication that two > objects are not equal. Not really: Python expects all

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Michael Urman
On 8/3/06, Josiah Carlson wrote: > As an alternate idea, rather than attempting to .decode('ascii') when > strings and unicode compare, why not .decode('latin-1')? We lose the > unicode decoding error, but "the right thing" happens (in my opinion) > when u'\xa1' and '\xa1' comp
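
What a latin-1 decode would do in such a comparison can be checked in Python 3, with `bytes` standing in for the Python 2 byte string. This is a sketch only; the quoted suggestion is cut off above, so the example shows the mechanics of the codec and makes no claim about what the author considered "the right thing".

```python
# latin-1 maps every byte 0..255 to the code point with the same number,
# so decoding can never fail.
all_bytes = bytes(range(256))
never_fails = all_bytes.decode("latin-1") == "".join(map(chr, range(256)))

# The pair from the quoted message: byte 0xa1 versus code point U+00A1.
latin1_match = b"\xa1".decode("latin-1") == "\xa1"

# Bytes that actually hold UTF-8 text are silently misread instead of
# triggering an error: the two-byte sequence for U+00E1 becomes two characters.
utf8_bytes = "m\xe1s".encode("utf-8")
utf8_match = utf8_bytes.decode("latin-1") == "m\xe1s"
print(never_fails, latin1_match, utf8_match)
```

The trade-off is visible in the last line: the decoding error disappears, but byte strings in any other encoding compare as unequal without any signal that something was misinterpreted.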

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Jean-Paul Calderone
On Thu, 03 Aug 2006 21:34:04 -0700, Josiah Carlson wrote: > >Bob Ippolito wrote: >> On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote: >> >> > M.-A. Lemburg wrote: >> > >> >> Perhaps we ought to add an exception to the dict lookup mechanism >> >> and continue to s

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread James Y Knight
On Aug 4, 2006, at 12:34 AM, Josiah Carlson wrote: > As an alternate idea, rather than attempting to .decode('ascii') when > strings and unicode compare, why not .decode('latin-1')? We lose the > unicode decoding error, but "the right thing" happens (in my opinion) > when u'\xa1' and '\xa1' compa

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Josiah Carlson
Bob Ippolito wrote: > On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote: > > > M.-A. Lemburg wrote: > > > >> Perhaps we ought to add an exception to the dict lookup mechanism > >> and continue to silence UnicodeErrors ?! > > > > Seems to me that comparison of unicode and non-unicod

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Bob Ippolito
On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote: > M.-A. Lemburg wrote: > >> Perhaps we ought to add an exception to the dict lookup mechanism >> and continue to silence UnicodeErrors ?! > > Seems to me that comparison of unicode and non-unicode > strings for equality shouldn't raise exceptions in t

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread James Y Knight
On Aug 3, 2006, at 5:47 PM, M.-A. Lemburg wrote: >> The only way this error could be the right thing is if you were >> trying >> to suggest that he shouldn't mix unicode and bytestrings at all. > > Good question. I wonder whether that's a reasonable approach for > Python 2.x (I'd say it is for Py

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Greg Ewing
M.-A. Lemburg wrote: > Perhaps we ought to add an exception to the dict lookup mechanism > and continue to silence UnicodeErrors ?! Seems to me that comparison of unicode and non-unicode strings for equality shouldn't raise exceptions in the first place. -- Greg

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Michael Urman
On 8/3/06, M.-A. Lemburg wrote: > > ...but in the case of dictionaries this behaviour has changed and in > > prior versions of python dictionaries did work as I expected them to. > > Now they don't. > > Let's put it this way: Python 2.5 uncovered a bug in your > application that

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Delaney, Timothy (Tim)
M.-A. Lemburg wrote: > Perhaps we ought to add an exception to the dict lookup mechanism > and continue to silence UnicodeErrors ?! I'd definitely consider a UnicodeError to be an indication that two objects are not equal. At the very least, in the context of a dictionary lookup. Tim Delaney

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread M.-A. Lemburg
Jim Jewett wrote: > http://mail.python.org/pipermail/python-dev/2006-August/067934.html > M.-A. Lemburg mal at egenix.com > >> Ralf Schmitt wrote: >>> Still trying to port our software. here's another thing I noticed: > >>> d = {} >>> d[u'm\xe1s'] = 1 >>> d['m\xe1s'] = 1 >>> print d > > (a 2-ele

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread M.-A. Lemburg
John J Lee wrote: > On Thu, 3 Aug 2006, M.-A. Lemburg wrote: > [...] >> It's actually a good preparation for Py3k where 1 == u'abc' will >> (likely) also raise an exception. > > I thought I'd heard (from Guido here or on the py3k list) that it was only > 1 < u'abc' that would raise an exception, a

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread John J Lee
On Thu, 3 Aug 2006, M.-A. Lemburg wrote: [...] > It's actually a good preparation for Py3k where 1 == u'abc' will > (likely) also raise an exception. I thought I'd heard (from Guido here or on the py3k list) that it was only 1 < u'abc' that would raise an exception, and that 1 == u'abc' would stil
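
For reference, the distinction between equality and ordering raised here can be checked on a Python 3 interpreter. The sketch below assumes Python 3 as eventually released and uses a plain `'abc'` literal, since the `u''` prefix is redundant there; it says nothing about the state of the py3k plans at the time of this message.

```python
# Equality between unrelated types quietly evaluates to False.
equal = (1 == 'abc')

# Ordering between unrelated types is rejected.
try:
    1 < 'abc'
    ordering = "allowed"
except TypeError:
    ordering = "TypeError"

print(equal, ordering)
```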

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread M.-A. Lemburg
Ralf Schmitt wrote: Still trying to port our software. here's another thing I noticed: d = {} d[u'm\xe1s'] = 1 d['m\xe1s'] = 1 print d With python 2.5 I get: $ python2.5 t2.py Traceback (most recent call last): File "t2.py", line 3, in

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Ralf Schmitt
M.-A. Lemburg wrote: > Ralf Schmitt wrote: >> Ralf Schmitt wrote: >>> Still trying to port our software. here's another thing I noticed: >>> >>> d = {} >>> d[u'm\xe1s'] = 1 >>> d['m\xe1s'] = 1 >>> print d >>> >>> With python 2.4 I can add those two keys to the dictionary and get: >>> $ python2.4 t2

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Bob Ippolito
On Aug 3, 2006, at 9:51 AM, M.-A. Lemburg wrote: > Ralf Schmitt wrote: >> Ralf Schmitt wrote: >>> Still trying to port our software. here's another thing I noticed: >>> >>> d = {} >>> d[u'm\xe1s'] = 1 >>> d['m\xe1s'] = 1 >>> print d >>> >>> With python 2.4 I can add those two keys to the dictiona

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread M.-A. Lemburg
Ralf Schmitt wrote: > Ralf Schmitt wrote: >> Still trying to port our software. here's another thing I noticed: >> >> d = {} >> d[u'm\xe1s'] = 1 >> d['m\xe1s'] = 1 >> print d >> >> With python 2.4 I can add those two keys to the dictionary and get: >> $ python2.4 t2.py >> {u'm\xe1s': 1, 'm\xe1s': 1

Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-03 Thread Ralf Schmitt
Ralf Schmitt wrote: > Still trying to port our software. here's another thing I noticed: > > d = {} > d[u'm\xe1s'] = 1 > d['m\xe1s'] = 1 > print d > > With python 2.4 I can add those two keys to the dictionary and get: > $ python2.4 t2.py > {u'm\xe1s': 1, 'm\xe1s': 1} > > With python 2.5 I get:
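
For readers on Python 3, where the two types discussed in this thread became `str` and `bytes`, the two-key example can be approximated as follows. This is a minimal sketch and not the original t2.py: the `b''` literal stands in for the Python 2 byte string, and the interpreter is assumed to run without the `-b`/`-bb` warning flags.

```python
# Python 3 analogue of the two-key example (assumption: str and bytes
# stand in for Python 2's unicode and str).
d = {}
d['m\xe1s'] = 1    # text key, made of code points
d[b'm\xe1s'] = 1   # bytes key, made of raw bytes

# str and bytes never compare equal in Python 3, and no implicit
# decoding is attempted, so both keys coexist.
two_keys = len(d)
text_equals_bytes = ('m\xe1s' == b'm\xe1s')
print(two_keys, text_equals_bytes)
```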