STINNER Victor <victor.stin...@haypocalc.com> added the comment:

> Victor, This looks like your cup of tea.

Unicode is my cup of tea, but not programs that treat bytes as characters. <a byte string>.isalpha() doesn't mean anything to me :-)

This issue is more a question about the C library than about Python :-) So try the attached program "isalpha.c" if you would like to test your libc.

Results on my Linux box (Debian Sid, eglibc 2.11.2):
----------------
$ ./isalpha C
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz (52)
$ ./isalpha fr_FR.UTF-8
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz (52)
$ ./isalpha fr_FR.iso88591
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff (117)
$ ./isalpha fr_fr.iso885...@euro
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xa6\xa8\xaa\xb4\xb5\xb8\xba\xbc\xbd\xbe\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff (124)
----------------

If your libc considers \xff to be a valid UTF-8 character, you should change your OS for a better one :-)

--

> >>> len(letters)
> 117
> ...
> >>> locale.setlocale(locale.LC_CTYPE)
> 'en_US.UTF-8'

It looks like Mac OS X uses ISO-8859-1 instead of UTF-8.

--

string.letters is built using strop.lowercase + strop.uppercase, which are built using the C functions islower() and isupper(). locale.setlocale() regenerates strop/string.lowercase, strop/string.uppercase and string.letters for the LC_CTYPE and LC_ALL categories.

--

You don't need to run IDLE or import Tkinter to set the locale: import locale; locale.setlocale(locale.LC_ALL, '') is enough.

--

A library should not change the locale (only the application should).

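If a library really does need a particular locale for a moment, one way to respect that rule is to save the current setting and put it back afterwards. The following is only a minimal sketch of that idea, not something Tkinter, pygtk or PyQt4 provide: temporary_locale is a hypothetical helper name, and the "C" locale is used only because it exists on every system.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Switch `category` to `name`, then restore the previous setting.

    Hypothetical helper for illustration only; it is not part of any
    library discussed in this thread.
    """
    # Calling setlocale() without a second argument only queries the setting.
    saved = locale.setlocale(category)
    # Raises locale.Error if `name` is unknown; nothing has changed in that case.
    locale.setlocale(category, name)
    try:
        yield
    finally:
        # Put back whatever the application had chosen.
        locale.setlocale(category, saved)


if __name__ == "__main__":
    before = locale.setlocale(locale.LC_CTYPE)
    with temporary_locale(locale.LC_CTYPE, "C"):
        print("inside:", locale.setlocale(locale.LC_CTYPE))
    print("restored:", locale.setlocale(locale.LC_CTYPE) == before)
```

The locale is process-wide state and setlocale() is generally not thread-safe, so even this pattern is only reasonable where no other thread depends on the locale while the block runs.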
$ python2.6
>>> import locale
>>> locale.getlocale()
(None, None)
>>> import Tkinter
>>> locale.getlocale()
('fr_FR', 'UTF8')

=> Tkinter is a horrible library! (The bug is in the C library, not in the Python wrapper.) Use a better one like Gtk or Qt ;-)

$ python
>>> import locale
>>> import pygtk
>>> locale.getlocale()
(None, None)
>>> import PyQt4
>>> locale.getlocale()
(None, None)

(IDLE is based on Tkinter)

--

I don't understand why Alexander gets different results on Python 2.6 and Python 2.7. @belopolsky: Are both programs linked to (built with?) the same C library (same library version)?

----------
Added file: http://bugs.python.org/file18202/isalpha.c

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9335>
_______________________________________