>i understand that django's architecture should use unicode because it's
>the better way, but from the "outside"... what functionality is not
>working fine with non-english characters?
Loads of things don't work - basically anything that has a notion of a character but gets fed a bytestring. On the surface these are things like .upper() and .lower() not working for languages that have upper and lower case but use non-ASCII letters - like German, where the plain ASCII chars are uppercased, but the umlauts 'äöü' aren't.

Less visible things break too, like getting the correct length of a string: you get the count of bytes, not the count of chars. (Just switching to unicode doesn't fully fix this without normalizing the unicode strings, and even with normalization there are edge cases, because in Unicode it's not one char == one codepoint. But with utf-8 bytestrings it's much worse.) The same goes for using regexps with unicode char classes, pulling out chars at given indexes, or replacing single chars - you have to turn the string into unicode for any of those to work.

Another problem: utf-8 errors are thrown in a rather "lazy" fashion. They aren't thrown where the error actually occurs, like when reading the data, but only later, when some unicode conversion happens. For example, bad data in your database would only produce problems when you try to use the feed generator, as that one seems to rely on unicode, at least partially.

And of course most standard Python libs just won't use their unicode capabilities, because they never see unicode, only bytestrings. I already mentioned the regexp lib and the string methods, but there are many more things that can make good use of unicode, like the whole XML stuff - currently it's up to you to turn the unicode strings returned from those into utf-8 bytestrings, because otherwise the Django core gets upset. By switching to full unicode internally, we will work much better with the standard lib, because the standard lib already prefers a full-unicode environment.
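To make a few of these concrete, here is a minimal sketch. It assumes Python 3, where bytestrings are `bytes` and unicode strings are `str` (an assumption for the example only, the type names differ in older Pythons). The word 'grün' and all variable names are made up for illustration.

```python
import re
import unicodedata

word = "gr\u00fcn"            # unicode string 'grün' (German for "green")
raw = word.encode("utf-8")    # the same text as a utf-8 bytestring

# Case conversion: the bytestring method only knows ASCII letters.
upper_from_bytes = raw.upper().decode("utf-8")   # 'GRüN', umlaut untouched
upper_from_text = word.upper()                   # 'GRÜN'

# Length: bytes are counted, not chars ('ü' takes two bytes in utf-8).
byte_len = len(raw)     # 5
char_len = len(word)    # 4

# Even with unicode, length depends on normalization: the decomposed form
# stores 'ü' as 'u' plus a combining diaeresis, i.e. two codepoints.
decomposed_len = len(unicodedata.normalize("NFD", word))   # 5
composed_len = len(unicodedata.normalize("NFC", word))     # 4

# Regexps: \w on a bytestring is ASCII-only and splits the word in two.
words_from_bytes = re.findall(rb"\w+", raw)    # [b'gr', b'n']
words_from_text = re.findall(r"\w+", word)     # ['grün']

# Indexing: a slice of the bytestring can cut a char in half.
half_char = raw[2:3]    # b'\xc3', only the first byte of 'ü'
whole_char = word[2]    # 'ü'

# "Lazy" errors: invalid utf-8 passes through bytestring code unnoticed
# and only blows up once something finally decodes it.
bad = b"gr\xfcn"                 # latin-1 encoded, not valid utf-8
still_fine = bad.upper()         # no error here
try:
    bad.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True         # the error shows up only at this point
```

In every pair above the unicode variant gives the answer you'd expect, and the bytestring variant quietly gives a wrong one instead of failing, which is what makes these bugs hard to spot.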
bye, Georg