Chris Angelico <[email protected]>:
> On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa <[email protected]> wrote:
>> Chris Angelico <[email protected]>:
>> Then, why bother with Unicode to begin with? Why not just use bytes?
>> After all, Python3's strings have the very same pitfalls:
>>
>> - you don't know the length of a text in characters
>> - chr(n) doesn't return a character
>> - you can't easily find the 7th character in a piece of text
>
> First you have to define "character".
I'm referring to grapheme clusters, a.k.a. real characters.
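
To make the first three bullets concrete, here is a minimal sketch. Note the assumption: the standard library has no grapheme cluster segmentation, so approx_clusters() below is a hypothetical helper that merely glues combining marks onto the preceding code point. It is a rough approximation, not the full Unicode segmentation algorithm.

```python
import unicodedata

s = "cafe\u0301"  # "café" spelled as e + COMBINING ACUTE ACCENT

code_points = len(s)  # 5, although a reader sees 4 characters

# chr(n) can return a combining mark, which is not a character on its own.
mark_class = unicodedata.combining(chr(0x301))  # 230 (nonzero)


def approx_clusters(text):
    """Hypothetical helper: approximate grapheme clusters by attaching
    each combining mark to the code point before it. Not full UAX #29."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


clusters = approx_clusters(s)
fourth = clusters[3]  # "e\u0301", whereas s[3] is just "e"
```

Even this much takes a helper function, and it still mishandles things like emoji sequences and Hangul jamo.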
>> - you can't compare the equality of two pieces of text
>> - you can't use a piece of text as a reliable dict key
>
> (Dict key usage is defined in terms of equality, so these two are the
> same concern.)
Ideally, yes. However, someone might say, "don't use == to compare
equality; use unicode.textually_equal() instead". That advice might
satisfy the first requirement but not the second.
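
A small sketch of the difference, assuming such a helper would compare NFC-normalized forms. The function textually_equal() here is hypothetical, standing in for the imagined unicode.textually_equal().

```python
import unicodedata


def textually_equal(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "\u00e9"     # é as a single code point
decomposed = "e\u0301"  # e followed by COMBINING ACUTE ACCENT

raw_equal = composed == decomposed                    # False
helper_equal = textually_equal(composed, decomposed)  # True

# The dict still hashes and compares the raw code points,
# so the helper does nothing for key lookup.
table = {composed: "found"}
lookup = table.get(decomposed)                        # None
```

The helper fixes the comparison, but the dict never calls it.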
> Yes, you can. For most purposes, textual equality should be defined in
> terms of NFC or NFD normalization. Python already gives you that. You
> could argue that a string should always be stored NFC (or NFD, take
> your pick), and then the equality operator would handle this; but I'm
> not sure the benefit is worth it.
As I said, Python3's strings are neither here nor there. They don't
quite solve the problem Python2's strings had. They will push the
internationalization problems a bit farther out but fall short of the
mark.
The developer still has to worry a lot. Unicode seemingly solved one
problem only to present the developer with a bagful of new problems.
And if Python3's strings are a half-measure, why not stick to bytes?
> If you're trying to use strings as identifiers in any way (say, file
> names, or document lookup references), using the NFC/NFD normalized
> form of the string should be sufficient.
Show me ten Python3 database applications, and I'll show you ten Python3
database applications that don't normalize their primary keys.
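
For reference, this is roughly what normalizing keys would involve, sketched with an in-memory mapping rather than a real database. NFCKeyedDict is a hypothetical class, and the choice of NFC over NFD is an assumption; either works as long as it is applied consistently on both store and lookup.

```python
import unicodedata
from collections.abc import MutableMapping


class NFCKeyedDict(MutableMapping):
    """Hypothetical mapping that NFC-normalizes every string key."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _norm(key):
        return unicodedata.normalize("NFC", key)

    def __getitem__(self, key):
        return self._data[self._norm(key)]

    def __setitem__(self, key, value):
        self._data[self._norm(key)] = value

    def __delitem__(self, key):
        del self._data[self._norm(key)]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


users = NFCKeyedDict()
users["Ren\u00e9"] = 1   # precomposed é
users["Rene\u0301"] = 2  # e + combining accent: same key after NFC
```

Both spellings land on one entry here, but only because every access goes through _norm(); any code path that bypasses it reintroduces the problem.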
Marko