A version of this came up not long ago in a slightly different context (bug
17369: parse() doesn't honor unicode in NFD normalization).
The basic issue is that there are different Unicode normalization forms (look
them up...).
Briefly, accented characters exist in two forms, one as a single code poin
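
To make the two forms concrete, here is a minimal sketch in Python (not the R used in the thread, and not taken from any of the messages). It assumes the word "Jäger" from the question as sample data and uses only the standard-library unicodedata module. The variable names are made up for illustration.

```python
import unicodedata

# Hypothetical sample data: the same visible word in two forms.
composed = "J\u00e4ger"      # "ä" as the single code point U+00E4
decomposed = "Ja\u0308ger"   # "a" followed by U+0308 COMBINING DIAERESIS

# The strings render the same but differ code point by code point.
print(composed == decomposed)
print([hex(ord(c)) for c in composed])
print([hex(ord(c)) for c in decomposed])

# NFC composes to the single-code-point form; NFD decomposes.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)
print(nfc == composed, nfd == decomposed)
```

Either form works as a common target, as long as both strings are converted to the same one before comparing.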
On 18/11/2019 10:11 a.m., Björn Fisseler wrote:
Hello,
I'm struggling to compare two strings that come from different data
sets. The strings look identical: "Alexander Jäger"
But when I compare these strings: string1 == string2
the result is FALSE.
Looking at the raw bytes used to encode the
Thank you! That solved my problem!
Best
Björn
On 18.11.19 at 16:34, Ivan Krylov wrote:
> On Mon, 18 Nov 2019 16:11:44 +0100
> "Björn Fisseler" wrote:
>
>> It's obviously the umlaut "ä" in this example which is encoded with
>> two and three bytes, respectively. The question is how to change
On Mon, 18 Nov 2019 16:11:44 +0100
"Björn Fisseler" wrote:
> It's obviously the umlaut "ä" in this example which is encoded with
> two and three bytes, respectively. The question is how to change this?
Welcome to the wonderful world of Unicode-related problems! It is,
indeed, possible to represent th
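
The two-versus-three-byte observation, and one way to compare such strings, can be sketched as follows. This is a Python illustration under stated assumptions, not the fix posted in the thread (whose text is cut off above). The helper same_text and the variables string1 and string2 are hypothetical, and which data set held which form is not known from the excerpt.

```python
import unicodedata

def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after bringing both to one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Assumed inputs: one precomposed, one decomposed spelling of the name.
string1 = "Alexander J\u00e4ger"
string2 = "Alexander Ja\u0308ger"

# UTF-8 byte counts for the umlaut alone in each spelling.
bytes_composed = len("\u00e4".encode("utf-8"))     # U+00E4 -> C3 A4
bytes_decomposed = len("a\u0308".encode("utf-8"))  # 61 + CC 88

print(bytes_composed, bytes_decomposed)
print(string1 == string2)
print(same_text(string1, string2))
```

Normalizing both inputs inside the comparison, rather than only the one suspected of being decomposed, avoids having to know which data set uses which form.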