Re: [Development] Why can't QString use UTF-8 internally?

Konstantin Ritt Wed, 11 Feb 2015 09:17:36 -0800

FYI: Unicode codepoint != character visual representation. Moreover, a
single character could be represented with a sequence of glyphs or vice
versa - a sequence of characters could be represented with a single glyph.
QString (and every other Unicode string class in the world) represents a
sequence of Unicode codepoints (in this or that UTF), not characters or
glyphs - always remember that!
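
To make the distinction concrete, here is a minimal sketch in plain C++
(std::u16string rather than QString, so it builds without Qt). The helper
names countCodePoints and countApproxCharacters are made up for this
example. The second one is only a rough approximation: it ignores marks in
the Combining Diacritical Marks block (U+0300..U+036F) and is not real
grapheme cluster segmentation. In Qt itself, QTextBoundaryFinder is
probably the right tool for that job.

```cpp
#include <cstddef>
#include <string>

// Number of Unicode code points in a UTF-16 string: a high surrogate
// followed by a low surrogate counts as a single code point.
std::size_t countCodePoints(const std::u16string &s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        const bool high = c >= 0xD800 && c <= 0xDBFF;
        if (high && i + 1 < s.size()
            && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF)
            ++i; // skip the low surrogate of the pair
        ++n;
    }
    return n;
}

// Rough approximation of "what the user sees" (assumption: only marks in
// U+0300..U+036F are treated as combining; real text needs full grapheme
// cluster rules).
std::size_t countApproxCharacters(const std::u16string &s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        if (c >= 0x0300 && c <= 0x036F)
            continue; // combining mark: belongs to the previous character
        const bool high = c >= 0xD800 && c <= 0xDBFF;
        if (high && i + 1 < s.size()
            && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF)
            ++i; // skip the low surrogate of the pair
        ++n;
    }
    return n;
}
```

With these helpers, precomposed u"\u00F1" gives 1 code unit, 1 code point
and 1 approximate character, while decomposed u"n\u0303" gives 2 code
units, 2 code points and still 1 approximate character. u"\U0001D11E"
(outside the BMP) gives 2 code units but 1 code point.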


Regards,
Konstantin

2015-02-11 20:49 GMT+04:00 Matthew Woehlke <[email protected]>:

> On 2015-02-11 11:29, Thiago Macieira wrote:
> > On Wednesday 11 February 2015 11:22:59 Julien Blanc wrote:
> >> On 11/02/2015 10:32, Bo Thorsen wrote:
> >>> 2) length() returns the number of chars I see on the screen, not a
> >>> random implementation detail of the chosen encoding.
> >>
> >> How’s that supposed to work with combining characters, which are part of
> >> Unicode?
> >
> > That's true. And add that there are some zero-width characters too and
> > some characters that are double-width.
>
> I'm not going to claim this is the *best* answer, but at least one that
> seems logical... length() should be the number of times one must hit
> backspace starting from the end of the text to erase the entire text.
> IOW, the number of logical glyphs. Double-width characters are one
> logical glyph. Combining characters are not independently logical glyphs
> (e.g. 'ñ' is one glyph, regardless of how it is encoded).
>
> Conversely, I'm sure there are times when you need to know the number of
> codepoints (e.g. allocating memory to make a copy). Possibly length()
> and size() should return different results. (Which is a mess, but...)
>
> --
> Matthew
>
> _______________________________________________
> Development mailing list
> [email protected]
> http://lists.qt-project.org/mailman/listinfo/development
>
