That's not the current encoding scheme for *national* flags. The flag for Wales is done using "regional flag encoding", based on modifiers to "Waving black flag" (U+1F3F4), which I agree can run to seven code points. In any case, the critical point we agree on is the > 1 byte issue...
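For what it's worth, here is a rough sketch of the arithmetic for the seven-code-point Wales sequence (U+1F3F4 followed by tag characters and CANCEL TAG). It assumes a C++14 or later compiler. The names walesFlag, utf8Length, utf16Length, totalUtf8 and totalUtf16 are made up for illustration and are not Qt API.

```cpp
#include <array>
#include <cstddef>

// Code points of the Wales flag sequence: a base flag, five tag
// characters spelling "gbwls", and a terminating CANCEL TAG.
constexpr std::array<char32_t, 7> walesFlag = {
    0x1F3F4, // WAVING BLACK FLAG
    0xE0067, // TAG LATIN SMALL LETTER G
    0xE0062, // TAG LATIN SMALL LETTER B
    0xE0077, // TAG LATIN SMALL LETTER W
    0xE006C, // TAG LATIN SMALL LETTER L
    0xE0073, // TAG LATIN SMALL LETTER S
    0xE007F, // CANCEL TAG
};

// Number of UTF-8 code units (bytes) needed to encode one code point.
constexpr std::size_t utf8Length(char32_t cp)
{
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Number of UTF-16 code units needed to encode one code point
// (anything above U+FFFF needs a surrogate pair).
constexpr std::size_t utf16Length(char32_t cp)
{
    return cp < 0x10000 ? 1 : 2;
}

// Total UTF-8 bytes for a sequence of code points.
template <std::size_t N>
constexpr std::size_t totalUtf8(const std::array<char32_t, N> &cps)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < N; ++i)
        total += utf8Length(cps[i]);
    return total;
}

// Total UTF-16 code units for a sequence of code points.
template <std::size_t N>
constexpr std::size_t totalUtf16(const std::array<char32_t, N> &cps)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < N; ++i)
        total += utf16Length(cps[i]);
    return total;
}
```

With these helpers, totalUtf8(walesFlag) comes to 28 and totalUtf16(walesFlag) to 14, for what a user sees as a single flag. None of those counts fits in one char, which is the > 1 byte issue again.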
D.

-----Original Message-----
From: Giuseppe D'Angelo <giuseppe.dang...@kdab.com>
Sent: 12 June 2024 10:02
To: Edward Welbourne <edward.welbou...@qt.io>; David C. Partridge <david.partri...@perdrix.co.uk>; development@qt-project.org
Subject: Re: [Development] Are char literals L1 or U8 in Qt?

On 12/06/2024 10:51, Edward Welbourne wrote:
> I'll trust Peppe's count is thus of bytes in UTF-8.

No, it's 7 code *points*. Regional flags have a complicated encoding scheme. Wales' flag is encoded as:

U+1F3F4 WAVING BLACK FLAG
U+E0067 TAG LATIN SMALL LETTER G
U+E0062 TAG LATIN SMALL LETTER B
U+E0077 TAG LATIN SMALL LETTER W
U+E006C TAG LATIN SMALL LETTER L
U+E0073 TAG LATIN SMALL LETTER S
U+E007F CANCEL TAG

Each one requires 4 UTF-8 code units, that is, a total of 28 bytes.

My point was that Unicode is incredibly complicated, and one should just use higher-level facilities that know how to do this.

My 2 c,
--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - Trusted Software Excellence

--
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development