Nope, just TWO code points, e.g. 🇺 (U+1F1FA: REGIONAL INDICATOR SYMBOL LETTER U) followed by 🇸 (U+1F1F8: REGIONAL INDICATOR SYMBOL LETTER S) for the US flag.
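
For anyone who wants to check the numbers, here is a rough sketch in plain standard C++ (no Qt) that encodes a code point sequence as UTF-8 and counts code points by skipping continuation bytes. The helper names (encodeUtf8, countCodePoints, checkFlagExamples) are made up for illustration, the input is assumed to be valid Unicode, and the 7 code point example assumes a subdivision flag such as England's (WAVING BLACK FLAG plus tag characters), which may not be the exact flag meant in the quoted message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encode one code point as UTF-8.
// Assumption: cp is a valid scalar value (<= U+10FFFF, not a surrogate).
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Encode a whole sequence of code points as UTF-8.
std::string encodeUtf8(const std::u32string &codePoints) {
    std::string out;
    for (char32_t cp : codePoints)
        out += encodeUtf8(cp);
    return out;
}

// Count code points in well-formed UTF-8: every byte that is not a
// continuation byte (10xxxxxx) starts a new code point.
std::size_t countCodePoints(const std::string &utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Illustrative check of the two flags discussed in this thread.
void checkFlagExamples() {
    // US flag: REGIONAL INDICATOR SYMBOL LETTER U + LETTER S.
    const std::string usFlag = encodeUtf8(U"\U0001F1FA\U0001F1F8");
    assert(countCodePoints(usFlag) == 2);
    assert(usFlag.size() == 8);  // two code points, 4 bytes each

    // Assumed example of a tag-sequence flag (England):
    // U+1F3F4, tag letters "gbeng", then U+E007F CANCEL TAG.
    const std::string englandFlag = encodeUtf8(
        U"\U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F");
    assert(countCodePoints(englandFlag) == 7);
    assert(englandFlag.size() == 28);  // seven code points, 4 bytes each
}
```

Either way, iterating bytewise or even code point by code point does not give you one user-visible flag per step; that needs grapheme segmentation at a higher level.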

-----Original Message-----
From: Development <development-boun...@qt-project.org> On Behalf Of Giuseppe D'Angelo via Development
Sent: 11 June 2024 20:09
To: development@qt-project.org
Subject: Re: [Development] Are char literals L1 or U8 in Qt?

On 11/06/24 11:36, David C. Partridge wrote:
> Anyone iterating bytewise over a char[] in UTF-8 has also got serious
> bugs given that a UTF-8 "graphic character" can be up to 8 bytes
> (national flags comprise two UTF-8 code points).

There's no such thing as a UTF-8 "graphic character". Grapheme sequences
are treated at a higher level anyhow in Qt, and we have APIs for that
(QTextBoundaryFinder, etc.).

And it's not 2. 🏴 is 7 code points.

My 2 c,
--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - Trusted Software Excellence
--
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development