Re: [Development] Are char literals L1 or U8 in Qt?

Giuseppe D'Angelo via Development Tue, 11 Jun 2024 12:09:21 -0700

On 11/06/24 07:12, Thiago Macieira wrote:

> I'm arguing that such code is likely already broken (producing mojibake) for
> non-US-ASCII content, so having U+FFFD instead of mojibake is not worse. You
> wouldn't be able to work around the issue by un-doing the improper encoding,
> which means it would force users to fix their code.

Is it? I somehow suspect that there's a lot of code out there that does stuff like:


  string.indexOf('\xfc')   // search for ü

or similar.

(Usual disclaimer: not every developer is aware of encodings. Maybe they tried 'ü', and got a mysterious warning from the compiler, and the code didn't work; so they just put '\xfc' instead, and now it works -- ok, let's carry on.)

I'm not claiming that the situation is ideal, as we're clearly being inconsistent: `char` is treated as UTF-8 or Latin-1 depending on the context.
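To make the two readings concrete, here is a minimal sketch in plain C++ of what the single byte '\xfc' becomes under each interpretation. The helper names (decodeAsLatin1, decodeAsUtf8, kReplacement) are hypothetical and are not Qt API; the sketch only assumes that a lone non-ASCII byte is never a complete UTF-8 sequence and would be reported as U+FFFD.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; none of this is Qt API.
constexpr char32_t kReplacement = 0xFFFD; // U+FFFD REPLACEMENT CHARACTER

// Latin-1 reading: each byte value is the code point with the same number.
constexpr char32_t decodeAsLatin1(char c)
{
    return static_cast<unsigned char>(c);
}

// UTF-8 reading of one isolated byte: only 0x00..0x7F is a complete
// sequence; any other byte is a lead or continuation byte on its own.
constexpr char32_t decodeAsUtf8(char c)
{
    const auto b = static_cast<unsigned char>(c);
    return b < 0x80 ? char32_t(b) : kReplacement;
}

// Small self-check that exercises both readings.
inline void checkReadings()
{
    assert(decodeAsLatin1('\xfc') == 0x00FC);          // ü under Latin-1
    assert(decodeAsUtf8('\xfc') == kReplacement);      // not valid alone
    assert(decodeAsLatin1('a') == decodeAsUtf8('a'));  // ASCII agrees
}
```

Under the Latin-1 reading the search for '\xfc' finds ü; under the UTF-8 reading the same literal can only ever turn into U+FFFD, which is why such code would stop matching.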

Yet, breaking a ~20-year-old behavior in "low-level code" is ... scary? It should require extraordinary motivation and care; we're probably talking about making 6.8 through 6.14 warn if someone passes a non-ASCII char to a QASV/QChar(char) constructor, and then changing the behavior to accept ASCII only in 6.15?
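As a rough sketch of what such a two-phase transition could look like, the following plain C++ models a "warn first, restrict later" conversion. Everything here (Phase, Converted, isAscii, convertChar) is hypothetical and invented for illustration; it is not how Qt implements or plans to implement this, and it assumes that the restricted phase yields U+FFFD for non-ASCII input.

```cpp
#include <cassert>

// Hypothetical model of the transition; not Qt API.
enum class Phase {
    WarnOnly,  // keep the legacy Latin-1 result, but flag non-ASCII input
    AsciiOnly  // non-ASCII input no longer maps to a Latin-1 code point
};

struct Converted {
    char32_t codePoint; // resulting code point
    bool flagged;       // true if the input was outside US-ASCII
};

constexpr bool isAscii(char c)
{
    return static_cast<unsigned char>(c) < 0x80;
}

constexpr Converted convertChar(char c, Phase phase)
{
    const char32_t latin1 = static_cast<unsigned char>(c);
    if (isAscii(c))
        return { latin1, false };        // same result in both phases
    if (phase == Phase::WarnOnly)
        return { latin1, true };         // legacy behavior, flagged
    return { char32_t(0xFFFD), true };   // assumed result once restricted
}
```

In the WarnOnly phase, convertChar('\xfc', Phase::WarnOnly) still yields U+00FC but sets the flag, so existing code keeps working while its authors are told about the problem; only the AsciiOnly phase changes the result.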


Thanks,
--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - Trusted Software Excellence


-- 
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development
