Context, for anyone reading this without the rest of the thread: we're
changing the default encoding for QString's methods that deal with 8-bit data.
In Qt 3 and 4, the encoding was variable and defaulted to Latin 1 (it was set
with QTextCodec::setCodecForCStrings). In Qt 5, the variability was removed,
leaving fromAscii == fromLatin1. We're NOT changing how QString stores data
internally; that will remain UTF-16.
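
To make the difference concrete, here is a minimal sketch in plain C++ (no Qt)
of decoding the same 8-bit data to UTF-16 under each encoding. The names
decodeLatin1, decodeUtf8 and checkEncodingsDiffer are hypothetical and are not
Qt's implementation; the UTF-8 decoder is deliberately simplified (it replaces
malformed input with U+FFFD but does not reject overlong forms or encoded
surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Latin-1: every byte maps to the code point with the same value.
std::u16string decodeLatin1(const std::string &bytes)
{
    std::u16string out;
    out.reserve(bytes.size());
    for (unsigned char c : bytes)
        out.push_back(static_cast<char16_t>(c));
    return out;
}

// Simplified UTF-8 decoder (illustrative only, not Qt's code).
// Malformed sequences produce U+FFFD and the decoder resyncs one byte later.
std::u16string decodeUtf8(const std::string &bytes)
{
    const char16_t replacement = 0xFFFD;
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(replacement); ++i; continue; }

        // The sequence must fit in the input and use 10xxxxxx continuations.
        bool ok = extra < bytes.size() - i;
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                ok = false;
            else
                cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok || cp > 0x10FFFF) { out.push_back(replacement); ++i; continue; }

        if (cp >= 0x10000) {
            // Outside the BMP: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += extra + 1;
    }
    return out;
}

// The two bytes C3 A9 are two characters in Latin-1 but one in UTF-8.
void checkEncodingsDiffer()
{
    const std::string bytes = "\xC3\xA9";
    assert(decodeLatin1(bytes).size() == 2);
    assert(decodeUtf8(bytes).size() == 1);
}
```

For pure 7-bit input the two decoders agree, which is why only call sites that
handle bytes of 0x80 and above are sensitive to the change.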

A number of commits that deal with the encoding of source files have been
accepted into Qt 5. I think I caught all source code that contained non-7-bit
characters and re-encoded it to UTF-8. There is surprisingly little of it in
Qt. I've also wrapped all uses of the "ascii" functions that contained Latin1
data with QString::fromLatin1.
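
As an illustration of what "non-7-bit" means here, a minimal sketch in plain
C++ of the kind of check involved: data containing only bytes below 0x80
decodes identically as Latin 1 and as UTF-8, so only data with a byte at 0x80
or above needs attention. The function names are hypothetical; this is not the
tooling actually used for the audit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Index of the first byte >= 0x80, or std::string::npos if the data is
// pure 7-bit ASCII. (Hypothetical helper, for illustration only.)
std::size_t firstNonAsciiByte(const std::string &bytes)
{
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (static_cast<unsigned char>(bytes[i]) >= 0x80)
            return i;
    }
    return std::string::npos;
}

// True if the data reads the same whether treated as Latin 1 or UTF-8.
bool isEncodingNeutral(const std::string &bytes)
{
    return firstNonAsciiByte(bytes) == std::string::npos;
}

void checkSevenBitExamples()
{
    assert(isEncodingNeutral("QString"));       // plain ASCII
    assert(!isEncodingNeutral("caf\xC3\xA9"));  // contains bytes >= 0x80
}
```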

The following two pending commits change the QString 8-bit functions to use
UTF-8, by *temporarily* changing fromAscii to mean fromUtf8, and toAscii to
mean toUtf8.
        https://codereview.qt-project.org/24700
        https://codereview.qt-project.org/24701
        tests: https://codereview.qt-project.org/24702

They have been tested in qtbase and no regressions have been found. I do not
believe they should cause regressions in other modules.

I'm now testing a series of changes that replace fromAscii with fromUtf8 and
correct one or two encoding mistakes I think I've found. Since fromAscii ==
fromUtf8 at this point, the replacement is technically a no-op and I expect no
regressions at all. Those changes cover the few places in the code where the
data seemed to be non-Latin1 in origin, as well as QString itself.

Next, I'll change all remaining fromAscii to fromLatin1 and toAscii to
toLatin1. Since that's what those functions were before (still are right now
in qtbase master), I also expect no regressions. Then I'll deprecate the Ascii
functions.

Finally, probably starting two weeks from now when I'm back from the US, I'll
start benchmarking and optimising the fromUtf8 function, as well as merging
the many UTF-8 encoders and decoders in Qt (yes, we have more than one).

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
