I think the question was about the performance impact of using UTF-16 as
an internal representation of characters.

The original claim was, in effect, that the encoding conversion to UTF-16
is so costly that it offsets any gain from doing code point operations on
UTF-16 instead of UTF-8.
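
To make the comparison concrete, here is a minimal sketch of the two
paths being weighed against each other. The function names are mine and
purely illustrative; only ICU's public headers are assumed:

#include <unicode/stringpiece.h> // icu::StringPiece
#include <unicode/unistr.h>      // icu::UnicodeString::fromUTF8
#include <unicode/utf8.h>        // U8_NEXT
#include <unicode/utf16.h>       // U16_NEXT

// Path A: decode code points directly from the UTF-8 bytes.
int32_t countCodePointsUTF8(const char* s, int32_t length) {
    int32_t i = 0, n = 0;
    while (i < length) {
        UChar32 c;
        U8_NEXT(s, i, length, c); // reads one code point, advances i
        (void)c;
        ++n;
    }
    return n;
}

// Path B: pay for one up-front conversion to UTF-16 (the cost the
// claim is about), then iterate over the UTF-16 code units.
int32_t countCodePointsUTF16(const char* s, int32_t length) {
    icu::UnicodeString u =
        icu::UnicodeString::fromUTF8(icu::StringPiece(s, length));
    const UChar* buf = u.getBuffer();
    const int32_t len = u.length();
    int32_t i = 0, n = 0;
    while (i < len) {
        UChar32 c;
        U16_NEXT(buf, i, len, c);
        (void)c;
        ++n;
    }
    return n;
}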

It is a very strong claim, because the experiments done so far have shown
the opposite. I think the statement against ICU/UTF-16 needs to be backed
by experimental data.
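
If anyone wants to gather that data, a rough harness along these lines
would do. It reuses the two counting functions sketched above, and the
input file name is just a placeholder:

#include <chrono>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>

// Time one counting function over a document loaded as raw UTF-8 bytes.
double timeIt(int32_t (*fn)(const char*, int32_t),
              const std::string& text, int32_t& count) {
    auto t0 = std::chrono::steady_clock::now();
    count = fn(text.data(), (int32_t)text.size());
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main(int argc, char** argv) {
    std::ifstream in(argc > 1 ? argv[1] : "page.html", std::ios::binary);
    std::stringstream buffer;
    buffer << in.rdbuf();
    const std::string text = buffer.str();

    int32_t n8 = 0, n16 = 0;
    double ms8 = timeIt(countCodePointsUTF8, text, n8);
    double ms16 = timeIt(countCodePointsUTF16, text, n16);
    std::printf("UTF-8 direct:     %d code points, %.3f ms\n", n8, ms8);
    std::printf("convert + UTF-16: %d code points, %.3f ms\n", n16, ms16);
    return 0;
}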

Benjamin

On 10/6/13, 12:31 PM, Alp Toker wrote:
> Geoffrey, http://userguide.icu-project.org/conversion/converters says:
> 
> "Since ICU uses Unicode (UTF-16) internally, all converters convert
> between UTF-16 (with the endianness according to the current platform)
> and another encoding."
> 
> That said, I don't think it's a major concern because ICU works on byte
> streams. It's not like these strings will persist internally somewhere
> eating lots of memory.
> 
> From experience, the old WTF in-place converters found in WebKit
> "mobile" ports of the past were way-buggy and probably only ever tested
> with ASCII. I'd say use ICU and don't look back :-)
> 
> Alp.
> 
> 
> On 06/10/2013 20:08, Geoffrey Garen wrote:
>>> There is an issue with ICU: it uses UTF-16 as its internal representation,
>>> while most of the Web nowadays is UTF-8. Page text therefore goes through
>>> unnecessary encoding conversion, and takes more memory than it would in
>>> UTF-8 (for most languages). So it might not be a good development direction
>>> to tie WebKit to ICU.
>> Is there a benchmark or website that can verify these claims?
>>
>> Thanks,
>> Geoff
> 
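
As an aside on the memory half of the claim quoted above, the arithmetic
is easy to check, and it cuts both ways. A trivial stand-alone example
(plain C++11, no ICU needed):

#include <cstdio>

int main() {
    // The same text in both encodings, using C++11 string literals.
    const char latin8[] = u8"hello";    // ASCII range of UTF-8
    const char16_t latin16[] = u"hello";
    const char cjk8[] = u8"日本語";      // 3 bytes per code point in UTF-8
    const char16_t cjk16[] = u"日本語";  // 2 bytes per code point in UTF-16

    // ASCII: 5 bytes as UTF-8, 10 bytes as UTF-16 (twice the size).
    std::printf("ascii: utf8=%zu utf16=%zu bytes\n",
                sizeof(latin8) - 1, sizeof(latin16) - sizeof(char16_t));
    // CJK: 9 bytes as UTF-8, 6 bytes as UTF-16 (smaller in UTF-16).
    std::printf("cjk:   utf8=%zu utf16=%zu bytes\n",
                sizeof(cjk8) - 1, sizeof(cjk16) - sizeof(char16_t));
    return 0;
}

ASCII-heavy markup roughly doubles when held as UTF-16, while BMP CJK
text actually shrinks, so "for most languages" depends entirely on the
text being measured.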

_______________________________________________
webkit-dev mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-dev
