On 18/07/10 16:23, Bruno Haible wrote:
> Hi Pádraig,
> 
>> However, the first byte of a multibyte
>> UTF-8 char is the same for a lot of characters
> 
> Yes. The last byte is equidistributed across the range 0x80..0xBF, whereas
> the first byte is often the same. I'm applying the commit below to exploit it
> for speed.

Nice one Bruno.
Testing the interesting 2-byte and 3-byte cases shows improvements
of 10% and 15% respectively.
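
For anyone following along, here is roughly how I understand the idea.
This is only a sketch and not the actual commit: the function name
find_utf8_by_last_byte is made up for illustration, and it assumes both
the string and the searched character are valid UTF-8. It scans for the
last byte of the character, which is well distributed, and only then
compares the leading bytes, which are often shared by many characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a gnulib function: find the first occurrence
   of the UTF-8 sequence NEEDLE (NLEN bytes) in the NUL-terminated string
   S by scanning for its LAST byte rather than its first.  Assumes S and
   NEEDLE are valid UTF-8.  */
static const char *
find_utf8_by_last_byte (const char *s, const char *needle, size_t nlen)
{
  size_t slen = strlen (s);
  const char *end = s + slen;
  const char *p;
  char last;

  if (nlen == 0 || slen < nlen)
    return NULL;

  last = needle[nlen - 1];

  /* The last byte cannot occur before offset NLEN - 1.  */
  p = s + nlen - 1;
  while ((p = memchr (p, last, (size_t) (end - p))) != NULL)
    {
      const char *start = p - (nlen - 1);

      /* Candidate found: verify the leading bytes.  */
      if (memcmp (start, needle, nlen - 1) == 0)
        return start;
      p++;
    }
  return NULL;
}
```

With a string like "na\xc3\xafve caf\xc3\xa9", a scan for the first byte
0xC3 of "\xc3\xa9" stops once at the unrelated "\xc3\xaf", whereas a scan
for the last byte 0xA9 goes straight to the match.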

cheers,
Pádraig.
