Jonathan Kew <[email protected]> writes:
> On 20 Dec 2011, at 00:56, Rupert Swarbrick wrote:
>> However, the Microchip datasheet on which I was testing my code still
>> fails weirdly. You can get it from [1] (not sure about whether I should
>> be posting it to a mailing list) and the first few lines I get look
>> like:
>> 
>> String length: 1541
>> Rectangles:    1477
>
> Guessing, without having looked at the code or API involved... is the
> "string length" here a count of UTF-8 _bytes_, but the "rectangles"
> are one per _character_? If so, you'd get a discrepancy as soon as
> non-ASCII characters (such as bullets, curly quotes, em-dashes,
> accented letters, etc, etc) are present.
>
> JK

That makes sense, but unfortunately I think I'm already counting
characters rather than bytes:

    /* g_utf8_strlen returns a glong, so print it with %ld, not %u */
    printf ("String length: %ld\nRectangles:    %u\n\n",
            g_utf8_strlen (text, -1), n_rects);

(The Lisp code got that right the first time. I only changed my strlen
call to g_utf8_strlen after my C code gave a different answer!)
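
For anyone following along, here is a minimal sketch of the byte versus
character distinction Jonathan describes. It does not explain the
mismatch above, since that count already comes from g_utf8_strlen. Both
function names are made up for illustration and are not part of GLib or
poppler, and the counting loop assumes valid, NUL-terminated UTF-8:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a GLib or poppler function): count the
   characters (code points) in a valid, NUL-terminated UTF-8 string by
   skipping continuation bytes, which all have the form 10xxxxxx. */
static size_t
utf8_char_count (const char *s)
{
  size_t n = 0;

  assert (s != NULL);
  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}

/* Hypothetical helper: bytes minus characters. This is non-zero as soon
   as the string contains any non-ASCII character. */
static size_t
utf8_extra_bytes (const char *s)
{
  return strlen (s) - utf8_char_count (s);
}
```

A bullet (U+2022) is encoded as the three bytes E2 80 A2, so for
"\xE2\x80\xA2 item" strlen gives 8 while the character count is 6.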

:-(

Rupert

_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler