Jonathan Kew <[email protected]> writes:
> On 20 Dec 2011, at 00:56, Rupert Swarbrick wrote:
>> However, the Microchip datasheet on which I was testing my code still
>> fails weirdly. You can get it from [1] (not sure about whether I should
>> be posting it to a mailing list) and the first few lines I get look
>> like:
>>
>> String length: 1541
>> Rectangles: 1477
>
> Guessing, without having looked at the code or API involved... is the
> "string length" here a count of UTF-8 _bytes_, but the "rectangles"
> are one per _character_? If so, you'd get a discrepancy as soon as
> non-ASCII characters (such as bullets, curly quotes, em-dashes,
> accented letters, etc, etc) are present.
>
> JK

That makes sense, but unfortunately I think I'm doing that right:

  printf ("String length: %u\nRectangles: %u\n\n",
          g_utf8_strlen (text, -1), n_rects);

(The Lisp code did that right the first time. I corrected my strlen call
to g_utf8_strlen when I got a different answer from my C code!)
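
For anyone following along, the byte-versus-character distinction JK describes can be sketched in plain C without GLib. This is only an illustration under stated assumptions: utf8_char_count is a made-up helper (not a GLib or poppler function), and it assumes the input is valid, NUL-terminated UTF-8. It counts every byte that is not a continuation byte, which is roughly the quantity g_utf8_strlen reports, whereas strlen counts raw bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only, not part of GLib or poppler.
   Counts UTF-8 code points by skipping continuation bytes, i.e. bytes of
   the form 10xxxxxx. Assumes valid, NUL-terminated UTF-8 input. */
static size_t
utf8_char_count (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; s++)
    {
      /* Lead bytes and ASCII bytes start a new character. */
      if (((unsigned char) *s & 0xC0) != 0x80)
        n++;
    }

  return n;
}
```

With a string such as "a\xE2\x80\xA2" "b" (an "a", a bullet U+2022 encoded as three bytes, and a "b"), strlen returns 5 while utf8_char_count returns 3. A mismatch of that kind would explain the numbers above only if the string length were a byte count, which is not the case here.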
:-(
Rupert
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
