Hi,

If you tested with RC1: yes, there were many issues with font height and size, and we had a Type 3 font bug that affected many files. So this may already have been fixed.

If not, as Maruan said, please open an issue. If you have many problems, start with the single file that seems to be the worst.

Tilman

On 26.10.2015 at 06:36, Joel Hirsh wrote:
I am trying to get the size of text (i.e. the font size).  In version 1.8, the
height of text was somewhat inconsistent, and missing for Type 3 fonts,
but I thought that was supposed to be all sorted out in v2.0.  Instead,
version 2 seems to be even more inconsistent than version 1.8.

I am using PDFTextStripper and reading the TextPosition array that comes
with each String.  I have tried getHeight(), getFontSize(),
getFontSizeInPt(), and getYScale(), but none of them gives a dependable
answer.  They are consistent within a file, but useless for checking if a
particular string contains readable size text.

Which one of these TextPosition values should be used for this purpose?
And should I then report bugs for all the files that don't give correct results?

FYI - I ran a test with version 2 against 100+ PDF files that come from
different sources, and use a mixture of TrueType, Type 0, Type1, Type3
fonts.  All of these have text that is font size 8-12pt, as reported by
Acrobat.  I dumped the size values returned for digit strings in the files
(e.g. 12345), so that every string should be full height.

The reported height of text mostly ranged from 2.3 to 7.5 (although one
very readable file reported a height of 0).  I examined a few files with
Acrobat, and the files with reported text heights of 2.3 and 7.5 both had
9pt fonts.  But the other values from TextPosition were worse.  The
getFontSize() value was plausible for only about half of these files, and
seemed particularly bad for TrueType fonts.  The getFontSize() values ranged
from 1 to 200.  The getFontSizeInPt() values mostly seemed to be a multiple
of the font size, but even that was inconsistent: often the value seemed to
be the square of the font size (like a font size of 58 and a
getFontSizeInPt() of 3364), but sometimes simply a multiple of 10.

The most accurate value I could find in the TextPosition was getYScale(),
which had a plausible value about 90% of the time.  But on Type 3 fonts, it
too was inconsistent, often returning values of 1, but also values up to 27.
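
One possible reason the raw font size varies so much (1 to 200 for visually similar text) is that a PDF can set a small nominal font size and enlarge the glyphs through the text matrix, or the reverse. The following is a minimal sketch of that arithmetic only, not PDFBox code: the class and method names are hypothetical, and it assumes you already have the nominal font size and the c and d components of the combined matrix [a b c d e f] applied to the text.

```java
/**
 * Sketch only: plain arithmetic, no PDFBox calls.
 * EffectiveFontSize and its methods are hypothetical names for illustration.
 */
final class EffectiveFontSize {
    private EffectiveFontSize() {}

    /**
     * Vertical scale of a PDF matrix [a b c d e f]: the length of the
     * image of the unit vector (0, 1), which maps to (c, d).
     */
    static double verticalScale(double c, double d) {
        return Math.hypot(c, d);
    }

    /** Nominal font size multiplied by the vertical scale of the matrix. */
    static double effectiveSize(double fontSize, double c, double d) {
        return Math.abs(fontSize) * verticalScale(c, d);
    }

    public static void main(String[] args) {
        // Nominal size 1, enlarged 9x by the matrix.
        System.out.println(effectiveSize(1.0, 0.0, 9.0));
        // Nominal size 9, unscaled matrix.
        System.out.println(effectiveSize(9.0, 0.0, 1.0));
        // Nominal size 9, text rotated by 90 degrees (c = -1, d = 0).
        System.out.println(effectiveSize(9.0, -1.0, 0.0));
    }
}
```

All three calls describe text that would render at the same size even though the nominal font size differs, which may be why a value that reflects the matrix scaling looks more plausible than the font size alone.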

So how should I find out the height of text?


