Yes, your suggestion makes sense. (The images can't be seen here, but I
saw them in moderation.) Please create an issue in JIRA; just use the
text here.
Tilman
On 20.09.2021 at 18:10, Fernando Sadu wrote:
Component: PDFRenderer/Type1Parser
Affects Version/s: 2.0.23
Environment: Java 8
*Description:*
When I try to convert a PDF page to a PNG image with
"pdfRenderer.renderImageWithDPI" using PDFBox version 2.0.23, I get the
following error. The PDF is a customer-specific one, so I can't share
the original file here.
Error [PDType1Font] Can't read the embedded Type1 font AAAAAB+NimbusMonoPS-Regular_00
java.io.IOException: Found Token[kind=NAME, text=readonly] but expected def
    at org.apache.fontbox.type1.Type1Parser.read(Type1Parser.java:867)
    at org.apache.fontbox.type1.Type1Parser.parseBinary(Type1Parser.java:610)
    at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:64)
    at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:85)
    at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:263)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:76)
    at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:146)
Using the PDFDebugger, I opened the FontFile of the page that throws
the error and chose "Save the Stream as PFB". With t1disasm from the
t1utils <https://formulae.brew.sh/formula/t1utils> tools I can obtain
the readable equivalent of the binary private dictionary, which looks
like this:
image.png
Following the trace in PDFBox, I can see it blowing up in the
"parseBinary" method of "Type1Parser.class". At line 602 there is the
check ""RD".equals(key)"; the last call in that branch,
"read(Token.NAME, "def");", requires the next token to be "def".
Because the token found is "readonly", it throws the error. Even if it
made it past "RD", it would fail the same way when parsing the contents
of "ND" and "NP".
Most Type 1 font files I have seen have "executeonly" instead of
"readonly" in the /Private dict section. Even with this same file, if I
first convert it to PS and back to PDF using Mac Preview or GS ps2pdf
and extract the font again, I can see that the instructions for RD, ND,
and NP are rewritten to use "executeonly", and the PNG is generated
without issue by the PDFRenderer. I don't see a way to do this step
programmatically at the moment.
image.png
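
The PS round trip described above could perhaps be scripted by calling
Ghostscript from Java. The following is only a sketch under assumptions:
it assumes a Ghostscript executable named "gs" is on the PATH, the class
and method names are made up for illustration, and it is not verified
that the round trip rewrites "readonly" to "executeonly" for every font
(it did for the file discussed here when done by hand).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: PDF -> PostScript -> PDF via two Ghostscript runs.
public class GhostscriptRoundTrip {

    // Builds a Ghostscript command line converting `in` to `out` with the given output device.
    public static List<String> gsCommand(String gsExecutable, String device, Path in, Path out) {
        return Arrays.asList(gsExecutable, "-q", "-dNOPAUSE", "-dBATCH",
                "-sDEVICE=" + device, "-sOutputFile=" + out, in.toString());
    }

    // Runs one command and fails if the exit code is not zero.
    static void run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command failed with exit code " + exitCode + ": " + command);
        }
    }

    // Converts the PDF to PostScript (ps2write) and back to PDF (pdfwrite).
    public static void roundTrip(String gsExecutable, Path pdfIn, Path psTmp, Path pdfOut)
            throws IOException, InterruptedException {
        run(gsCommand(gsExecutable, "ps2write", pdfIn, psTmp));
        run(gsCommand(gsExecutable, "pdfwrite", psTmp, pdfOut));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.out.println("usage: GhostscriptRoundTrip <in.pdf> <tmp.ps> <out.pdf>");
            return;
        }
        roundTrip("gs", Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
    }
}
```

Note that such a round trip re-creates the whole PDF, so it may change
more than just the embedded font program.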
The Type 1 Font Spec does not give a mandatory recipe for which
instructions may follow RD, ND, or NP. It only states: "The RD, NP, and
ND functions must be implemented by PostScript language procedures". If
we take a look at the PostScript Language Reference Manual:
*readonly*: When an object is read-only, its value cannot be modified
by PostScript operators (an invalidaccess error will result), but it
can still be read by operators or executed by the PostScript interpreter.
*executeonly*: When an object is execute-only, its value cannot be
read or modified explicitly by PostScript operators (an invalidaccess
error will result), but it can still be executed by the PostScript
interpreter—for example, by invoking it with exec
Both "readonly" and "executeonly" allow the instructions to be
executed.
Questions:
1. Would it be possible to add an optional "readMaybe(Token.NAME,
"readonly");" to the "RD", "ND", and "NP" branches of "parseBinary",
similar to what was done in PDFBOX-2202
<https://issues.apache.org/jira/browse/PDFBOX-2202>? I checked the
latest PDFBox version, 3.0 RC, and the method in "Type1Parser.class" is
still the same, with no optional "readonly".
2. If this is not a validly constructed FontFile, given that I am
neither the one creating the original PDF nor the one deciding which
font to embed, could you please suggest an alternative way to deal with
this case?
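
To illustrate question 1, here is a small self-contained sketch of the
idea. It is not the PDFBox source: tokens are plain strings, the class
"ReadMaybeSketch" and the method "readProcTail" are made up for this
example, and only "read" and "readMaybe" mirror the method names
mentioned above. It shows the tail of an RD/ND/NP definition being
parsed with only "executeonly" optional, and with "readonly" accepted
as well.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Simplified model of the token checks; not the real Type1Parser.
public class ReadMaybeSketch {
    private final Deque<String> tokens;

    ReadMaybeSketch(String... tokens) {
        this.tokens = new ArrayDeque<>(Arrays.asList(tokens));
    }

    // Consumes the next token and fails if it is not the expected one.
    void read(String expected) throws IOException {
        String found = tokens.poll();
        if (!expected.equals(found)) {
            throw new IOException("Found " + found + " but expected " + expected);
        }
    }

    // Consumes the next token only if it matches; otherwise leaves it in place.
    boolean readMaybe(String candidate) {
        if (candidate.equals(tokens.peek())) {
            tokens.poll();
            return true;
        }
        return false;
    }

    // Parses "[access modifier] def"; "readonly" is tolerated only when acceptReadonly is set.
    void readProcTail(boolean acceptReadonly) throws IOException {
        if (!readMaybe("executeonly") && acceptReadonly) {
            readMaybe("readonly");
        }
        read("def");
    }

    // Returns true if the given token tail parses without an exception.
    public static boolean parses(boolean acceptReadonly, String... tail) {
        try {
            new ReadMaybeSketch(tail).readProcTail(acceptReadonly);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses(false, "executeonly", "def")); // true
        System.out.println(parses(false, "readonly", "def"));    // false
        System.out.println(parses(true, "readonly", "def"));     // true
    }
}
```

The second call fails in the same way as the reported exception: the
strict variant finds "readonly" where it expects "def".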
Thank you.