Matthew GAO created PDFBOX-2663:
-----------------------------------
Summary: CMap handling bug
Key: PDFBOX-2663
URL: https://issues.apache.org/jira/browse/PDFBOX-2663
Project: PDFBox
Issue Type: Bug
Components: FontBox
Affects Versions: 1.8.8
Environment: Windows 7
Reporter: Matthew GAO
Fix For: 1.8.8
Some CMaps include another CMap through the "usecmap" operator in the CMap
resource file. For example, the "ETenms-B5-H" CMap includes the "ETen-B5-H"
CMap; the corresponding line in the resource file is "/ETen-B5-H usecmap".
The CMapParser does handle this operator. The relevant code is:
if (op.op.equals(USECMAP))
{
    LiteralName useCmapName = (LiteralName) previousToken;
    InputStream useStream =
        ResourceLoader.loadResource(resourceRoot + useCmapName.name);
    if (useStream == null)
    {
        throw new IOException("Error: Could not find referenced cmap stream " + useCmapName.name);
    }
    CMap useCMap = parse(resourceRoot, useStream);
    result.useCmap(useCMap);
}
However, the useCmap method of the CMap class does not copy the cidRanges list
from the included CMap. The current implementation is:
public void useCmap( CMap cmap )
{
    this.codeSpaceRanges.addAll( cmap.codeSpaceRanges );
    this.singleByteMappings.putAll( cmap.singleByteMappings );
    this.doubleByteMappings.putAll( cmap.doubleByteMappings );
}
Without the cidRanges of the included CMap, PDFBox cannot tell that a character
code is covered by the CMap, so it ends up returning "?" for that text.
I suggest adding the following line to the useCmap method of the CMap class to
fix the problem:
this.cidRanges.addAll( cmap.cidRanges);
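
The effect of the missing copy can be reproduced with a minimal, self-contained sketch. This is not the FontBox code: SimpleCMap, CidRange, lookupCid and the copyCidRanges flag are hypothetical stand-ins, codeSpaceRanges is left out for brevity, and the code range and CID values are made up for illustration. It only shows that a lookup against the including CMap fails unless the cidRanges of the included CMap are merged as well.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UseCmapSketch {

    /** Hypothetical CID range: codes in [start, end] map to consecutive CIDs from cid. */
    static final class CidRange {
        final int start;
        final int end;
        final int cid;

        CidRange(int start, int end, int cid) {
            this.start = start;
            this.end = end;
            this.cid = cid;
        }

        /** Returns the mapped CID, or -1 if the code is outside this range. */
        int map(int code) {
            return (code >= start && code <= end) ? cid + (code - start) : -1;
        }
    }

    /** Simplified model of a CMap (assumption: not the real FontBox class). */
    static final class SimpleCMap {
        final Map<Integer, String> singleByteMappings = new HashMap<>();
        final Map<Integer, String> doubleByteMappings = new HashMap<>();
        final List<CidRange> cidRanges = new ArrayList<>();

        /** Merges another CMap; the flag switches the proposed cidRanges copy on or off. */
        void useCmap(SimpleCMap other, boolean copyCidRanges) {
            singleByteMappings.putAll(other.singleByteMappings);
            doubleByteMappings.putAll(other.doubleByteMappings);
            if (copyCidRanges) {
                // The line proposed in this issue.
                cidRanges.addAll(other.cidRanges);
            }
        }

        /** Returns the CID for a code, or -1 if no range covers it. */
        int lookupCid(int code) {
            for (CidRange range : cidRanges) {
                int cid = range.map(code);
                if (cid != -1) {
                    return cid;
                }
            }
            return -1;
        }
    }

    /** Builds an included and an including CMap, then looks up a code in the latter. */
    static int lookupAfterUseCmap(boolean copyCidRanges) {
        SimpleCMap included = new SimpleCMap();
        included.cidRanges.add(new CidRange(0xA140, 0xA158, 99)); // illustrative values

        SimpleCMap including = new SimpleCMap();
        including.useCmap(included, copyCidRanges);
        return including.lookupCid(0xA141);
    }

    public static void main(String[] args) {
        System.out.println("without cidRanges copy: " + lookupAfterUseCmap(false));
        System.out.println("with cidRanges copy:    " + lookupAfterUseCmap(true));
    }
}
```

In this sketch the lookup returns -1 when the ranges are not copied, which corresponds to the "?" output described above, and a valid CID once they are.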