Thanks for the feedback! It has been fixed here:
https://issues.apache.org/jira/browse/PDFBOX-5103
There is also a snapshot build here:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.23-SNAPSHOT/
Tilman
On 17.02.2021 at 14:34, Tamas Kocsis wrote:
It works!
TestFontEmbedding succeeded and I also executed my own test successfully.
There were some bumps, but no roadblocks :)
Thank you for your help Tilman - I really appreciate it!
On Tue, Feb 16, 2021 at 6:14 AM Tamas Kocsis <[email protected]>
wrote:
Thank you!
I'll give it a try and let you know.
On Mon, Feb 15, 2021 at 6:06 PM Tilman Hausherr <[email protected]>
wrote:
On 15.02.2021 at 10:32, Tamas Kocsis wrote:
Thanks for the info and for looking into it.
I've never tried building PDFBox from source, but I guess I could do it.
It would be nice if I could test this with 2.0...
OK, here's some code. If you can't get it to run (don't waste too much
time if you hit roadblocks), I'll create an issue, commit the change, and
build a snapshot.
PDFont.java:
/**
 * Get the /ToUnicode CMap.
 *
 * @return The /ToUnicode CMap or null if there is none.
 */
protected CMap getToUnicodeCMap()
{
    return toUnicodeCMap;
}
PDCIDFontType2.java:
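Add this at the place mentioned in your first post:
CMap toUnicodeCMap = parent.getToUnicodeCMap();
if (toUnicodeCMap != null) // null if the font has no /ToUnicode CMap
{
    byte[] codes = toUnicodeCMap.getCodesFromUnicode(Character.toString((char) unicode));
    if (codes != null)
    {
        return codes;
    }
}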
In CMap.java, add
unicodeToByteCodes.put(unicode, codes.clone()); // clone needed, bytes is modified later
as the first line of the method addCharMapping().
Also add these in the class:
// inverted map
private final Map<String, byte[]> unicodeToByteCodes = new HashMap<String, byte[]>();
/**
 * Get the code bytes for a Unicode string.
 *
 * @param unicode the Unicode string to look up
 * @return the code bytes or null if there is none.
 */
public byte[] getCodesFromUnicode(String unicode)
{
    return unicodeToByteCodes.get(unicode);
}
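To see why the clone matters, here is a minimal standalone sketch of the inverted-map idea. It is not the PDFBox CMap class: the class name InvertedCMapSketch, the forward map, and the toInt helper are made up for illustration, and it assumes (going by the "bytes is modified later" comment above) that the caller reuses the same byte array for subsequent mappings.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; NOT the PDFBox CMap class.
class InvertedCMapSketch
{
    // forward map: character code -> Unicode string
    private final Map<Integer, String> codeToUnicode = new HashMap<Integer, String>();
    // inverted map: Unicode string -> code bytes
    private final Map<String, byte[]> unicodeToByteCodes = new HashMap<String, byte[]>();

    void addCharMapping(byte[] codes, String unicode)
    {
        // Store a copy: the caller is assumed to overwrite 'codes' afterwards.
        unicodeToByteCodes.put(unicode, codes.clone());
        codeToUnicode.put(toInt(codes), unicode);
    }

    // Returns the code bytes for the Unicode string, or null if there is none.
    byte[] getCodesFromUnicode(String unicode)
    {
        return unicodeToByteCodes.get(unicode);
    }

    // Returns the Unicode string for the code bytes, or null if there is none.
    String toUnicode(byte[] codes)
    {
        return codeToUnicode.get(toInt(codes));
    }

    // Packs big-endian code bytes into an int so they can serve as a map key.
    private static int toInt(byte[] codes)
    {
        int value = 0;
        for (byte b : codes)
        {
            value = (value << 8) | (b & 0xFF);
        }
        return value;
    }

    public static void main(String[] args)
    {
        InvertedCMapSketch cmap = new InvertedCMapSketch();
        byte[] buffer = { 0x00, 0x24 };
        cmap.addCharMapping(buffer, "A");
        buffer[1] = 0x25; // the same array is reused for the next mapping
        cmap.addCharMapping(buffer, "B");

        System.out.println(Arrays.toString(cmap.getCodesFromUnicode("A"))); // [0, 36]
        System.out.println(Arrays.toString(cmap.getCodesFromUnicode("B"))); // [0, 37]
        System.out.println(cmap.getCodesFromUnicode("C") == null);          // true
    }
}
```

Without the clone() call, both entries would point at the same array, and the lookup for "A" would return the bytes that were written for "B".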
And a test for TestFontEmbedding.java. If the test passes, you're
successful:
/**
 * Test that an embedded and subsetted font can be reused.
 *
 * @throws IOException
 */
public void testReuseEmbeddedSubsettedFont() throws IOException
{
    String text1 = "The quick brown fox";
    String text2 = "xof nworb kciuq ehT";
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PDDocument document = new PDDocument();
    PDPage page = new PDPage();
    document.addPage(page);
    InputStream input = PDFont.class.getResourceAsStream(
            "/org/apache/pdfbox/resources/ttf/LiberationSans-Regular.ttf");
    PDType0Font font = PDType0Font.load(document, input);
    PDPageContentStream stream = new PDPageContentStream(document, page);
    stream.beginText();
    stream.setFont(font, 20);
    stream.newLineAtOffset(50, 600);
    stream.showText(text1);
    stream.endText();
    stream.close();
    document.save(baos);
    document.close();
    // Append, while reusing the font subset
    document = PDDocument.load(baos.toByteArray());
    page = document.getPage(0);
    font = (PDType0Font) page.getResources().getFont(COSName.getPDFName("F1"));
    stream = new PDPageContentStream(document, page,
            PDPageContentStream.AppendMode.APPEND, true);
    stream.beginText();
    stream.setFont(font, 20);
    stream.newLineAtOffset(250, 600);
    stream.showText(text2);
    stream.endText();
    stream.close();
    baos.reset();
    document.save(baos);
    document.close();
    // Test that both texts are there
    document = PDDocument.load(baos.toByteArray());
    PDFTextStripper stripper = new PDFTextStripper();
    String extractedText = stripper.getText(document);
    assertEquals(text1 + " " + text2, extractedText.trim());
    document.close();
}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]