Re: Inquiry for an open PDFBox TODO in PDCIDFontType2.encode method

Tamas Kocsis Wed, 17 Feb 2021 05:34:50 -0800

It works!
TestFontEmbedding succeeded and I also executed my own test successfully.
There were some bumps, but no roadblocks :)


Thank you for your help Tilman - I really appreciate it!

On Tue, Feb 16, 2021 at 6:14 AM Tamas Kocsis <[email protected]>
wrote:

> Thank you!
> I'll give it a try and let you know.
>
> On Mon, Feb 15, 2021 at 6:06 PM Tilman Hausherr <[email protected]>
> wrote:
>
>> On 15.02.2021 at 10:32, Tamas Kocsis wrote:
>> > Thanks for the info and for looking into it.
>> > Never tried building PDFBox from source, but I guess I could do it.
>> > Would be nice if I could test this with 2.0...
>>
>> OK, here's some code. If you can't get it to run (don't waste too much
>> time if you hit roadblocks), then I'll create an issue, commit, and
>> build a snapshot.
>>
>> PDFont.java:
>>
>>
>>
>>      /**
>>       * Get the /ToUnicode CMap.
>>       *
>>       * @return The /ToUnicode CMap or null if there is none.
>>       */
>>      protected CMap getToUnicodeCMap()
>>      {
>>          return toUnicodeCMap;
>>      }
>>
>> PDCIDFontType2.java:
>>
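>> Add this at the place mentioned in your first post:
>>
>>                  // the font may have no /ToUnicode CMap, so check for null
>>                  if (parent.getToUnicodeCMap() != null)
>>                  {
>>                      byte[] codes = parent.getToUnicodeCMap().getCodesFromUnicode(
>>                              Character.toString((char) unicode));
>>                      if (codes != null)
>>                      {
>>                          return codes;
>>                      }
>>                  }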
>>
>>
>> In CMap.java, add
>>
>>      // clone needed, the byte array is modified later
>>      unicodeToByteCodes.put(unicode, codes.clone());
>>
>> as the first line of the method addCharMapping().
>>
>>
>> Also add these to the class:
>>
>>      // inverted map
>>      private final Map<String, byte[]> unicodeToByteCodes = new HashMap<String, byte[]>();
>>
>>
>>      /**
>>       * Get the code bytes for a Unicode string.
>>       *
>>       * @param unicode the Unicode string to look up
>>       * @return the code bytes or null if there is none.
>>       */
>>      public byte[] getCodesFromUnicode(String unicode)
>>      {
>>          return unicodeToByteCodes.get(unicode);
>>      }
>>
>>
>> And a test for TestFontEmbedding.java. If the test passes, then you're
>> successful.
>>
>>
>>
>>      /**
>>       * Test that an embedded and subsetted font can be reused.
>>       *
>>       * @throws IOException
>>       */
>>      public void testReuseEmbeddedSubsettedFont() throws IOException
>>      {
>>          String text1 = "The quick brown fox";
>>          String text2 = "xof nworb kciuq ehT";
>>          ByteArrayOutputStream baos = new ByteArrayOutputStream();
>>          PDDocument document = new PDDocument();
>>          PDPage page = new PDPage();
>>          document.addPage(page);
>>          InputStream input = PDFont.class.getResourceAsStream(
>>                  "/org/apache/pdfbox/resources/ttf/LiberationSans-Regular.ttf");
>>          PDType0Font font = PDType0Font.load(document, input);
>>          PDPageContentStream stream = new PDPageContentStream(document, page);
>>          stream.beginText();
>>          stream.setFont(font, 20);
>>          stream.newLineAtOffset(50, 600);
>>          stream.showText(text1);
>>          stream.endText();
>>          stream.close();
>>          document.save(baos);
>>          document.close();
>>          // Append, while reusing the font subset
>>          document = PDDocument.load(baos.toByteArray());
>>          page = document.getPage(0);
>>          font = (PDType0Font) page.getResources().getFont(COSName.getPDFName("F1"));
>>          stream = new PDPageContentStream(document, page,
>>                  PDPageContentStream.AppendMode.APPEND, true);
>>          stream.beginText();
>>          stream.setFont(font, 20);
>>          stream.newLineAtOffset(250, 600);
>>          stream.showText(text2);
>>          stream.endText();
>>          stream.close();
>>          baos.reset();
>>          document.save(baos);
>>          document.close();
>>          // Test that both texts are there
>>          document = PDDocument.load(baos.toByteArray());
>>          PDFTextStripper stripper = new PDFTextStripper();
>>          String extractedText = stripper.getText(document);
>>          assertEquals(text1 + " " + text2, extractedText.trim());
>>          document.close();
>>      }
>>
>>
>>
>>
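
For anyone reading this thread without a PDFBox checkout, here is a minimal, self-contained sketch of the reverse-lookup idea from the patch quoted above. ReverseCMapSketch and its main method are hypothetical stand-ins and not PDFBox classes; only the names unicodeToByteCodes, addCharMapping and getCodesFromUnicode are borrowed from the quoted code. It assumes the caller may reuse the byte array it passes in, which is the reason for the clone.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the patched CMap; this is not PDFBox code.
class ReverseCMapSketch
{
    // inverted map: Unicode string -> code bytes
    private final Map<String, byte[]> unicodeToByteCodes = new HashMap<String, byte[]>();

    // Records the reverse mapping. The array is cloned because the caller
    // is assumed to modify or reuse it after this call returns.
    public void addCharMapping(byte[] codes, String unicode)
    {
        unicodeToByteCodes.put(unicode, codes.clone());
    }

    // Returns the code bytes for a Unicode string, or null if there is none.
    public byte[] getCodesFromUnicode(String unicode)
    {
        return unicodeToByteCodes.get(unicode);
    }

    public static void main(String[] args)
    {
        ReverseCMapSketch cmap = new ReverseCMapSketch();
        byte[] buffer = { 0x00, 0x37 };
        cmap.addCharMapping(buffer, "T");
        buffer[1] = 0x38; // the caller reuses the same array for the next mapping
        cmap.addCharMapping(buffer, "U");

        System.out.println(Arrays.toString(cmap.getCodesFromUnicode("T"))); // [0, 55]
        System.out.println(Arrays.toString(cmap.getCodesFromUnicode("U"))); // [0, 56]
        System.out.println(Arrays.toString(cmap.getCodesFromUnicode("Z"))); // null
    }
}
```

Without the clone, both entries would point at the same array, and the lookup for "T" would return the bytes last written for "U". A caller that gets null back should fall through to whatever it did before, as the null check in the encode snippet does.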
