Re: Inquiry for an open PDFBox TODO in PDCIDFontType2.encode method

Tilman Hausherr Wed, 17 Feb 2021 09:04:05 -0800

Thanks for the feedback! It has been fixed here:
https://issues.apache.org/jira/browse/PDFBOX-5103


There is also a snapshot build here:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.23-SNAPSHOT/


Tilman


On 17.02.2021 at 14:34, Tamas Kocsis wrote:

It works!
TestFontEmbedding succeeded and I also executed my own test successfully.
There were some bumps, but no roadblocks :)

Thank you for your help Tilman - I really appreciate it!

On Tue, Feb 16, 2021 at 6:14 AM Tamas Kocsis <[email protected]> wrote:

Thank you!
I'll give it a try and let you know.

On Mon, Feb 15, 2021 at 6:06 PM Tilman Hausherr <[email protected]> wrote:

On 15.02.2021 at 10:32, Tamas Kocsis wrote:

Thanks for the info and for looking into it.
Never tried building PDFBox from source, but I guess I could do it.
Would be nice if I could test this with 2.0...

OK, here's some code. If you can't get it to run (don't waste too much
time if you hit roadblocks), then I'll create an issue, commit, and build
a snapshot.

PDFont.java:



      /**
       * Get the /ToUnicode CMap.
       *
       * @return The /ToUnicode CMap or null if there is none.
       */
      protected CMap getToUnicodeCMap()
      {
          return toUnicodeCMap;
      }

PDCIDFontType2.java:

Add this at the place mentioned in your first post:

                  byte[] codes = parent.getToUnicodeCMap()
                          .getCodesFromUnicode(Character.toString((char) unicode));
                  if (codes != null)
                  {
                      return codes;
                  }
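
A side note on the lookup key in the snippet above: the (char) cast keeps only the low 16 bits of the code point, so it covers the Basic Multilingual Plane. The following is a minimal standalone sketch of that behaviour, not PDFBox code; the class and method names (CodePointToString, bmpOnly, full) are made up for illustration. Whether this matters depends on the fonts and text involved.

```java
public class CodePointToString
{
    // Same conversion as in the snippet above: truncates to 16 bits.
    public static String bmpOnly(int codePoint)
    {
        return Character.toString((char) codePoint);
    }

    // Alternative that also handles supplementary code points
    // by producing a surrogate pair where needed.
    public static String full(int codePoint)
    {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args)
    {
        int euro = 0x20AC;   // inside the BMP
        int emoji = 0x1F600; // outside the BMP

        System.out.println(bmpOnly(euro).equals(full(euro)));   // true
        System.out.println(bmpOnly(emoji).equals(full(emoji))); // false
        System.out.println(full(emoji).length());               // 2
    }
}
```

For code points up to U+FFFF both conversions give the same string, so the lookup key is identical in either case.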


In CMap.java, add

      unicodeToByteCodes.put(unicode, codes.clone()); // clone needed, bytes are modified later

as the first line of the method addCharMapping().


Also add these in the class:

      // inverted map
      Map<String, byte[]> unicodeToByteCodes = new HashMap<String, byte[]>();


      /**
       * Get the code bytes for a Unicode string.
       *
       * @param unicode the Unicode string to look up.
       * @return the code bytes or null if there is none.
       */
      public byte[] getCodesFromUnicode(String unicode)
      {
          return unicodeToByteCodes.get(unicode);
      }
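
To illustrate why the clone() in addCharMapping() matters, here is a minimal standalone sketch of the inverted-map idea. It is not the PDFBox CMap class: the class name InvertedCodeMap and the loop in main() are hypothetical, and the loop only assumes that a caller reuses and increments the same byte array while adding consecutive mappings.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class InvertedCodeMap
{
    // inverted map: Unicode string -> code bytes
    private final Map<String, byte[]> unicodeToByteCodes = new HashMap<String, byte[]>();

    public void addCharMapping(byte[] codes, String unicode)
    {
        // Store a copy; the caller may modify 'codes' afterwards.
        unicodeToByteCodes.put(unicode, codes.clone());
    }

    // Returns the code bytes, or null if there is no mapping.
    public byte[] getCodesFromUnicode(String unicode)
    {
        return unicodeToByteCodes.get(unicode);
    }

    public static void main(String[] args)
    {
        InvertedCodeMap map = new InvertedCodeMap();
        byte[] code = { 0x00, 0x24 };
        String[] unicodes = { "A", "B", "C" };
        for (String u : unicodes)
        {
            map.addCharMapping(code, u);
            code[1]++; // same array reused for the next code
        }
        System.out.println(Arrays.toString(map.getCodesFromUnicode("A"))); // [0, 36]
        System.out.println(Arrays.toString(map.getCodesFromUnicode("B"))); // [0, 37]
        System.out.println(Arrays.toString(map.getCodesFromUnicode("C"))); // [0, 38]
        System.out.println(map.getCodesFromUnicode("Z"));                  // null
    }
}
```

Without the clone(), all three entries would share one array and every lookup would return its final state, [0, 39].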


And a test for TestFontEmbedding.java. If the test passes, then you're
successful.



      /**
       * Test that an embedded and subsetted font can be reused.
       *
       * @throws IOException
       */
      public void testReuseEmbeddedSubsettedFont() throws IOException
      {
          String text1 = "The quick brown fox";
          String text2 = "xof nworb kciuq ehT";
          ByteArrayOutputStream baos = new ByteArrayOutputStream();
          PDDocument document = new PDDocument();
          PDPage page = new PDPage();
          document.addPage(page);
          InputStream input = PDFont.class.getResourceAsStream(
                  "/org/apache/pdfbox/resources/ttf/LiberationSans-Regular.ttf");
          PDType0Font font = PDType0Font.load(document, input);
          PDPageContentStream stream = new PDPageContentStream(document, page);
          stream.beginText();
          stream.setFont(font, 20);
          stream.newLineAtOffset(50, 600);
          stream.showText(text1);
          stream.endText();
          stream.close();
          document.save(baos);
          document.close();
          // Append, while reusing the font subset
          document = PDDocument.load(baos.toByteArray());
          page = document.getPage(0);
          font = (PDType0Font) page.getResources().getFont(COSName.getPDFName("F1"));
          stream = new PDPageContentStream(document, page,
                  PDPageContentStream.AppendMode.APPEND, true);
          stream.beginText();
          stream.setFont(font, 20);
          stream.newLineAtOffset(250, 600);
          stream.showText(text2);
          stream.endText();
          stream.close();
          baos.reset();
          document.save(baos);
          document.close();
          // Test that both texts are there
          document = PDDocument.load(baos.toByteArray());
          PDFTextStripper stripper = new PDFTextStripper();
          String extractedText = stripper.getText(document);
          assertEquals(text1 + " " + text2, extractedText.trim());
          document.close();
      }


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


