On 15.05.2019 at 16:00, Slava G wrote:
But seems that in PDFBox 2.0.15 it's already fixed as, when I run tika-app
No, it's not fixed. The cause is a corrupt ToUnicode stream. Fixed in
https://issues.apache.org/jira/browse/PDFBOX-4550

Try a snapshot within a few hours:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.16-SNAPSHOT/

Tilman

