On Monday 20 August 2007, Carl Worth wrote:
> On Sun, 19 Aug 2007 22:46:16 +0200, Laurent Aguerreche wrote:
> > But the real problem is that it is impossible to recognize:
> > - "ﬁ" as "fi" too
> > - "ﬀ" as "ff" too
> > Would it be possible to add a new parameter to pdftotext to make it
> > ignore ligatures but still export in UTF-8?
>
> It's quite preferable to have the ligatures in your PDF file.
>
> The bug to fix is that poppler should expand the ligatures to their
> normalized forms when extracting the text.

Actually, I disagree. If you have æ, do you want it expanded to ae too? If not, why do you want that for the ff ligature?

Albert

> That bug was first reported here:
>
>   Text extraction should expand ligatures to their normal form
>   https://bugs.freedesktop.org/show_bug.cgi?id=7002
>
> -Carl
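
To make the æ versus ff distinction concrete, here is a minimal sketch, assuming that "normalized form" means Unicode NFKC compatibility normalization. That is an assumption: the thread does not say which normalization poppler would apply. The helper name `expand_ligatures` is hypothetical and is not part of poppler or pdftotext.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Hypothetical helper: apply Unicode NFKC compatibility normalization.

    Assumption: NFKC stands in for the "normalized form" discussed in the
    thread. It is not poppler's actual behaviour.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # U+FB01 (fi ligature), U+FB00 (ff ligature), U+00E6 (letter ae)
    for sample in ("\ufb01", "\ufb00", "\u00e6"):
        print(f"U+{ord(sample):04X} -> {expand_ligatures(sample)!r}")
```

Under this assumption the presentation-form ligatures U+FB01 and U+FB00 expand to plain letters, while æ (U+00E6) is left alone, because Unicode treats it as a letter with no compatibility decomposition. Note that NFKC also rewrites other compatibility characters (superscript digits, for example), so a real extractor might prefer a narrower mapping limited to ligatures.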
