On Monday 20 August 2007, Carl Worth wrote:
> On Sun, 19 Aug 2007 22:46:16 +0200, Laurent Aguerreche wrote:
> > But the real problem is that it is impossible to recognize:
> > - "ﬁ" as "fi" too
> > - "ﬀ" as "ff" too
> > Would it be possible to add a new parameter to pdftotext to make it
> > ignore ligatures but still export in UTF-8?
>
> It's quite preferable to have the ligatures in your PDF file.
>
> The bug to fix is that poppler should expand the ligatures to their
> normalized forms when extracting the text.

Actually, I disagree. If you have æ, do you want it expanded to ae too?
If not, why do you want that for the ff ligature?
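
For reference, here is a minimal sketch of how Unicode compatibility
normalization (NFKC) treats the two cases. It is not poppler code: it uses
Python's standard unicodedata module, and the helper name expand_ligatures
is made up for illustration. The presentation-form ligatures U+FB01 and
U+FB00 carry compatibility decompositions, so NFKC expands them, while æ
(U+00E6) is a letter with no decomposition and is left alone.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Hypothetical helper: apply NFKC compatibility normalization.

    Note that NFKC also folds other compatibility characters
    (full-width forms, superscripts, and so on), not only ligatures.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # U+FB01 (fi ligature) and U+FB00 (ff ligature) are presentation forms.
    print(expand_ligatures("\ufb01le"))
    print(expand_ligatures("o\ufb00er"))
    # U+00E6 (ae) has no compatibility decomposition.
    print(expand_ligatures("\u00e6ther"))
```

So "expand to normalized form" in this sense would split fi and ff but
would not touch æ.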

Albert

>
> That bug was first reported here:
>
>       Text extraction should expand ligatures to their normal form
>       https://bugs.freedesktop.org/show_bug.cgi?id=7002
>
> -Carl


_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
