2007/8/19, Laurent Aguerreche <[EMAIL PROTECTED]>:
> On Sunday, 19 August 2007 at 22:34 +0200, Martin Schröder wrote:
> > It's a ligature. It's a feature. :-)
>
> :-/
>
> So with the DejaVu fonts, the "ﬀ" character looks rather ugly, and
> emacs22 does not display it at all (just an empty rectangle). \o/
>
> But the real problem is that it is impossible to recognize:
> - "ﬁ" as "fi"
> - "ﬀ" as "ff"
>
> Would it be possible to add a new parameter to pdftotext to make it
> ignore ligatures but still export in UTF-8?

Since 1.30.0, pdftex can disable all ligatures for a font with
\pdfnoligatures. But this produces inferior typesetting, and no, there
is no switch to disable ligatures for all fonts.

But it should be easy to convert "ﬀ" to "ff" with the help of
sed/awk/..., i.e. by massaging the output of pdftotext.

Best
   Martin
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
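
The post-processing suggested above could look roughly like the following. This is only a sketch in Python rather than sed/awk, and it assumes the text is UTF-8 and that only the Latin ligatures from the Unicode "Alphabetic Presentation Forms" block (U+FB00 to U+FB06) need expanding. The names LIGATURES, expand_ligatures and filter_stream are made up for this example; they are not part of pdftotext or poppler.

```python
# Expand Latin ligature code points into plain letters.
# Assumption: only U+FB00..U+FB06 matter for the text in question.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb05": "st",  # long s + t, written out with a plain "s"
    "\ufb06": "st",
}

# str.translate wants a table keyed by code point; maketrans builds it.
_TABLE = str.maketrans(LIGATURES)


def expand_ligatures(text: str) -> str:
    """Return text with each ligature replaced by its component letters."""
    return text.translate(_TABLE)


def filter_stream(src, dst) -> None:
    """Copy lines from src to dst, expanding ligatures on the way."""
    for line in src:
        dst.write(expand_ligatures(line))
```

To use it as a filter, one would call filter_stream(sys.stdin, sys.stdout) at the end of the script and pipe the pdftotext output through it, e.g. "pdftotext -enc UTF-8 input.pdf - | python3 expand_ligatures.py" (input.pdf and the script name are placeholders). An explicit table is used instead of Unicode NFKC normalization because NFKC would also rewrite other characters, such as superscripts, which is probably not wanted here.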
