Package: poppler-utils
Version: 0.26.5-2
Severity: normal

pdftotext has improved to generate ASCII instead of ligatures (like ﬀ),
but it still generates ligatures in some cases. For instance, if I run
pdflatex on the following file:

------------------------------------------------------------
\documentclass[11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\begin{document}
\thispagestyle{empty}
chiffre ê
\end{document}
------------------------------------------------------------

I get a PDF file, ff-lig1.pdf (attached), and running pdftotext on it
gives:

chiffre ê

This is OK. But if I run ps2pdf on this PDF file, I get another PDF
file, ff-lig1-gs.pdf (attached), and running pdftotext on it gives:

chiﬀre ê

i.e. with a ligature, which makes searching for text such as "chiffre"
unpredictable (the output is also less readable in a terminal with a
monospace font).

This problem doesn't occur if I replace "ê" by "é" in the LaTeX file!

Note that xpdf finds "chiffre" in both cases, so the bug seems to be
in pdftotext.

Moreover, pdftotext from poppler-utils 0.18.4-6 (wheezy) gives the
ligature on all 4 attached PDF files, so it seems that pdftotext has
been improved except in the particular case mentioned above.

-- System Information:
Debian Release: 8.0
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16.0-4-amd64 (SMP w/8 CPU cores)
Locale: LANG=POSIX, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages poppler-utils depends on:
ii  libc6         2.19-13
ii  libcairo2     1.14.0-2.1
ii  libfreetype6  2.5.2-2
ii  liblcms2-2    2.6-3+b3
ii  libpoppler46  0.26.5-2
ii  libstdc++6    4.9.2-5
ii  zlib1g        1:1.2.8.dfsg-2+b1

poppler-utils recommends no packages.

poppler-utils suggests no packages.

-- no debconf information
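
As a possible stopgap for searching, the ligatures in the pdftotext output can be expanded after the fact. The following is only a minimal sketch and not part of poppler: the name expand_ligatures and the choice to cover only the five Latin f-ligatures (U+FB00 to U+FB04) are assumptions made for illustration.

```python
import sys

# Hypothetical post-processing filter (an assumption, not poppler code):
# maps the Latin f-ligature code points U+FB00..U+FB04 to plain ASCII.
LIGATURES = str.maketrans({
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
})


def expand_ligatures(text: str) -> str:
    """Return text with the f-ligatures replaced by their ASCII letters."""
    return text.translate(LIGATURES)


if __name__ == "__main__":
    # Read extracted text on stdin, write the expanded text to stdout.
    sys.stdout.write(expand_ligatures(sys.stdin.read()))
```

Assuming the script is saved as expand_ligatures.py (a hypothetical name) and the locale is UTF-8, it could be used as "pdftotext ff-lig1-gs.pdf - | python3 expand_ligatures.py | grep chiffre". This only hides the symptom; the difference between the two PDF files would still need to be fixed in pdftotext itself.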
ff-lig1.pdf
Description: Adobe PDF document
ff-lig1-gs.pdf
Description: Adobe PDF document
ff-lig2.pdf
Description: Adobe PDF document
ff-lig2-gs.pdf
Description: Adobe PDF document