Hello,

Some time ago, I filed this bug report against Fedora 7:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=247393
and it seems that nothing has happened since.

I am posting to this mailing list because I consider this bug important:
it makes pdftotext completely useless for languages that use accented
characters (like French, in my case).

Furthermore, pdftotext is currently used by Tracker
( http://www.gnome.org/projects/tracker/ ) to extract text from PDF
files. Tracker then indexes the contents of the resulting text file,
which is assumed to match the contents of the original PDF file.
Since pdftotext destroys accented characters, Tracker does not index
words correctly and users cannot find them later.

In my bug report, you will find a PDF file with accented characters, plus
a LaTeX file from which another one can be generated. I also attached the
output I obtained with pdftotext.


I am not very interested in installing Poppler 0.6.x (and to be honest,
I am afraid of what such an install could break on my computer:
LaTeX? Some PDF reader? etc.), so I would like to know whether this bug
has already been fixed or, otherwise, to bring it to the attention of
the Poppler developers.


Regards,
Laurent Aguerreche.

