On Sunday 19 August 2007 at 16:56 +0200, Albert Astals Cid wrote:
> --- Laurent Aguerreche wrote:
>
> > Hello,
>
> Hi

Hi,

> > Some time ago, I posted this bug report against Fedora 7:
> > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=247393
> > and it seems that nothing happened...
>
> You don't expect us to have a look at the Red Hat bugzilla, do you? ;-)
>
> > I am posting to this ML because I consider this bug important, since it
> > makes pdftotext completely useless with languages using accented
> > characters (like French in my case...).
>
> Good move, but for bugs it's better to use the poppler bugzilla at
> bugs.freedesktop.org

Yes, you're completely right! Sorry... :-/

> > Furthermore, pdftotext is currently used by Tracker
> > ( http://www.gnome.org/projects/tracker/ ) to extract text from a PDF
> > file. Tracker can then index the content of the text file, which is
> > assumed to be the content of the original PDF file.
> > Since pdftotext destroys accented characters, Tracker does not
> > correctly index words and users cannot find them later.
> >
> > In my bug report, you will find a PDF file with accented characters +
> > a LaTeX file to reproduce another one. I also added what I obtained
> > with pdftotext.
> >
> > I am not very interested in installing Poppler 0.6.x (and to be honest
> > I am afraid of what such an install could break on my computer:
> > LaTeX? Some PDF reader? etc.) so I would like to know if this bug has
> > been fixed, or to point it out to the Poppler developers otherwise.
>
> No, it has not been fixed and will not be fixed because it is not a bug
> in poppler. poppler handles accented characters without any problem; the
> problem you are facing is that the program you are using to generate the
> PDF is not generating it "correctly", so text extraction is not possible.
>
> You can write your very same demonstration text in oowriter, export to
> PDF from inside oowriter and see that pdftotext generates a correct
> output.

Accents are correctly handled, that's right (but spaces are all replaced
with "unbreakable" spaces!).

> You can then open your LaTeX PDF in Acrobat Reader and see that it
> cannot handle the accents correctly either.

Hum... That's wrong. My LaTeX-generated PDF opens perfectly with
acroread, evince, kpdf and xpdf. Why?!

> So blame LaTeX, not poppler.

OK, but if you know the problem, are the LaTeX developers aware of it
too? Do you know whether it is fixable?

Thanks for your answer,

Laurent.

> Albert
>
> > Regards,
> > Laurent Aguerreche.
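
For the non-breaking-space side effect mentioned above, here is a minimal sketch of a possible workaround on the indexing side. It is only an illustration: the function names `normalize_spaces` and `extract_text` are hypothetical, and nothing here claims to be how Tracker actually calls pdftotext. It assumes pdftotext is on the PATH, and it does not address the accent problem of the LaTeX-generated file.

```python
import subprocess

# U+00A0 is the "unbreakable" (non-breaking) space seen in the output.
NBSP = "\u00a0"


def normalize_spaces(text: str) -> str:
    """Replace non-breaking spaces with ordinary spaces before indexing."""
    return text.replace(NBSP, " ")


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return the text with plain spaces.

    "-enc UTF-8" selects the output encoding and "-" sends the text to
    standard output instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return normalize_spaces(result.stdout.decode("utf-8"))
```

An indexer could then split the result of `extract_text("some.pdf")` on whitespace as usual; the file name is just a placeholder.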
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
