Hi Kanru,

thank you for looking into this. As you may have noticed, PDF files of this kind are produced by a German bank for its account statements. Their argument for not looking further into this is the usual "Windows doesn't have this problem".
Kan-Ru Chen (陳侃如) wrote on 10/03/2014 19:45:
> Hi,
>
> Jörg-Volker Peetz <jvpe...@web.de> writes:
>
>> Package: mupdf
>> Version: 1.5-1+b1
>> Severity: normal
>>
>> Dear Kan-Ru Chen,
>>
>> the problem occurs for a special pdf-file (generated by iText v 2.0.8 on
>> a windows system, I suppose). I've attached the file. Searching for the
>> word "monat" does not find all occurrences of the word, but searching
>> for "onat" does. The pdf-file is displayed correctly, only searching
>> (and extracting the text) fails. It's a strange problem which, I have
>> to admit, also occurs with the poppler derived viewers and in
>> iceweasel. The only common library used by these tools is libfreetype6.
>
> I think the PDF file contains a incorrect /ToUnicode CMap which maps 'M'
> to 'j'. You could try to search "jonat" which will match the "monat"
> glyphs.
>

Can you tell me which component or library interprets this /ToUnicode CMap?

>> Under windows the search in Acrobat-reader works.
>
> I'm not sure how Acrobat-reader do that.
>
>> Do you have any idea what may be the problem?
>> Feel free to close the bug or re-assign it to another package.
>
> Maybe the pdf-file generating process has issues.
>
> Kanru
>

Best regards,
Jörg-Volker.
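
As a minimal sketch of the effect described above, assuming the page draws the glyphs for "Monat" and the embedded /ToUnicode CMap maps the 'M' glyph to 'j': the glyph list, the two mapping tables and the helper functions below are made up for illustration and are not taken from the attached file or from any PDF library. It only shows why the page can render correctly while a case-insensitive search for "monat" fails and "onat" or "jonat" succeed.

```python
# Hypothetical glyphs drawn on the page for the word "Monat".
GLYPHS = ["M", "o", "n", "a", "t"]

# What a correct /ToUnicode mapping would say, and the suspected broken one
# in which the 'M' glyph is mapped to the character 'j' (assumption).
correct_tounicode = {g: g for g in GLYPHS}
broken_tounicode = dict(correct_tounicode, M="j")


def extract_text(glyphs, tounicode):
    """Turn drawn glyphs into searchable text via the ToUnicode table."""
    return "".join(tounicode[g] for g in glyphs)


def search(text, needle):
    """Case-insensitive substring search, as a viewer might do it."""
    return needle.lower() in text.lower()


extracted = extract_text(GLYPHS, broken_tounicode)
for needle in ("monat", "onat", "jonat"):
    print(needle, search(extracted, needle))
```

Rendering uses the glyph outlines, so the display is unaffected; only searching and text extraction go through the mapping table, which would fit the symptoms seen in mupdf, the poppler viewers and iceweasel alike.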