On 21/09/2021 13:34, Heinz-Jürgen Oertel wrote:
> I did some more research. The result: it is not "pdfinfo", it is ImageMagick's "convert". I mostly use JPG files converted to PDF by "convert".
Since your graphic originates as JPG, is there any particular reason why you cannot convert to EPS, and use .PSPIC to import it into groff? That way you would be using groff's built-in .psbb request, so no potentially unsafe call-out to pdfinfo is required, to get the bounding box.
> The example file "Selz.pdf"
>
> % pdfinfo Selz.pdf | hexdump -xc
> 0000000    6954    6c74    3a65    2020    2020    2020    2020    2020
> 0000000   T   i   t   l   e   :
> 0000010    5300    6500    6c00    7a00    0000    410a    7475    6f68

Looks like UTF-16 creeping into what is otherwise a UTF-8 (or ASCII) data stream.

> 0000010  \0   S  \0   e  \0   l  \0   z  \0  \0  \n   A   u   t   h   o
> 0000020    3a72    2020    2020    2020    2020    6820    7474    7370
> 0000020   r   :                                       h   t   t   p   s
> ...
>
> as one can see, there are \0 chars already in the title.
> Looking at the PDF:
> /Title <00530065006C007A0000>
So, here the title is encoded as an ASCII hex-digit representation of UTF-16BE text (high byte first: 00 53 is "S"). IIRC, that's a valid PDF encoding, but why is pdfinfo not decoding it in a format which is consistent with the rest of its output? Looks like a pdfinfo bug, to me.

-- 
Cheers,
Keith
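
For reference, the /Title value quoted above can be decoded by hand. The following is only a minimal sketch, not what pdfinfo does internally: it assumes the hex string holds 16-bit code units stored high byte first (00 53 is "S"), with no leading FE FF byte-order marker, and the helper name decode_pdf_hex_string is made up for illustration.

```python
def decode_pdf_hex_string(token: str) -> str:
    """Decode a PDF hex string such as '<0053...>' into text.

    Assumption: the payload is UTF-16 with the high byte first. A leading
    FE FF marker is dropped if present; trailing NULs are stripped.
    """
    # Drop the angle brackets and any whitespace between the hex digits.
    hex_digits = "".join(token.strip().strip("<>").split())
    # An odd number of digits is padded with a final 0.
    if len(hex_digits) % 2:
        hex_digits += "0"
    raw = bytes.fromhex(hex_digits)
    if raw.startswith(b"\xfe\xff"):
        raw = raw[2:]
    return raw.decode("utf-16-be").rstrip("\x00")


title = decode_pdf_hex_string("<00530065006C007A0000>")
print(title)
```

The trailing 0000 in the string is a NUL terminator that "convert" apparently wrote into the title; the sketch strips it rather than passing it through.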