Package: poppler-utils
Version: 22.12.0-2+b1
Severity: minor
X-Debbugs-Cc: debbug.poppler-ut...@sideload.33mail.com

The pdftocairo man page starts with:

> NAME
> pdftocairo - Portable Document Format (PDF) to
> PNG/JPEG/TIFF/PDF/PS/EPS/SVG using cairo
> SYNOPSIS
> pdftocairo [options] PDF-file [output-file]
> DESCRIPTION
> pdftocairo converts Portable Document Format (PDF) files …

Bug ①: That BNF tells the user that they can simply run pdftocairo on
a PDF doc with no options (as the square brackets imply that the token
is not required). This immediately leaves the user wondering what
effect that would have. In reality, pdftocairo terminates with an
error. So the BNF needs to be fixed.

Bug ②: It’s not obvious from the man page that quality will be
altered. Consider the extraction_works.pdf sample that was attached
to:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1076283

The PDF is 2.1 MB. When pdfimages extracts the PNG file (which
involves no manipulation), the resulting PNG is about the same size,
as the only difference is metadata and overhead. But when
“pdftocairo -png” is used, the output PNG is about half the size:

===8<------------------------------
$ identify pdfimages_extraction_works-000.png
pdfimages_extraction_works-000.png PNG 2550x2452 2550x2452+0+0 8-bit sRGB 2130440B 0.000u 0:00.000
$ identify cairo_extraction_works-1.png
cairo_extraction_works-1.png PNG 1275x2100 1275x2100+0+0 8-bit sRGB 1312420B 0.000u 0:00.000
===8<------------------------------

The resolution was cut in half. Why is that? Some would say this fails
the principle of least astonishment. Is it that the PDF metadata
includes some parameters about paper size and resolution, and the
embedded image was much larger than necessary for the PDF’s rendering?
The man page says this:

> The image dimensions will depend on the PDF page size and the
> resolution.

That’s clear to careful and meticulous readers but a bit subtle, no? I
think misunderstanding by users can in part be attributed to the
mention of “converts” in the description: “pdftocairo converts
Portable Document Format (PDF) files”.
Mere conversion does not lead the user to expect manipulation. If it
said something like:

  “pdftocairo RENDERS output images with the size properties specified
  by the PDF page spec… Resolution does not necessarily match that of
  the source images contained in the PDF and may be increased or
  decreased.”

it might be clearer to users what to expect. Perhaps even better, it
would also be helpful if the output text informed the user of what
happened. E.g. “output image resolution was decreased by 51% on image
1 page 1, 26% on image 2 page 1, 35% on image 1 page 2, …” etc.

It is indeed useful that it generates output that matches the render
quality specified by the PDF. This enables us to repackage a PDF for
transmission such that size is not wasteful for a given quality. But
users could be made more clearly aware of that.

③ (enhancement) It might also be useful if users could specify a
“maintain source quality” option, whereby the output preserves the
internal image parameters. Though I hesitate to suggest this because I
realize that would only be sensible in situations where each page
contains exactly one image that consumes the whole page (like a
scanned doc). Nonetheless, I thought it would be worth mentioning.

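For illustration, here is a rough sketch of the arithmetic that may be
behind the numbers in bug ②. It is not taken from poppler's source. It
assumes the 150 PPI default that the man page documents for the -r
option, and it assumes a US Legal page (612x1008 points), which is only
inferred from the 1275x2100 output and was not read from the PDF. The
function names are made up for this example.

```python
# Sketch only. Assumptions: 150 PPI default (pdftocairo man page, -r
# option) and a 612x1008 pt US Legal page, inferred from the output size.

POINTS_PER_INCH = 72  # PDF user space unit: 1 point = 1/72 inch


def rendered_pixels(page_pt, ppi=150):
    """Pixel size of a rendered page: points / 72 * pixels per inch."""
    return tuple(round(side * ppi / POINTS_PER_INCH) for side in page_pt)


def ppi_to_match_width(image_width_px, page_width_pt):
    """Resolution at which the rendered page is as wide as the image."""
    return image_width_px * POINTS_PER_INCH / page_width_pt


legal_page_pt = (612, 1008)  # assumed page size: 8.5 x 14 inches

# Page size in pixels at the assumed default resolution.
print(rendered_pixels(legal_page_pt))

# Resolution needed for the page width to reach the 2550 px of the
# embedded image reported by identify.
print(ppi_to_match_width(2550, legal_page_pt[0]))
```

If those assumptions hold, the first result matches the 1275x2100
output above, and the second suggests that “pdftocairo -png -r 300”
would keep the 2550-pixel width. The output would still cover the whole
page rather than just the 2452-pixel-high embedded image, which is the
limitation mentioned under ③.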
-- System Information:
Debian Release: 12.6
  APT prefers stable-updates
  APT policy: (990, 'stable-updates'), (990, 'stable-security'), (990, 'stable'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.10.0-28-amd64 (SMP w/2 CPU threads)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages poppler-utils depends on:
ii  libc6          2.36-9+deb12u7
ii  libcairo2      1.16.0-7
ii  libfreetype6   2.12.1+dfsg-5+deb12u3
ii  liblcms2-2     2.14-2
ii  libpoppler126  22.12.0-2+b1
ii  libstdc++6     12.2.0-14

poppler-utils recommends no packages.

poppler-utils suggests no packages.

-- no debconf information