Public bug reported:

pdftotext -htmlmeta omits metadata from the PDF catalog, whereas pdfinfo outputs all known values.
For example, the pdfinfo output:

  Title:          Titel
  Author:         Word
  Creator:        WordToPDF 2.4 build 127
  Producer:       AFPL Ghostscript 8.54
  CreationDate:   Fri Jul 2 09:14:02 2007
  ModDate:        Fri Jul 2 09:14:02 2007
  Tagged:         no
  Pages:          6
  Encrypted:      no
  Page size:      595 x 842 pts (A4)
  File size:      104664 bytes
  Optimized:      no
  PDF version:    1.3

In contrast, the meta section of the pdftotext -htmlmeta output:

  <head>
  <title>Titel</title>
  <meta name="Author" content="Word"/>
  <meta name="Creator" content="WordToPDF 2.4 build 127"/>
  <meta name="Producer" content="AFPL Ghostscript 8.54"/>
  <meta name="CreationDate" content=""/>
  </head>

The two outputs do not match: the CreationDate content is empty and some metadata (ModDate, for instance) is missing entirely.

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: poppler-utils 0.16.7-2ubuntu2
Uname: Linux 3.3.3-030303-generic x86_64
NonfreeKernelModules: vboxpci vboxnetadp vboxnetflt vboxdrv
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
Date: Wed May 2 15:44:06 2012
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64+mac (20110427.1)
ProcEnviron:
 LANGUAGE=en_US.UTF-8
 PATH=(custom, user)
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
SourcePackage: poppler
UpgradeStatus: Upgraded to oneiric on 2012-02-16 (76 days ago)

** Affects: poppler (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug oneiric

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/993292

Title:
  pdftotext -htmlmeta does output incomplete metadata, pdfinfo outputs them all
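
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of poppler: it parses the two outputs quoted in this report and lists the fields that pdfinfo shows but that are empty or absent in the -htmlmeta head section. The helper names (parse_pdfinfo, MetaCollector, missing_fields) and the selection of keys in INFO_KEYS are assumptions made for illustration; the sample strings are copied from the report.

```python
from html.parser import HTMLParser

# Document-information keys to compare. This selection is an assumption;
# pdfinfo also prints file properties (Pages, Page size, ...) that are
# not expected to appear as <meta> tags.
INFO_KEYS = ("Title", "Author", "Creator", "Producer", "CreationDate", "ModDate")

PDFINFO_SAMPLE = """\
Title:          Titel
Author:         Word
Creator:        WordToPDF 2.4 build 127
Producer:       AFPL Ghostscript 8.54
CreationDate:   Fri Jul 2 09:14:02 2007
ModDate:        Fri Jul 2 09:14:02 2007
Pages:          6
Page size:      595 x 842 pts (A4)
"""

HTML_SAMPLE = """\
<head>
<title>Titel</title>
<meta name="Author" content="Word"/>
<meta name="Creator" content="WordToPDF 2.4 build 127"/>
<meta name="Producer" content="AFPL Ghostscript 8.54"/>
<meta name="CreationDate" content=""/>
</head>
"""


def parse_pdfinfo(text):
    """Turn 'Key: value' lines, as printed by pdfinfo, into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so times like 09:14:02 stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


class MetaCollector(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if "name" in attr:
                self.fields[attr["name"]] = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fields["Title"] = self.fields.get("Title", "") + data


def missing_fields(pdfinfo_text, html_text, keys=INFO_KEYS):
    """Return keys that pdfinfo reports but the HTML head lacks or leaves empty."""
    info = parse_pdfinfo(pdfinfo_text)
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return [
        key for key in keys
        if info.get(key) and not collector.fields.get(key, "").strip()
    ]


if __name__ == "__main__":
    print(missing_fields(PDFINFO_SAMPLE, HTML_SAMPLE))
    # prints ['CreationDate', 'ModDate']
```

For the sample in this report, the sketch flags CreationDate (present but empty) and ModDate (absent). To check another file, the captured output of each tool can be passed in place of the sample strings.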