Dear list,

I use the Python bindings for Poppler (through GObject introspection) to
extract some metadata from PDF documents.

Here is a minimal script:


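  import os
  import sys

  import gi

  # Versions must be pinned before importing from gi.repository.
  gi.require_version('Poppler', '0.18')
  gi.require_version('Gst', '1.0')
  from gi.repository import Gst, Poppler

  # Gst is only used here to turn a local path into a file:// URI.
  Gst.init(sys.argv)

  pdf = "a.pdf"
  uri = Gst.filename_to_uri(os.path.abspath(pdf))

  # Poppler expects a URI, not a plain path; None means no password.
  doc = Poppler.Document.new_from_file(uri, None)
  title = doc.get_title()
  print(title)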


Is there a way to extract the /Lang value from the /Catalog dictionary?
(A PDF document with that entry is attached.)

I searched https://lazka.github.io/pgi-docs/, but I’m afraid I wasn’t
able to find anything that returns the language of the document.
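
For reference, the only stopgap I can think of bypasses Poppler entirely
and scans the raw file bytes for a /Lang entry. This is just a rough
sketch: find_lang is a hypothetical helper of mine (not a Poppler API),
and it assumes that the catalog is stored uncompressed (not inside an
object stream) and that the first /Lang in the file is the catalog's
rather than one on a structure element.

```python
import re

# Matches the literal-string form  /Lang (en-US)
# and the hex-string form          /Lang <656E2D5553>
_LANG_RE = re.compile(rb"/Lang\s*(?:\(([^)]*)\)|<([0-9A-Fa-f\s]*)>)")


def find_lang(data: bytes):
    """Return the first /Lang value found in raw PDF bytes, or None."""
    match = _LANG_RE.search(data)
    if match is None:
        return None
    literal, hexstr = match.groups()
    if literal is not None:
        raw = literal
    else:
        digits = re.sub(rb"\s", b"", hexstr).decode("ascii")
        if len(digits) % 2:
            # PDF pads an odd-length hex string with a trailing zero.
            digits += "0"
        raw = bytes.fromhex(digits)
    # PDF text strings are either UTF-16BE with a BOM or PDFDocEncoding;
    # language tags are plain ASCII, so latin-1 is a safe fallback.
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be")
    return raw.decode("latin-1")
```

Reading a.pdf in binary mode and passing its contents to find_lang would
then give the language tag as a string, or None if no /Lang entry is
visible in the raw bytes. I would much prefer a proper accessor in the
bindings, since a byte scan like this cannot see compressed objects.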

Many thanks for your help,

Pablo

Attachment: a.pdf
Description: Adobe PDF document
