On 24/4/25 10:31, Max Nikulin wrote:

> By the way, PDF files may be tagged for screen readers. Is there a dedicated structure to explicitly mark tables? It would be the best source for data extraction.


ISO 14289 (PDF/UA) is the accessibility standard for PDF. It builds on "Tagged PDF", where semantic information, including table structure (<Table>, <TR>, <TH>, <TD>), is embedded in a logical structure tree that is kept separate from the page content.

You can download it for free at https://pdfa.org/resource/iso-14289-pdfua/

Whether your PDF generator uses it is another matter, as is whether your PDF reading module can handle it.
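
To give an idea of what extraction from the structure tree could look like, here is a rough Python sketch. It does not use any real PDF library: it assumes the reading module has already turned the structure tree into nested dicts, with "S" holding the structure type and "K" the list of children, and with plain strings standing in for cell text. The names find_tables, rows_of, cell_text and EXAMPLE_TREE are made up for illustration. In a real file the cell text lives in the page content and has to be resolved through marked-content references, which this sketch skips entirely.

```python
from typing import Iterator, List

# Hypothetical in-memory model of a tagged PDF structure tree.
# "S" = structure type, "K" = children; leaf strings stand in for text
# that a real reader would resolve from marked-content references.
EXAMPLE_TREE = {"S": "Document", "K": [
    {"S": "P", "K": ["Intro"]},
    {"S": "Table", "K": [
        {"S": "THead", "K": [
            {"S": "TR", "K": [{"S": "TH", "K": ["Name"]},
                              {"S": "TH", "K": ["Qty"]}]}]},
        {"S": "TBody", "K": [
            {"S": "TR", "K": [{"S": "TD", "K": ["Apple"]},
                              {"S": "TD", "K": [{"S": "P", "K": ["3"]}]}]}]},
    ]},
]}


def find_tables(node: dict) -> Iterator[dict]:
    """Yield every top-level Table element, depth first."""
    if node.get("S") == "Table":
        yield node
        return  # tables nested inside cells are not reported separately
    for kid in node.get("K", []):
        if isinstance(kid, dict):
            yield from find_tables(kid)


def cell_text(node) -> str:
    """Join all leaf strings found below a cell."""
    if isinstance(node, str):
        return node
    if isinstance(node, dict):
        parts = [cell_text(kid) for kid in node.get("K", [])]
        return " ".join(p for p in parts if p)
    return ""


def rows_of(table: dict) -> List[List[str]]:
    """Collect the TR rows of a table, looking through THead/TBody/TFoot."""
    rows: List[List[str]] = []

    def walk(node: dict) -> None:
        for kid in node.get("K", []):
            if not isinstance(kid, dict):
                continue
            if kid.get("S") == "TR":
                rows.append([cell_text(cell) for cell in kid.get("K", [])
                             if isinstance(cell, dict)
                             and cell.get("S") in ("TH", "TD")])
            else:
                walk(kid)  # row groups such as THead or TBody

    walk(table)
    return rows


if __name__ == "__main__":
    for table in find_tables(EXAMPLE_TREE):
        for row in rows_of(table):
            print(row)
```

The point is only that, once the tags are there, rows and cells come out by walking the tree instead of guessing from text positions. Header cells could be kept apart from data cells by looking at TH versus TD.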
