On Saturday, 7 April 2012, at 04:51:58, Ihar `Philips` Filipau wrote:
> On 4/6/12, Albert Astals Cid <[email protected]> wrote:
> > On Sunday, 1 April 2012, at 11:57:59, Ihar `Philips` Filipau wrote:
> >> Add version to produced XML file.
> >
> > This needs an update to the dtd too, doesn't it?
>
> No Clue. Not really an XML specialist. My XML reader (libxml2 based)
> has optional DTD validation which I have never used. Otherwise, I have
> no idea why DTD is even needed - to me it kind of defies purpose of
> XML.
>
> Considering that Googling revealed about 7 distinctly different
> pdf2xml.dtd's, I think the best change in the area could have been
> *removal* of the DTD. Or at least renaming it into something else, if
> it is really needed. But that is too much of a change.

There is only one pdf2xml.dtd for pdftohtml: ours.

> Now bit more seriously. Is it possible to extract PDF file properties
> (producer, date, etc) in some easier way, than what is present in the
> pdfinfo tool? It uses the PDFDoc::getDocInfo() to access the
> dictionary and then parses the data ... well, pretty much manually.
> Manually assembling unicode characters, surrogate pairs, UnicodeMap
> and all. If poppler has a method to parse the data for me, then I
> would love to include the info into the XML output too. If no, then
> let it be.
>
> P.S. The patch for the poppler version information in XML and DTD attached.

Committed.

Albert
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
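
For readers unfamiliar with the "manual" decoding mentioned in the quoted question, here is a minimal Python sketch of what assembling UTF-16 code units and surrogate pairs from a PDF text string involves. It is not poppler code and does not use poppler's API; the function name `decode_pdf_text_string` is hypothetical. It assumes the raw bytes of an Info dictionary value are already in hand, and it approximates PDFDocEncoding with Latin-1, which is not exact for every byte.

```python
def decode_pdf_text_string(raw: bytes) -> str:
    """Decode a PDF text string such as a Producer or Title value.

    Hypothetical helper for illustration only; not part of poppler.
    """
    # A leading FE FF byte-order mark signals UTF-16BE.
    if raw.startswith(b"\xfe\xff"):
        body = raw[2:]
        # Split into big-endian 16-bit code units (a trailing odd byte is dropped).
        units = [int.from_bytes(body[i:i + 2], "big")
                 for i in range(0, len(body) - 1, 2)]
        chars = []
        i = 0
        while i < len(units):
            u = units[i]
            is_high = 0xD800 <= u <= 0xDBFF
            has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
            if is_high and has_low:
                # Combine a high/low surrogate pair into one code point.
                cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
                chars.append(chr(cp))
                i += 2
            elif 0xD800 <= u <= 0xDFFF:
                # Unpaired surrogate: substitute the replacement character.
                chars.append("\ufffd")
                i += 1
            else:
                chars.append(chr(u))
                i += 1
        return "".join(chars)
    # Assumption: no BOM means PDFDocEncoding, approximated here as Latin-1.
    return raw.decode("latin-1")
```

In practice `raw[2:].decode("utf-16-be", errors="replace")` does the same job for the UTF-16 branch; the loop is spelled out only to show the surrogate-pair step.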
