I tried out Poppler 13 from Ubuntu 11.10 and I get the same results. As far as I understand, if I look up an object in XRef using fetch() and that object lives in an object stream, XRef decompresses the object and returns it to me, so I never even need to know that it was compressed in the first place. Is that right? If things don't work this way, what approach should I take?
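To make clear what I expect fetch() to be doing for me, here is a minimal sketch of how I understand the layout of an object stream from the PDF specification: once the stream is decoded, it starts with /N pairs of integers (object number, byte offset relative to /First), followed by the objects themselves. This is not Poppler code. `PackedObject` and `splitObjectStream` are hypothetical names of my own, and the sketch assumes the stream has already been decompressed and that the offsets are in increasing order.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One object recovered from a decoded object stream (hypothetical type).
struct PackedObject {
    int number;        // object number; the generation is implicitly 0
    std::string body;  // raw, unparsed bytes of the object
};

// Splits the *decoded* payload of an object stream into its objects.
// `n` and `first` are the /N and /First entries of the stream dictionary.
// Assumption: offsets appear in increasing order.
std::vector<PackedObject> splitObjectStream(const std::string &decoded,
                                            int n, std::size_t first) {
    if (n < 0 || first > decoded.size())
        throw std::runtime_error("bad /N or /First");

    // The header holds N pairs: object number, offset relative to /First.
    std::istringstream header(decoded.substr(0, first));
    std::vector<int> numbers(n);
    std::vector<std::size_t> offsets(n);
    for (int i = 0; i < n; ++i) {
        if (!(header >> numbers[i] >> offsets[i]))
            throw std::runtime_error("truncated object stream header");
        if (first + offsets[i] > decoded.size())
            throw std::runtime_error("offset past end of stream");
    }

    std::vector<PackedObject> objects;
    for (int i = 0; i < n; ++i) {
        std::size_t begin = first + offsets[i];
        // Each object ends where the next one begins (or at end of data).
        std::size_t end = (i + 1 < n) ? first + offsets[i + 1] : decoded.size();
        if (end < begin)
            throw std::runtime_error("offsets not increasing");
        objects.push_back({numbers[i], decoded.substr(begin, end - begin)});
    }
    return objects;
}
```

Given a decoded payload such as `11 0 12 9 <</A 1>> [1 2 3]` with /N 2 and /First 10, this yields object 11 and object 12 as separate raw bodies. My expectation is that fetch() hides an equivalent lookup, so that every object number listed in the header resolves to a non-null object.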
That being said, I tried this approach with both Poppler 7 and 13 and two PDF files with object streams. When I do an XRef->fetch() with generation number 0 and object number of an object in the object stream, I get a null object for all objects except the first one that is packed in the object stream. The first one isn't extracted fully. Is this a known issue?

Nedim

On Mon, 2011-10-31 at 11:12 -0700, Josh Richardson wrote:
> What kinds of objects are you interested in? I have a version of
> pdftohtml which I believe is not yet merged into the master repo that
> extracts images and fonts.
>
> --josh
>
> On 10/31/11 9:16 AM, "Nedim Srndic" <[email protected]> wrote:
>
> >Dear list,
> >
> >I am using the Poppler library (in the src/poppler folder, no bindings,
> >version 7 from the Ubuntu 10.10 repos) and would like to retrieve all
> >objects from a PDF file. Currently, I am running a loop on XRef and
> >getting all the non-null objects from it, but it doesn't seem to
> >retrieve objects from object streams. What solution would you propose
> >for this problem?
> >
> >Thanks,
> >Nedim Srndic
> >
> >_______________________________________________
> >poppler mailing list
> >[email protected]
> >http://lists.freedesktop.org/mailman/listinfo/poppler
