On Thursday, 30 May 2013, at 00:12:12, Mihai Niculescu wrote:
> On 05/30/2013 12:01 AM, Albert Astals Cid wrote:
> > On Wednesday, 29 May 2013, at 23:57:44, Mihai Niculescu wrote:
> >> mail list included. Reply below.
> >>
> >> On 05/29/2013 11:39 PM, Albert Astals Cid wrote:
> >>> On Wednesday, 29 May 2013, at 21:54:43, Mihai Niculescu wrote:
> >>>> Hi,
> >>>>
> >>>> I am trying to get the bounding box of all content in a page (text,
> >>>> images, tables, etc.) in poppler, but I can’t figure this out.
> >>>>
> >>>> For example, I want to duplicate the result given by ghostscript:
> >>>> gs -sDEVICE=bbox golfer.ps
> >>>>
> >>>> prints out
> >>>>
> >>>> %%BoundingBox: 0 25 583 732
> >>>> %%HiResBoundingBox: 0.808497 25.009496 582.994503 731.809445
> >>>>
> >>>> How can this be done with poppler?
> >>>
> >>> Is that the "real" bounding box or one of the pdf boxes (crop, bleed,
> >>> etc.)?
> >>>
> >>> Cheers,
> >>>   Albert
> >>>>
> >>>> Thanks,
> >>>> Mihai
> >>>> _______________________________________________
> >>>> poppler mailing list
> >>>> [email protected]
> >>>> http://lists.freedesktop.org/mailman/listinfo/poppler
> >>
> >> Not the pdf boxes. I mean the union of all bounding boxes of all
> >> elements (text, images, tables, glyphs, etc.) in the pdf. Let me explain more.
> >>
> >> I tried to loop over all QList<TextBox> entries, but these do not include
> >> other glyphs in the pdf (Greek sigma, for summation in LaTeX), or maybe
> >> poppler does not see them as text:
> >>
> >>     QList<Poppler::TextBox*> wholetext = pdfPage->textList();
> >>     //float minX, maxX, minY, maxY;
> >>     // Running union of the bounding boxes of every text box on the page.
> >>     QRectF unitedTextbbox, textbbox;
> >>     for (int i = 0; i < wholetext.size(); ++i) {
> >>         Poppler::TextBox* textBox = wholetext.at(i);
> >>         textbbox = textBox->boundingBox();
> >>         if (i == 0) {
> >>             // The first box seeds the union.
> >>             unitedTextbbox = textbbox;
> >>         } else {
> >>             unitedTextbbox = unitedTextbbox.united(textbbox);
> >>         }
> >>     }
> >>
> >> This works great when there is simple text in the pdf, but when there are
> >> other symbols it does not. I need something like the example above but
> >> to include all elements in the pdf. I'll go and use only poppler
> >> (without the qt4 wrapper) if I can have this.
> >
> > I can't think of how to get what you want easily, to be honest.
> >
> > As a quick solution you can render the page at a relatively low res and
> > work out the bbox from it: just check if the 4 corners are the same color
> > and iterate on that.
> >
> > Cheers,
> >   Albert
>
> That is an approach I don't like and hope to avoid. Can you give me
> some directions on how I should approach this problem?

Implementing an OutputDev and keeping track of the bounding boxes there is
the only way I can think of.

Cheers,
  Albert

> >> Cheers,
> >> Mihai
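
To make the "keep track of the bounding boxes" part concrete, here is a minimal sketch of the bookkeeping only. It is not poppler code: BBoxTracker and its methods are hypothetical names, and the assumption is that a class derived from poppler's OutputDev would call them from its drawing callbacks (text, paths, images), whose exact signatures differ between poppler versions. The sketch just shows how points can be mapped through a PDF-style transformation matrix and folded into one running box.

```cpp
#include <algorithm>
#include <array>
#include <limits>

// Hypothetical helper, not part of poppler: accumulates the union of every
// point reported to it, e.g. from the drawing callbacks of an output device.
struct BBoxTracker {
    double minX = std::numeric_limits<double>::infinity();
    double minY = std::numeric_limits<double>::infinity();
    double maxX = -std::numeric_limits<double>::infinity();
    double maxY = -std::numeric_limits<double>::infinity();

    // True until at least one point has been added.
    bool empty() const { return minX > maxX || minY > maxY; }

    // Fold one point that is already in page/device space into the box.
    void addPoint(double x, double y) {
        minX = std::min(minX, x);
        minY = std::min(minY, y);
        maxX = std::max(maxX, x);
        maxY = std::max(maxY, y);
    }

    // Fold one user-space point, mapped through a PDF-style matrix
    // [a b c d e f]: x' = a*x + c*y + e, y' = b*x + d*y + f.
    void addPoint(const std::array<double, 6>& ctm, double x, double y) {
        addPoint(ctm[0] * x + ctm[2] * y + ctm[4],
                 ctm[1] * x + ctm[3] * y + ctm[5]);
    }

    // Fold a user-space rectangle. All four corners are transformed because
    // a rotated or skewed matrix does not keep the rectangle axis-aligned.
    void addRect(const std::array<double, 6>& ctm,
                 double x0, double y0, double x1, double y1) {
        addPoint(ctm, x0, y0);
        addPoint(ctm, x1, y0);
        addPoint(ctm, x0, y1);
        addPoint(ctm, x1, y1);
    }
};
```

In such a design, an image callback would call addRect with the unit square and the current matrix, while path callbacks would add each path point. Curve control points give a box that is safe but possibly larger than the drawn curve.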

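
For comparison, the quick workaround mentioned earlier in the thread (render at low resolution, then work out the box from the pixels) can be sketched as below. This is only an illustration under stated assumptions: the page is assumed to be already rendered into a row-major buffer of packed colors, and PixelBox and contentBox are hypothetical names, not poppler API. If the four corners agree, their color is taken as the background and the tight box around all other pixels is returned.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive pixel coordinates of the content box (hypothetical type).
struct PixelBox {
    int left, top, right, bottom;
};

// Computes the tight box around all non-background pixels.
// pixels: row-major, width * height entries, one packed color per pixel.
// Returns false if the buffer is malformed, the four corners disagree
// (no obvious background color), or the page is blank.
bool contentBox(const std::vector<std::uint32_t>& pixels,
                int width, int height, PixelBox& out) {
    if (width <= 0 || height <= 0 ||
        pixels.size() != static_cast<std::size_t>(width) * height) {
        return false;
    }
    auto at = [&](int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    };

    // The background is assumed to be whatever color all four corners share.
    const std::uint32_t bg = at(0, 0);
    if (at(width - 1, 0) != bg || at(0, height - 1) != bg ||
        at(width - 1, height - 1) != bg) {
        return false;
    }

    PixelBox box = {width, height, -1, -1};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (at(x, y) == bg) continue;
            if (x < box.left) box.left = x;
            if (x > box.right) box.right = x;
            if (y < box.top) box.top = y;
            if (y > box.bottom) box.bottom = y;
        }
    }
    if (box.right < 0) return false;  // every pixel matched the background
    out = box;
    return true;
}
```

At a render resolution of r dpi, a pixel coordinate p corresponds to roughly p * 72 / r points, and the y axis has to be flipped to match the bottom-left origin that gs reports. The result is only as precise as the chosen resolution, which is one reason the OutputDev route is the cleaner one.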