> FYI: there's also an EPSDocumentGraphics2D in Apache XML Graphics
> Commons [1], i.e. as open source under the same license as PDFBox.
>
> [1] http://xmlgraphics.apache.org/commons/

Thanks, I will look into that as well.

> Usually you can't identify an isolated vector image inside a PDF as it
> may be interleaved with normal text. Only if the images are embedded as
> Form XObjects can you isolate them reliably. Or if the PDF is tagged, but
> PDFBox can't help you in that case, yet. Even if you can isolate it,
> PDFBox will need to be able to paint just the selected part of a page.

Well, Adobe Acrobat was able to detect the images with its "Export images" functionality, so I assume they are embedded somehow as XObjects. I noticed you have an ExtractImages class. Would I be able to modify it to extract vectors? Would I need it to give me a list of fill/stroke/path data points in order to extract them correctly?
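
To make the "list of path data points" idea concrete, here is a minimal sketch that uses only the standard java.awt.geom classes, on the assumption that the vector content ends up as Java2D shapes (both a PageDrawer subclass and EPSDocumentGraphics2D work against Graphics2D). PathDump and describe are hypothetical names for illustration and are not part of PDFBox.

```java
import java.awt.Shape;
import java.awt.geom.Path2D;
import java.awt.geom.PathIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: lists the segments of a Java2D shape.
class PathDump {

    /** Walks the outline of a shape and returns one line of text per path segment. */
    public static List<String> describe(Shape shape) {
        List<String> out = new ArrayList<String>();
        double[] c = new double[6]; // a segment carries at most three (x, y) pairs
        for (PathIterator it = shape.getPathIterator(null); !it.isDone(); it.next()) {
            switch (it.currentSegment(c)) {
                case PathIterator.SEG_MOVETO:
                    out.add("moveto " + c[0] + " " + c[1]);
                    break;
                case PathIterator.SEG_LINETO:
                    out.add("lineto " + c[0] + " " + c[1]);
                    break;
                case PathIterator.SEG_QUADTO:
                    out.add("quadto " + c[0] + " " + c[1] + " " + c[2] + " " + c[3]);
                    break;
                case PathIterator.SEG_CUBICTO:
                    out.add("curveto " + c[0] + " " + c[1] + " " + c[2] + " " + c[3]
                            + " " + c[4] + " " + c[5]);
                    break;
                case PathIterator.SEG_CLOSE:
                    out.add("closepath");
                    break;
                default:
                    break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A small triangle stands in for a path captured while a page is drawn.
        Path2D.Double tri = new Path2D.Double();
        tri.moveTo(0, 0);
        tri.lineTo(10, 0);
        tri.lineTo(5, 8);
        tri.closePath();
        for (String segment : describe(tri)) {
            System.out.println(segment);
        }
    }
}
```

Whether a given path is filled or stroked is not stored in the shape itself, so it would have to be recorded separately at the point where the fill or stroke call is intercepted.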
----------------------------------------
> Date: Tue, 3 Feb 2009 17:23:18 +0100
> From: [email protected]
> To: [email protected]
> Subject: Re: Extract vectors
>
> On 03.02.2009 17:07:29 Graeme Kidd wrote:
>>
>> Thanks for the suggestion,
>> I am a total beginner at this so any helpful advice is greatly appreciated.
>>
>> I suppose I could use something like this http://www.jibble.org/epsgraphics/
>> to save it as an EPS file.
>
> FYI: there's also an EPSDocumentGraphics2D in Apache XML Graphics
> Commons [1], i.e. as open source under the same license as PDFBox.
>
> [1] http://xmlgraphics.apache.org/commons/
>
>> The only problem I have so far is how to detect if the image is a
>> vector graphic in which case I can draw it then save it. Otherwise at the
>> moment I will just be saving the entire page as an EPS file.
>
> Usually you can't identify an isolated vector image inside a PDF as it
> may be interleaved with normal text. Only if the images are embedded as
> Form XObjects can you isolate them reliably. Or if the PDF is tagged, but
> PDFBox can't help you in that case, yet. Even if you can isolate it,
> PDFBox will need to be able to paint just the selected part of a page.
>
>> Thanks again for your help so far.
>>
>>
>> ----------------------------------------
>>> Date: Tue, 3 Feb 2009 09:04:33 -0500
>>> Subject: Re: Extract vectors
>>> From: [email protected]
>>> To: [email protected]; [email protected]
>>>
>>> You can extend the PageDrawer class and have it do something other than
>>> actually drawing ...
>>>
>>> I've extended it to draw a little differently and in .Net ... it's not a
>>> small undertaking, but is possible.
>>>
>>> On 2/3/09, Graeme Kidd wrote:
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I was just wondering if I could use PDFBox to extract vector graphics?
>>>>
>>>> Thanks.
>
>
>
> Jeremias Maerki
>
