On Tue, 23 Jul 2019 15:11:44 +0000
Jeff Brown <[email protected]> wrote:

> Hi. I have been using OpenOffice for quite some time, and I am wondering if 
> there is an Apache product that would enable me to convert .pdf files to 
> .odt. Thanks! Jeff

If these are PDF files of any size, the best method is to use an OCR 
(Optical Character Recognition) application, which will convert them into a 
text format.  You will then have access to the text and can reformat and edit 
it as you require.  Many scanners come with OCR applications, and there are 
also free applications downloadable from the Internet.  I have used 
Tesseract, with gImageReader as a front end, with good success; I run it on 
Linux, but it is also available for Windows.
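
For anyone who prefers to script this rather than use a front end, here is a 
rough Python sketch of the same idea.  It is not the gImageReader workflow; 
it assumes that Poppler's pdftoppm and the tesseract command-line program are 
both installed and on the PATH, and the function names and the "page" file 
prefix are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def rasterize_cmd(pdf, out_prefix, dpi=300):
    """Command line for Poppler's pdftoppm: one PNG image per PDF page."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf), str(out_prefix)]


def ocr_cmd(image, out_base, lang="eng"):
    """Command line for Tesseract, which writes its result to <out_base>.txt."""
    return ["tesseract", str(image), str(out_base), "-l", lang]


def pdf_to_text(pdf, workdir, lang="eng"):
    """Rasterize every page, OCR each image, and return the combined text."""
    for tool in ("pdftoppm", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    # Writes page-<n>.png files into workdir (hypothetical "page" prefix).
    subprocess.run(rasterize_cmd(pdf, workdir / "page"), check=True)
    pages = []
    # Assumes the page numbers in the file names sort in page order.
    for image in sorted(workdir.glob("page-*.png")):
        subprocess.run(ocr_cmd(image, image.with_suffix(""), lang), check=True)
        pages.append(image.with_suffix(".txt").read_text(encoding="utf-8"))
    return "\n".join(pages)
```

Calling pdf_to_text("scan.pdf", "ocr_work") would then return plain text that 
can be pasted into Writer and saved as .odt.  300 dpi is only a commonly used 
starting point for the rasterizing step, not a requirement.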

It is important to be aware that the accuracy of the OCR process varies 
according to the quality of the scan and of the original; typically there will 
be a small number of recognition errors, so careful proofreading is essential, 
particularly if the information is numeric.
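
As an aid to that proofreading (not a replacement for it), a small script can 
point out tokens that mix digits with letters that are easily mistaken for 
digits.  The list of look-alike letters below is my own illustrative guess and 
is certainly incomplete; it will also flag some legitimate codes such as 
part numbers.

```python
import re

# Letters that are easily confused with digits (illustrative, incomplete set).
LOOKALIKES = "OoIlSsBZ"

# A word that contains at least one digit and at least one look-alike letter.
SUSPECT = re.compile(rf"\b(?=\w*\d)(?=\w*[{LOOKALIKES}])\w+\b")


def suspicious_tokens(text):
    """Return tokens worth a second look when proofreading OCR output."""
    return SUSPECT.findall(text)
```

Anything it reports still has to be checked by eye against the original 
document, and a clean report does not mean the numbers are right.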


-- 
Rory O'Farrell <[email protected]>
