Tesseract/Tess4J is a good OCR combo. Tess4J uses PDFBox to convert PDF pages to images.

-----Original Message-----
From: Tilman Hausherr <[email protected]> 
Sent: Tuesday, 3 October 2023 10:05
To: [email protected]
Subject: Re: extract text from a password protected PDF

Well yes, OCR, obviously.

You could also look at the source code of ExtractText and decide how you want 
to handle the permissions 😂

Tilman


On 02.10.2023 19:37, Robert Rodini wrote:
> Hi,
> I have had great success with PDFBOX Extract.  That is until the supplier of 
> the PDF decided to password protect the file from extraction.  Does anyone 
> know of any tools or techniques (e.g. OCR) that might help me extract the 
> text?
> Thanks,
> Bob R
>


