Never mind: I'll just convert the PDF to EPUB and edit the HTML files it contains.
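
In case it helps anyone doing the same, here is a minimal sketch of that approach in Python, using only the standard library. It treats the EPUB as the ZIP archive it is, applies plain string replacements to the HTML/XHTML files inside, and writes a new EPUB. The function name fix_epub, the FIXES table and its sample typo are hypothetical, and the sketch assumes the HTML files are UTF-8 encoded, which may not hold for every EPUB.

```python
import zipfile

# Hypothetical corrections table: OCR typo -> intended text.
FIXES = {"rnaison": "maison"}

HTML_SUFFIXES = (".html", ".xhtml", ".htm")


def fix_epub(src, dst, fixes):
    """Copy the EPUB at src to dst, applying string replacements to HTML members."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # infolist() keeps the archive order, so "mimetype" stays first
        # if it was first in the source file.
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename.lower().endswith(HTML_SUFFIXES):
                # Assumption: the HTML files are UTF-8 encoded.
                text = data.decode("utf-8")
                for wrong, right in fixes.items():
                    text = text.replace(wrong, right)
                data = text.encode("utf-8")
            # The EPUB container expects "mimetype" to be stored uncompressed.
            if info.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            zout.writestr(info.filename, data, compress_type=method)
```

Calling fix_epub("book.epub", "book-fixed.epub", FIXES) would leave the original untouched and write the corrected copy next to it (both file names are placeholders). Plain string replacement is blunt: a typo that also appears inside a tag or attribute would be changed too, so the table is best kept to unambiguous words.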

On 29/08/2024 21:08, Gilles wrote:
Hello,

I noticed some typos in the text layer that OCR added to a "bitmap" PDF, i.e. one whose pages are actually scanned images.

I first tried opening the EPUB generated by ABBYY FineReader, but LibreOffice couldn't open it at all. Sigil could, after showing an error message, but it lacks a French dictionary for spell-checking (as far as I can tell).

As an alternative, pdftotext or mutool (convert) can extract the text layer from such a PDF, but can they put it back after I have fixed the typos?

Thank you.

