https://bugs.kde.org/show_bug.cgi?id=506673
Bug ID: 506673
Summary: If a flipped document is scanned and then flipped
manually, the OCR produces garbled text
Classification: Applications
Product: Skanpage
Version First Reported In: 25.04.1
Platform: Other
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: general
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
Created attachment 183010
--> https://bugs.kde.org/attachment.cgi?id=183010&action=edit
A sample document scanned upside down at the scanner, then flipped in
Skanpage and exported with OCR enabled
SUMMARY
If I scan a document flipped at the scanner side, and then flip the pages in
Skanpage before exporting to PDF with OCR enabled, then the OCR seems to
produce garbled text.
Based on the output, my speculation is that the OCR runs on the unflipped
pages, which are then misinterpreted, resulting in garbled text. Consistent
with this, when I scan the same document right side up and then export it
with OCR, the issue does not occur.
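The speculation above can be illustrated with a minimal Python sketch. This is
not Skanpage code: `rotate_180` and the small integer grid standing in for a
page are hypothetical. It only shows that the page as delivered by the scanner
and the page as displayed after flipping differ by a 180-degree rotation, so
OCR results computed on one would not line up with the other.

```python
def rotate_180(pixels):
    """Return a copy of a row-major pixel grid rotated by 180 degrees."""
    # Reverse the row order, then reverse each row.
    return [row[::-1] for row in pixels[::-1]]


# Hypothetical stand-in for a page as delivered by the scanner (upside down).
scanned = [[1, 2, 3],
           [4, 5, 6]]

# What the user sees, and what the PDF shows, after flipping the page.
displayed = rotate_180(scanned)

# If OCR were given `scanned` instead of `displayed`, its input would not
# match the exported page: the two grids differ unless the page is symmetric.
ocr_input_matches_export = (scanned == displayed)

# Rotating twice restores the original orientation.
restored = rotate_180(displayed)
```

In this sketch `ocr_input_matches_export` is false, which corresponds to the
suspected mismatch between the OCR input and the exported page.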
I have attached a sample PDF which I scanned upside down and ran the OCR on to
show what I mean.
STEPS TO REPRODUCE
1. Scan a document page upside down
2. Flip the page in Skanpage
3. Export the page with OCR enabled
4. Open the page in a PDF reader (Okular is what I used), select some text,
copy and paste into a text editor
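To make the check in step 4 less subjective, the pasted text could be compared
with the text visible on the page. A minimal sketch using Python's standard
`difflib` module follows; the names `text_similarity` and `is_garbled` and the
0.5 threshold are assumptions chosen for illustration, not part of Skanpage or
Okular.

```python
from difflib import SequenceMatcher


def _normalise(text):
    """Collapse runs of whitespace and lowercase the text."""
    return " ".join(text.split()).lower()


def text_similarity(expected, extracted):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, _normalise(expected), _normalise(extracted)).ratio()


def is_garbled(expected, extracted, threshold=0.5):
    """Treat the extracted text as garbled if similarity is below the threshold.

    The threshold is an arbitrary assumption; tune it for real documents.
    """
    return text_similarity(expected, extracted) < threshold
```

Usage: type a line as it appears on the page into `expected`, paste the text
copied from the PDF reader into `extracted`, and call
`is_garbled(expected, extracted)`.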
OBSERVED RESULT
The text is completely garbled and does not match what the exported PDF
displays.
EXPECTED RESULT
The text should correspond to what the exported PDF displays.
SOFTWARE/OS VERSIONS
Linux/KDE Plasma:
Linux Kernel 6.6.90-1-MANJARO (64-bit)
KDE Plasma Version: 6.3.5
KDE Frameworks Version: 6.14.0
Qt Version: 6.9.0
ADDITIONAL INFORMATION
I am using Tesseract 5.5.0 and Wayland. The scanner I am using for testing is
an HP Officejet Pro 8610/8620.