Package: tesseract-ocr
Version: 4.1.1-2.1
Severity: normal
X-Debbugs-Cc: debbug.tesser...@sideload.33mail.com
I have this line in an old shell script:
$ tesseract <(convert "$jpgFile" +dither -colors 2 -normalize -resize 1
pbm:-) - -l eng
Today that line fails with this output:
=8<--
Error in fopenReadStream: file not found
Error in pixRead: image file not found: P4
Image file P4 cannot be read!
Error during processing.
=8<--
Because the command comes from an old shell script, it presumably
worked at some point. Version 4.1.1-2.1 of tesseract-ocr, however,
cannot handle shell-substituted files (bash process substitution,
i.e. the <(...) syntax).
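To illustrate what tesseract is actually handed in the failing command,
here is a minimal sketch; the helper name input_kind is made up for this
report. Bash replaces <(...) with a path (such as /dev/fd/63) that refers
to a pipe rather than a regular file, so the input cannot be seeked or
reopened. The error output above looks consistent with tesseract failing
to recognize the image and then treating the input as a text list of
image names, with the PBM magic number "P4" as the first name, but that
is an inference from the output, not something confirmed in the source.

```shell
# Hypothetical helper (not part of tesseract): report what kind of
# file a path refers to.
input_kind() {
    if [ -p "$1" ]; then
        echo "pipe"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "other"
    fi
}

# In bash, compare:
#   input_kind <(printf 'P4\n')
#   input_kind "$jpgFile"
```

The first call is expected to report a pipe and the second a regular
file, which matches the observation that only regular files work.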
This report actually covers two bugs:
1) tesseract-ocr fails to process shell-substituted files.
2) tesseract-ocr does not inform the user of the limitation. If there
is no intent to support shell-substituted files, the app should
detect when such a file is specified and print a plain-English
error message stating that shell-substituted files are unsupported.
The man page should also disclose this limitation, either in the
paragraph that covers the input file spec or in a new section
titled “LIMITATIONS” (or both). If, on the other hand, there is
intent to support substitution files, the man page should say so
explicitly.
Workaround:
If ImageMagick is executed separately to populate a regular file,
tesseract has no problem with using that regular file as input.
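The workaround can be sketched as follows. The function name
ocr_via_tempfile and the use of mktemp are illustrative choices, not
part of the original script; the tesseract arguments are the ones from
the failing command.

```shell
# Sketch of the workaround: copy the image stream from stdin into a
# regular temporary file, then run tesseract on that file.
# ocr_via_tempfile is a hypothetical name.
ocr_via_tempfile() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                  # materialize the stream as a regular file
    tesseract "$tmp" - -l eng     # same arguments as the original command
    status=$?
    rm -f "$tmp"                  # clean up whether or not OCR succeeded
    return "$status"
}

# Usage:
#   convert "$jpgFile" <options as in the script> pbm:- | ocr_via_tempfile
```

As far as I can tell, tesseract also accepts stdin as an input name,
which might avoid the temporary file; that is untested here.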
-- System Information:
Debian Release: 11.5
APT prefers stable-updates
APT policy: (990, 'stable-updates'), (990, 'stable-security'), (990,
'testing'), (990, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 5.10.0-19-amd64 (SMP w/2 CPU threads)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages tesseract-ocr depends on:
ii libarchive13 3.4.3-2+deb11u1
ii libc6 2.31-13+deb11u5
ii libcairo2 1.16.0-5
ii libfontconfig1 2.13.1-4.2
ii libgcc-s1 10.2.1-6
ii libglib2.0-0 2.66.8-1
ii libicu67 67.1-7
ii liblept5 1.79.0-1.1
ii libpango-1.0-0 1.46.2-3
ii libpangocairo-1.0-0 1.46.2-3
ii libpangoft2-1.0-0 1.46.2-3
ii libstdc++6 10.2.1-6
ii libtesseract4 4.1.1-2.1
ii tesseract-ocr-eng 1:4.00~git30-7274cfa-1.1
ii tesseract-ocr-osd 1:4.00~git30-7274cfa-1.1
tesseract-ocr recommends no packages.
tesseract-ocr suggests no packages.
-- no debconf information