Package: python3-pyocr
Version: 0.3.0-1
Severity: normal

Dear Maintainer,
while experimenting a bit with OCR, I stumbled upon pyocr. Unfortunately, I
cannot get it to work, at least with the tesseract backend: it seems to expect
a different tesseract version, one that uses different command-line arguments:

~~~
$ python3
Python 3.7.3 (default, Mar 26 2019, 07:25:18)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyocr
>>> import pyocr.builders
>>> from PIL import Image
>>> captcha=Image.open("captcha.png")
>>> tools = pyocr.get_available_tools()
>>> tools
[<module 'pyocr.tesseract' from '/usr/lib/python3/dist-packages/pyocr/tesseract.py'>]
>>> tool = tools[0]
>>> txt = tool.image_to_string(captcha, builder=pyocr.builders.TextBuilder())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pyocr/tesseract.py", line 291, in image_to_string
    raise TesseractError(status, errors)
pyocr.tesseract.TesseractError: (1, b"Error, unknown command line argument '-psm'\n")
>>>
~~~

The Cuneiform backend works nicely.

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.19.0-4-amd64 (SMP w/4 CPU cores)
Kernel taint flags: TAINT_WARN, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages python3-pyocr depends on:
ii  python3      3.7.3-1
ii  python3-pil  5.4.1-1

Versions of packages python3-pyocr recommends:
ii  cuneiform      1.1.0+dfsg-7
ii  tesseract-ocr  4.0.0-2

python3-pyocr suggests no packages.

-- no debconf information
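
Additional note: the traceback shows tesseract-ocr 4.0.0 rejecting the single-dash `-psm` option; Tesseract 4 spells the page segmentation mode option `--psm`. Below is a minimal sketch of how a wrapper might pick the spelling from the detected version. It assumes that every release before 4.0 accepts `-psm`, which I have not verified for each 3.x release. The helpers `psm_flag` and `build_command` are hypothetical names used for illustration only; they are not pyocr's API and not necessarily how it should be fixed upstream.

```python
def psm_flag(version):
    """Return the page-segmentation-mode flag for a Tesseract version tuple.

    Assumption: releases before 4.0 take '-psm', 4.0 and later take '--psm'.
    """
    major = version[0]
    return "-psm" if major < 4 else "--psm"


def build_command(image_path, output_base, version, psm=3):
    """Assemble a tesseract argv list (hypothetical helper, not pyocr's API)."""
    return ["tesseract", image_path, output_base, psm_flag(version), str(psm)]


if __name__ == "__main__":
    # Version tuple matching the tesseract-ocr 4.0.0-2 package from this report.
    print(build_command("captcha.png", "out", (4, 0, 0)))
    # prints ['tesseract', 'captcha.png', 'out', '--psm', '3']
```

Taking the version as a tuple keeps the sketch independent of how the version is obtained, for example by parsing the output of `tesseract --version`.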