Package: ocrfeeder
Version: 0.8.1-4
Severity: important

After ocrfeeder has successfully OCR'ed Russian text, it is unable to export it to any of the formats and dumps the following errors to the console:

Export to ODT
=================
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 284, in exportToOdt
    self.exportToFormat('ODT', 'ODT')
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 281, in exportToFormat
    name)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/widgetModeler.py", line 605, in exportPagesWithGenerator
    document_generator.addPage(page)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 293, in addPage
    self.addBoxes(page_data.data_boxes)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 78, in addBoxes
    self.addBox(data_box)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 66, in addBox
    self.addText(data_box)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 251, in addText
    text = data_box.getText().decode('utf-8')
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
====================

Export to HTML
===================
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 298, in exportDialog
    self.EXPORT_FORMATS[format][1])
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 281, in exportToFormat
    name)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/widgetModeler.py", line 606, in exportPagesWithGenerator
    document_generator.save()
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 207, in save
    ''' % {'title': self.name, 'body': self.bodies[i], 'previous_page': previous_page, 'next_page': next_page}
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 137: ordinal not in range(128)
====================

Export to TXT
====================
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 298, in exportDialog
    self.EXPORT_FORMATS[format][1])
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 281, in exportToFormat
    name)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/widgetModeler.py", line 605, in exportPagesWithGenerator
    document_generator.addPage(page)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 364, in addPage
    self.addText(page.getTextFromBoxes())
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 361, in addText
    self.text += unicode(newText, 'utf-8')
TypeError: decoding Unicode is not supported
====================

-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.18.0-rc4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages ocrfeeder depends on:
ii  cuneiform             1.1.0+dfsg-7
ii  ghostscript           9.22~dfsg-2.1
ii  gir1.2-goocanvas-2.0  2.0.4-1
ii  gir1.2-gtk-3.0        3.22.30-2
ii  gir1.2-gtkspell3-3.0  3.0.9-2
ii  iso-codes             3.79-1
ii  python                2.7.15-3
ii  python-enchant        2.0.0-1
ii  python-gi             3.28.2-1+b1
ii  python-lxml           4.2.3-1
ii  python-pil            5.2.0-2
ii  python-reportlab      3.5.2-1
ii  python-sane           2.8.3-1+b2
ii  tesseract-ocr         4.00~git2844-607e8fd8-2

Versions of packages ocrfeeder recommends:
ii  unpaper  6.1-2+b2
pn  yelp     <none>

ocrfeeder suggests no packages.

-- no debconf information
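
A note on the tracebacks above: the ODT and TXT failures look consistent with the OCR text already being a unicode object when `.decode('utf-8')` and `unicode(newText, 'utf-8')` are applied to it. The HTML failure looks like UTF-8 byte strings being mixed with unicode in the `%` formatting. I have not confirmed this against the ocrfeeder sources. Below is a minimal sketch of a guard that decodes only when the value is a byte string. The helper name `to_text` is hypothetical and not part of ocrfeeder; the sketch is written so that it runs on both Python 2 and Python 3.

```python
import sys

# `unicode` exists only on Python 2; on Python 3 the text type is `str`.
# The conditional expression is lazy, so `unicode` is never looked up on Python 3.
text_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821


def to_text(value, encoding='utf-8'):
    """Return `value` as text, decoding it only if it is a byte string.

    Hypothetical helper for illustration; not an ocrfeeder function.
    """
    if isinstance(value, text_type):
        # Already text: decoding again is what fails in the tracebacks.
        return value
    if isinstance(value, bytes):
        return value.decode(encoding)
    raise TypeError('expected text or bytes, got %r' % type(value))
```

Under that assumption, the failing calls could become something like `text = to_text(data_box.getText())` and `self.text += to_text(newText)`, and the values interpolated into the HTML template could be passed through the same helper so that only text is mixed.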