Package: ocrodjvu
Version: 0.4.2-1
Severity: normal

When processing a copy of

    http://fleksem.klf.uw.edu.pl/~jsbien/tmp/Trotz1/Trotz.djvu

with

    ocrodjvu --language deu-f --render all -o Troc_deu-f.djvu --word-segmentation uax29 Trotz1/Trotz.djvu

ocrodjvu crashed after 14 hours with the message:

--8<---------------cut here---------------start------------->8---
- Page #1284
ocroscript: /usr/share/ocropus/scripts//lib/hocr.lua:28: rectangle parsing error
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.5/threading.py", line 446, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 443, in page_thread
    result = self.process_page(page)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 428, in process_page
    html_file.close()
  File "/usr/lib/python2.5/contextlib.py", line 33, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/share/ocrodjvu/lib/_ocrodjvu.py", line 189, in recognize
    ocropus.wait()
  File "/usr/share/ocrodjvu/lib/ipc.py", line 58, in wait
    raise CalledProcessError(return_code, self.__command)
CalledProcessError: Command 'ocroscript' returned non-zero exit status 1
--8<---------------cut here---------------end--------------->8---

I have several wishlist items related to the problem:

1. There should be a way to preserve ocrodjvu.djvused in case of a crash.

2. The user should be able to choose where the debugging output is stored.
   I use a rather small system partition, so debugging a large ocrodjvu job
   requires splitting it into smaller ones, which is obviously cumbersome.

3. If the temporary files are preserved, it would be useful to be able to
   resume processing.

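To make the three wishes concrete, here is a minimal sketch in plain Python of how they could fit together. It is not ocrodjvu code: `scratch_dir`, `process_pages`, the `recognize` callback and the `NNNNNN.out` file naming are all hypothetical names invented for this illustration.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_dir(parent=None, keep_on_error=True):
    """Create a working directory under `parent` (hypothetical helper).

    The directory is removed on success; on failure it is kept when
    `keep_on_error` is true, so its contents can be inspected or reused.
    """
    path = tempfile.mkdtemp(prefix="ocr-job.", dir=parent)
    try:
        yield path
    except BaseException:
        if not keep_on_error:
            shutil.rmtree(path, ignore_errors=True)
        raise
    else:
        shutil.rmtree(path)


def process_pages(pages, recognize, workdir):
    """Run `recognize(page)` for each page, skipping pages already done.

    Each result is stored as <workdir>/<page>.out, so a later run pointed
    at the same directory resumes where the previous one stopped.
    """
    results = {}
    for page in pages:
        out = os.path.join(workdir, "%06d.out" % page)
        if os.path.exists(out):
            # Result left behind by an earlier (possibly crashed) run.
            with open(out) as f:
                results[page] = f.read()
            continue
        text = recognize(page)
        tmp = out + ".part"
        with open(tmp, "w") as f:
            f.write(text)
        os.replace(tmp, out)  # rename last, so no half-written result is reused
        results[page] = text
    return results
```

With something along these lines, a job that fails on page 1284 would leave the results for the earlier pages in a directory on a partition of the user's choice, and a second run given the same directory would only need to redo the remaining pages.
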
Best regards

Janusz

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 2.6.32-trunk-686 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages ocrodjvu depends on:
ii  djvulibre-bin    3.5.22-8  Utilities for the DjVu image forma
ii  python           2.5.4-9   An interactive high-level object-o
ii  python-argparse  1.1-1     optparse-inspired command-line par
ii  python-djvu      0.1.17-1  Python support for the DjVu image
ii  python-lxml      2.2.6-1   pythonic binding for the libxml2 a
ii  python-support   1.0.6.1   automated rebuilding support for P

Versions of packages ocrodjvu recommends:
ii  ocropus        0.3.1-2  document analysis and OCR system
ii  python-pyicu   0.9-2    Python extension wrapping the ICU
ii  tesseract-ocr  2.04-2   Command line OCR tool

Versions of packages ocrodjvu suggests:
pn  cuneiform  <none>  (no description available)

-- no debconf information

-- 
dr hab. Janusz S. Bien, prof. UW - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - Warsaw University (Department of Formal Linguistics)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/