Bug#962668: same error

2020-08-21 Thread yokota
Calibre uses "pdftohtml" to convert PDF files into other formats. Older versions of "pdftohtml" produce incorrect output around surrogate pair characters, which makes the Python lxml library choke. Use the newest "pdftohtml" to solve this problem. Install the newest "poppler-utils" package (0.85.0-2) from Debian unst
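
For illustration only, here is a minimal Python 3 sketch of why a stray surrogate trips up text handling. It is not Calibre's or lxml's code; the helper names try_encode and repair_surrogates are hypothetical. It assumes the bad output contains half of a surrogate pair on its own, which Python 3 refuses to encode as UTF-8 with the same "surrogates not allowed" wording reported in this thread.

```python
# A lone surrogate is half of a UTF-16 surrogate pair and is not valid UTF-8.
LONE_SURROGATE = "\ud83d"  # high surrogate without its low partner


def try_encode(text):
    """Return the UTF-8 bytes, or the error message if encoding fails."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError as exc:
        return str(exc)


def repair_surrogates(text):
    """Hypothetical cleanup: join split pairs, replace unpaired halves.

    Round-trips through UTF-16. "surrogatepass" lets surrogates through on
    encode; "replace" turns any unpaired one into U+FFFD on decode.
    """
    return text.encode("utf-16", "surrogatepass").decode("utf-16", "replace")


if __name__ == "__main__":
    print(try_encode(LONE_SURROGATE))  # message ends in "surrogates not allowed"
    print(try_encode(repair_surrogates("a" + LONE_SURROGATE)))
```

A cleanup like this only hides the symptom; upgrading "pdftohtml" as described above addresses the source of the bad output.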

Bug#962668: same error

2020-08-19 Thread Norbert Preining
The official version is still based on Python2, and the error message indicates problems with Python3, which completely changed character handling. Basically it means that some parts are not Python3 ready. If you can provide a small PDF that fails, please send it here or personally to me

Bug#962668: same error

2020-08-14 Thread Michael Meier
I've got exactly the same error (the one with the surrogates not allowed) with way too many PDFs. This has been going on for months, always using the newest calibre version in testing. Now I've just installed the official version as described in https://calibre-ebook.com/download_linux and that version does