Source: libexttextcat
Version: 3.2.0-2
The encoding of the Polish language model is broken. For example, line 25 of
pl.lm reads:
³ 1649
which should be:
ł 1649
You can recover the encoding by filtering the file through:
iconv -f UTF-8 -t Windows-1252 | iconv -f ISO-8859-2 -t UTF-8
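To illustrate why this round trip works, here is a minimal Python sketch of
the same transformation. It assumes the damage came from ISO-8859-2 bytes
being read as Windows-1252 and then saved as UTF-8; the function name
fix_mojibake is made up for this example and is not part of libexttextcat.

```python
def fix_mojibake(text: str) -> str:
    """Reverse a Windows-1252 misreading of ISO-8859-2 data.

    Assumption: every character in `text` came from decoding
    ISO-8859-2 bytes as Windows-1252. Characters outside
    Windows-1252 will raise UnicodeEncodeError.
    """
    # Step 1: recover the original bytes (same as iconv -f UTF-8 -t Windows-1252).
    raw = text.encode("cp1252")
    # Step 2: decode those bytes with the intended charset
    # (same as iconv -f ISO-8859-2 -t UTF-8).
    return raw.decode("iso-8859-2")


if __name__ == "__main__":
    # The broken entry quoted above from line 25 of pl.lm.
    print(fix_mojibake("³ 1649"))
```

Byte 0xB3 is "³" in Windows-1252 and "ł" in ISO-8859-2, which is why the
entry above is repaired by this conversion.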
However, I wonder whether the language models should rather be rebuilt
automatically from the ShortTexts/*.txt files. (The encoding of
pl.txt appears to be correct.)
--
Jakub Wilk