It's even worse than that: for some languages like Polish, the
fingerprint file contains iso8859-2 characters but that is not reflected
in the file name at all.
Considering that UTF-8 is now universally supported in Debian, I think
libtextcat-data should provide a -utf8 version for each language.
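As a rough illustration of what producing such a -utf8 variant could involve, here is a minimal Python sketch that re-encodes the bytes of a fingerprint file from ISO-8859-2 to UTF-8. The function name and the sample line are made up for this example, and the "n-gram, tab, count" layout of the sample is an assumption about the file format. Note also that a re-encoded fingerprint is not necessarily the same as one trained on UTF-8 text, because the n-grams may be counted in bytes rather than characters.

```python
def transcode_fingerprint(data: bytes, source_encoding: str = "iso8859_2") -> bytes:
    """Re-encode raw fingerprint bytes from a legacy charset to UTF-8.

    Hypothetical helper for illustration; not part of libtextcat.
    """
    # Decode with the legacy charset, then encode the same text as UTF-8.
    return data.decode(source_encoding).encode("utf-8")


# Made-up sample line: an n-gram containing Polish letters, a tab, a count.
sample = "żó_\t120\n".encode("iso8859_2")
converted = transcode_fingerprint(sample)

print(len(sample), len(converted))
print(converted.decode("utf-8"), end="")
```

Each of the two Polish letters takes one byte in ISO-8859-2 and two bytes in UTF-8, which is why the converted line is longer than the original.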
Package: libtextcat-data
Version: 2.2-2
Severity: minor
The file containing the trigrams for the Greek language is named
greek-iso8859-7.lm
though all other files use the hyphen only to separate the language name
from the encoding name. For the sake of automatic parsing, consistency is
desirable.
Furthe
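To make the parsing problem concrete, here is a small Python sketch of a parser that splits a fingerprint file name into language and encoding. The function name is hypothetical, and the rule it applies (the first hyphen separates language from encoding, and a name without a hyphen carries no encoding) is an assumption based on the description above, not documented behaviour of libtextcat.

```python
from typing import Optional, Tuple


def split_fingerprint_name(filename: str) -> Tuple[str, Optional[str]]:
    """Split '<language>-<encoding>.lm' into (language, encoding).

    Hypothetical helper; assumes the FIRST hyphen is the separator.
    """
    stem = filename[:-3] if filename.endswith(".lm") else filename
    language, sep, encoding = stem.partition("-")
    return language, (encoding if sep else None)


# Splitting on every hyphen gives three parts for the Greek file,
# so a parser expecting exactly two fields would trip over it.
print("greek-iso8859-7.lm"[:-3].split("-"))

# Splitting on the first hyphen only copes with the extra hyphen.
print(split_fingerprint_name("greek-iso8859-7.lm"))

# A name with no encoding part yields None for the encoding.
print(split_fingerprint_name("polish.lm"))
```

Splitting on the first hyphen works around the Greek file name, but it only holds as long as no language name ever contains a hyphen itself, which is why a single consistent separator is the more robust fix.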