On Tue, September 19, 2006 13:27, Agustin Martin wrote:
> On Wed, Sep 13, 2006 at 02:52:26PM +0300, Martin-Éric Racine wrote:
>>
>> On Fri, September 8, 2006 18:13, Stanislav Maslovski said:
>
>> > The word list /usr/share/aspell/ru.cwl.gz contains "138420" as the very
>> > first word, which is a mistake. Here is the complete warning message I got
>> > from "dpkg-reconfigure aspell-ru" (originally in Russian):
>> >
>> > -------------------------------------------------------------------------
>> > aspell-autobuildhash: processing: ru [ru]
>> > Warning: The word "138420" is invalid. The character '1' (U+31) may not
>> > appear at the beginning of a word. Skipping word.
>> > -------------------------------------------------------------------------
>>
>> That is perfectly inoffensive. The aspell dictionary is generated from the
>> myspell dictionary, which starts with a wordcount number. Aspell discards
>> it when it builds the hash, as you saw, but it is needlessly verbose about
>> it.
>
> You can use 'sed 1d' to strip the first line and make this less verbose.
> This is what I do for the Esperanto dictionary (with eo changed to ru):
>
> cat ru.dic | sed 1d | LC_COLLATE=C sort -u | prezip > ru.cwl

Except that, unless I'm mistaken, using the C collation rules for
languages other than English is wrong.
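
For reference, here is a minimal sketch of what the quoted pipeline does,
run on a tiny made-up sample instead of the real ru.dic. The function name
dic_to_wordlist and the sample words are hypothetical, and the prezip step
is left out so that it runs without aspell installed. It uses LC_ALL=C
rather than LC_COLLATE=C only because LC_ALL, when set in the environment,
overrides LC_COLLATE. It only shows the mechanics; whether byte order is
the right choice for a given language is the question raised above.

```shell
# Hypothetical helper: drop the leading word-count line of a MySpell-style
# .dic file and print a byte-order-sorted, de-duplicated word list.
dic_to_wordlist() {
    # 'sed 1d' deletes the first line (the word count).
    # LC_ALL=C makes sort compare raw bytes, whatever the user's locale is.
    sed 1d "$1" | LC_ALL=C sort -u
}

# Made-up sample: a count line followed by three entries, one duplicated.
tmp=$(mktemp)
printf '3\nzebra\napple\nzebra\n' > "$tmp"
wordlist=$(dic_to_wordlist "$tmp")
rm -f "$tmp"
printf '%s\n' "$wordlist"
```

With the count line gone, aspell no longer sees a "word" that starts with a
digit, so the warning should not be triggered by that line.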

-- 
Martin-Éric Racine
http://q-funk.iki.fi


