On Sun, Dec 09, 2007 at 09:21:44PM +0100, Agustin Martin wrote:
> What about
>
> # This generates the wcatalan wordlist.
> debian/strip_mwl | ispell -d $(CURDIR)/catala.debian -e | \
>   tr -s ' ' '\n' | sort -u > catala.words.debian
>
> using sort with the --unique (-u) option.
> You can test with
>
> $ sort -u /usr/share/dict/catala > catala.tmp
>
> sizes:
>
> 6519080 catala.tmp
> 7450965 /usr/share/dict/catala
>
> $ grep -n embalsameu catala.tmp
> 221517:embalsameu
>
> $ grep -n embalsameu /usr/share/dict/catala
> 264507:embalsameu
> 264520:embalsameu

Checked with Marc's file. Besides locale-dependent sorting, the only
difference is

$ diff -u catala.marc.resorted catala.orig.re-u-sorted
--- catala.marc.resorted 2007-12-10 12:44:21.000000000 +0100
+++ catala.orig.re-u-sorted 2007-12-10 12:43:29.000000000 +0100
@@ -1,3 +1,4 @@
+179620
 1a
 1r
 2a

so 'sort -u' seems to work well.

--
Agustin
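
P.S. For anyone who wants to repeat the check on a small scale, here is a
minimal sketch. It assumes a throwaway sample list rather than the real
/usr/share/dict/catala, and the file names under $tmp are made up for the
illustration. Setting LC_ALL=C is my own choice to keep locale-dependent
sorting out of the comparison; it is not part of the build rule quoted above.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real wordlist.
tmp=$(mktemp -d)
printf '%s\n' embalsameu casa embalsameu arbre > "$tmp/words"

# Sort and drop duplicate lines; LC_ALL=C pins the collation order
# so the result does not depend on the caller's locale.
LC_ALL=C sort -u "$tmp/words" > "$tmp/words.uniq"

# Count exact-line matches of the word before and after.
before=$(grep -c -x embalsameu "$tmp/words")
after=$(grep -c -x embalsameu "$tmp/words.uniq")

# uniq -d prints lines that are still repeated; the output is sorted,
# so an empty result means no duplicates remain.
remaining=$(uniq -d "$tmp/words.uniq")

echo "before=$before after=$after remaining='$remaining'"
rm -rf "$tmp"
```

When comparing two real lists with diff, sorting both under the same
LC_ALL setting first should leave only genuine content differences.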