Bug#184333: mguesser statistical tests

Alexander Barkov Wed, 29 Nov 2006 22:18:20 -0800

Dear Era!

Thank you very much for the link!


Yes, mguesser uses so-called "N-Gram-Based Text Categorization",
which is based on statistical analysis.

The idea of implementing mguesser this way was
inspired by "TextCat" which showed very good results
in our experiment:

http://odur.let.rug.nl/~vannoord/TextCat/

TextCat was written in Perl.

mguesser is an implementation of the "N-Gram-Based Text Categorization"
technique that we wrote from scratch in C, with some of our own ideas added.
We also recreated all language maps (statistical files for languages)
ourselves, because our maps are not fully compatible with the original
TextCat language maps.
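
As a rough illustration of the general technique, here is a minimal
Python sketch of ranked n-gram profiles compared with an "out-of-place"
distance, the approach TextCat is known for. It is not mguesser's C code
and does not include any of our own additions. The function names, the
underscore word padding, the profile size of 300 and the tiny sample
texts are all assumptions made for the example.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top=300):
    """Return the `top` most frequent character n-grams (n = 1..max_n),
    most frequent first. This ranked list plays the role of a language map."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"  # assumed padding to mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles. An n-gram missing
    from the language profile costs a fixed maximum penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def guess_language(text, lang_profiles):
    """Return the key of the language profile closest to the text."""
    doc = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc, lang_profiles[lang]))


# Hypothetical toy training texts; real language maps need far more data.
samples = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt ueber den faulen hund",
}
profiles = {lang: ngram_profile(text) for lang, text in samples.items()}
print(guess_language("the dog and the fox", profiles))
```

With samples this small the guess is unreliable; the sketch only shows
the shape of the computation.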

Do you know if any SMI demo guesser programs are available for
download? I'm curious to compare which method gives better results.

Thanks!



