Dear Era! Thank you very much for the link!
Yes, mguesser uses so-called "N-Gram-Based Text Categorization", which is based on statistical analysis. The idea of implementing mguesser this way was inspired by "TextCat", which showed very good results in our experiment: http://odur.let.rug.nl/~vannoord/TextCat/

TextCat was written in Perl. mguesser is an implementation of the "N-Gram-Based Text Categorization" technique that we wrote from scratch in C, with some of our own ideas added. We also recreated all language maps (statistical files for languages) ourselves, because ours are not fully compatible with the original TextCat language maps.

Do you know if any SMI demo guesser programs are available for download? I'm curious to compare which method gives better results.

Thanks!
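
P.S. For readers who have not seen the technique before, here is a rough Python sketch of the general idea as described in the Cavnar and Trenkle paper "N-Gram-Based Text Categorization": build a ranked list of the most frequent character n-grams for each language, do the same for the input text, and pick the language whose ranking is closest by the "out-of-place" distance. This is not mguesser's code and does not include our own additions; the function names, the `top_k=300` cutoff, and the tiny sample texts are illustrative assumptions only.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Return the top_k most frequent character n-grams (lengths 1..max_n),
    most frequent first. Plays the role of a small "language map"."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"  # mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles; n-grams missing from
    the language profile receive a fixed maximum penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def guess_language(text, lang_profiles):
    """Return the key of the language profile with the smallest distance."""
    doc = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc, lang_profiles[lang]))


# Toy training samples (assumed for illustration; real maps need far more text).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}
```

With profiles built this way, `guess_language(some_text, PROFILES)` returns the key of the closest profile. Maps trained on one or two sentences are of course far too small for real use; they only show the mechanics.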