I've been testing the SpellCheckComponent for use on StyleFeeder. It seems
to do a great job of suggesting character substitutions, but I haven't seen
any deletion/insertion suggestions. I've tried decreasing the "accuracy"
parameter to 0.5. Some queries I've tried are:
bluea: suggests "blues" (should be "blue")
yello: no suggestions (should be "yellow")
candyz: suggests "candyâ" (should be "candy")
chane: no suggestions (should be "chanel")
It looks to me like it is only willing to make character substitutions and
is unwilling to insert/delete characters. Does anyone know why it might be
behaving this way? I'm certain that the "should be" words appear fairly
frequently in the field I used for spellcheck indexing, and I reindexed
the documents after setting up the spellchecker.
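To make "substitution" versus "insertion/deletion" concrete, here is a minimal Python sketch of plain Levenshtein edit distance, run against the queries above. It is only an illustration of the distance I have in mind. It is not the code the SpellCheckComponent uses, and the function name edit_distance is made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: each insertion, deletion, or substitution costs 1."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb (free on a match)
            ))
        prev = curr
    return prev[-1]


# (query, candidate) pairs taken from the examples in this post.
pairs = [
    ("bluea", "blue"),    # one deletion
    ("bluea", "blues"),   # one substitution
    ("yello", "yellow"),  # one insertion
    ("candyz", "candy"),  # one deletion
    ("chane", "chanel"),  # one insertion
]

for query, candidate in pairs:
    print(query, candidate, edit_distance(query, candidate))
```

Under this measure every pair is exactly one edit apart, so "blue" is just as close to "bluea" as "blues" is.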
Not sure if this would help to debug, but I noticed that words appear with
different frequencies in the spellcheck index file (the .cfs file in the
spellchecker dir). For example, here's what I get for a few variants on "blue":
[EMAIL PROTECTED] spellchecker]$ strings _2y.cfs | grep ^blue$|wc
46 46 230
[EMAIL PROTECTED] spellchecker]$ strings _2y.cfs | grep ^bluea$|wc
0 0 0
[EMAIL PROTECTED] spellchecker]$ strings _2y.cfs | grep ^blues$|wc
3 3 18
All the "should be" words appear 10+ times. The misspellings appear at most
once.
Any help is appreciated. Thanks,
Jason