File-based Spelling

Mark Fenbers Mon, 12 Oct 2015 12:39:07 -0700

Greetings!

I'm attempting to use a file-based spell checker. My sourceLocation is /usr/share/dict/linux.words, and my spellcheckIndexDir is set to ./data/spFile. BuildOnStartup is set to true, and I see nothing to suggest any sort of problem or error in solr.log. However, my ./data/spFile/ directory contains only two files: segments_2, which is only 71 bytes, and a zero-byte write.lock file. For a source dictionary of 480,000 words, I was expecting a bit more substance in the ./data/spFile directory. Something doesn't seem right with this.
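
For reference, the settings described above correspond roughly to a spellchecker definition like the following. This is only a sketch assembled from the values mentioned in this message, not a copy of the actual solrconfig.xml: the component name "spellcheck" and the spellchecker name "file" are assumptions, and whether the file-based spellchecker honors a build-on-startup setting is not confirmed here.

```xml
<!-- Sketch only: names marked "assumed" do not come from the original post. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent"> <!-- component name assumed -->
  <lst name="spellchecker">
    <str name="name">file</str> <!-- spellchecker name assumed -->
    <str name="classname">solr.FileBasedSpellChecker</str>
    <!-- Plain-text dictionary, one word per line -->
    <str name="sourceLocation">/usr/share/dict/linux.words</str>
    <str name="characterEncoding">UTF-8</str>
    <!-- Directory where the spelling index is written -->
    <str name="spellcheckIndexDir">./data/spFile</str>
    <!-- Field type used to analyze dictionary entries, as mentioned in the post -->
    <str name="fieldType">text_en</str>
    <!-- Setting as described in the post; its effect here is unverified -->
    <str name="buildOnStartup">true</str>
  </lst>
</searchComponent>
```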

Moreover, I ran a query on the word Fenbers, which isn't listed in the linux.words file, although several similar words are. The results I got back were odd; the suggestions included the following:

fenber
f en be r
f e nb er
f en b er
f e n be r
f en b e r
f e nb e r
f e n b er
f e n b e r

But I expected suggestions like fenders, embers, fenberry, etc. I also ran a query on Mark (which IS listed in linux.words) and got back two suggestions in a similar format. I experimented with settings such as changing the fieldType from text_en to string and the characterEncoding from UTF-8 to ASCII, but nothing yielded any different results.

Can anyone offer suggestions as to what I'm doing wrong? I've been struggling with this for more than 40 hours now! I'm surprised my persistence has lasted this long!


Thanks,
Mark
