Greetings!

I want my spell-checker to be based on a file (/usr/share/dict/linux.words should suffice). Word-break features would also be a benefit. I have already indexed my docs for searching with minimal alterations to the baseline Solr configuration. My "docs" are user-typed text, typically a paragraph or two. Searching works very well with my local customization, so I am now moving on to adding spell-checking capabilities to my project.

Though my archive of docs *does* contain many technical terms and coded site identifiers, I prefer not to use the index-based spellcheck at this time, because the archive has never been spell-checked and I'm apprehensive that misspelled words will appear in my suggestions. But the index-based spell-checker is the baseline configuration, so I need to change that to use file-based spell checking. Intuitively, this seems as simple as commenting out the IndexBasedSpellChecker XML section and uncommenting the FileBasedSpellChecker XML section in the solrconfig.xml file that I've customized. But when I do that, I get quite bizarre results, and though I've had much help from some very smart (and patient) contributors on this forum, I have never gotten spell-checking to work in any meaningful way, even using the debugger.
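
For reference, here is roughly the section I mean, as a minimal sketch rather than my exact file. The class and parameter names are the standard ones for the file-based checker; the checker name "file", the spellcheckIndexDir value, and the assumption that an absolute sourceLocation path resolves correctly are my own guesses and may well be part of the problem.

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- File-based checker; the index-based section is commented out. -->
  <lst name="spellchecker">
    <str name="classname">solr.FileBasedSpellChecker</str>
    <!-- "file" is an arbitrary name chosen for this sketch -->
    <str name="name">file</str>
    <!-- One word per line. Assumption: an absolute path is accepted here;
         otherwise the file would have to be copied into the core's conf directory. -->
    <str name="sourceLocation">/usr/share/dict/linux.words</str>
    <str name="characterEncoding">UTF-8</str>
    <!-- Assumed location for the generated spellcheck index -->
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>
```

As far as I understand it, a request would then have to select this checker by name with spellcheck.dictionary=file (unless it is named "default"), and the dictionary has to be built once with spellcheck.build=true before any suggestions come back.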

So, my question for now is:

Should setting up a file-based spell checker be just a matter of starting with the baseline solrconfig.xml, commenting out the index-based spell checker, uncommenting the file-based spell checker, and changing the sourceLocation value, or am I overlooking something? My second question is which "baseline" solrconfig.xml I should use as a starting point, because there are several solrconfig.xml files nested in the subfolders when I unzip the tarball. I'm using 5.3.0 in case that matters.

Thanks!
Mark

