Greetings!
I want my spell-checker to be based on a file
(/usr/share/dict/linux.words should suffice). Word-break support
would also be a benefit. I have already indexed my docs for
searching with minimal alterations to the baseline Solr configuration.
My "docs" are user-typed text, typically a paragraph or two. The Solr
searching feature works very well with my local customization. Now that
search is working, I am moving on to adding spell-checking
capabilities to my project.
Though my archive of docs *does* contain many technical terms and coded
site identifiers, I prefer not to use the index-based spellcheck at this
time, because the archive has never been spell-checked and I'm
apprehensive that misspelled words will appear in my suggestions.
But the index-based spell-checker is the baseline configuration, so I
need to change that to use file-based spell checking. Intuitively, this
seems as simple as commenting out the IndexBasedSpellChecker XML section
and uncommenting the FileBasedSpellChecker XML section in the
solrconfig.xml file that I've customized. But in doing that, I have
gotten quite bizarre results, and though I've had much help from some
very smart (and patient) contributors on this forum, I still have not
gotten spell-checking to work in any meaningful way, even using the
debugger.
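
A minimal sketch of the change described above, modeled on the
FileBasedSpellChecker section of the sample solrconfig.xml. The
spellchecker name ("file"), the queryAnalyzerFieldType, and the
spellcheckIndexDir value are the sample's placeholders, not values from
my own setup, and whether sourceLocation accepts an absolute path
outside the core's conf directory is an assumption here:

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Assumption: field type used to analyze the query; the sample
       config uses text_general. -->
  <str name="queryAnalyzerFieldType">text_general</str>

  <!-- The IndexBasedSpellChecker / DirectSolrSpellChecker section would
       be commented out or removed here. -->

  <lst name="spellchecker">
    <!-- Placeholder name; requests select it with
         spellcheck.dictionary=file -->
    <str name="name">file</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <!-- Plain-text word list, one word per line. An absolute path is
         assumed to work; the sample uses a file in the conf directory. -->
    <str name="sourceLocation">/usr/share/dict/linux.words</str>
    <str name="characterEncoding">UTF-8</str>
    <!-- Directory where the spellcheck index built from the file is
         stored. -->
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>
```

With a block like this, the dictionary would normally need to be built
once (for example by sending spellcheck.build=true on a request) before
suggestions come back, and the request handler in use has to list the
spellcheck component and select the "file" dictionary.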
So, my questions for now are:
Should setting up a file-based spell checker be just a matter of
starting with the baseline solrconfig.xml, commenting out the
index-based spell checker, and uncommenting the file-based spell checker
(and changing the sourceLocation value), or am I overlooking something?
My second question is: which "baseline" solrconfig.xml should I use as a
starting point, given that there are several solrconfig.xml files nested
in the subfolders when I unpack the tarball? I'm using 5.3.0 in case
that matters.
Thanks!
Mark