Hi Otis, I have written a draft wiki entry for the spell checker: http://wiki.apache.org/solr/SpellCheckerRequestHandler
I've learned that my initial observation about the suggestion ordering was incorrect: it does in fact order the results by the popularity (term frequency) of the word in the termSourceField. The problem I experienced was caused by setting termSourceField to a field of type "text", which heavily stemmed and analyzed the words. I found that using the StandardTokenizer and StandardFilter, and removing the PorterStemmer and LowerCaseFilter from the field schema, really improved the spell checker's suggestions.

I haven't included this info on the wiki page yet; I'll try to update it soon when I have a bit more time.

cheers,
Tristan

On 7/8/07, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
Tristan - good summary - want to copy that to the Solr Wiki?

Thanks,
Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/ - Tag - Search - Share

----- Original Message ----
From: Tristan Vittorio <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Saturday, July 7, 2007 1:51:15 AM
Subject: Re: Spell Check Handler

I couldn't find any documentation on the spell check handler either, but found enough information in the solrconfig.xml file; simply search for "SpellCheckerRequestHandler" (online version here):

http://svn.apache.org/repos/asf/lucene/solr/trunk/example/solr/conf/solrconfig.xml

You can view the original development discussion in JIRA (not sure how helpful that will be for you though):

https://issues.apache.org/jira/browse/SOLR-81

In a nutshell, the configuration parameters available are:

suggestionCount: determines how many spelling suggestions are returned.
accuracy: a float value between 0.0 and 1.0 specifying how closely the suggested words should match the original word being checked.
spellcheckerIndexDir and termSourceField: check solrconfig.xml for a full explanation.

In order to use the spell checking handler for the first time, you need to explicitly build the spelling index with a sample query, something like this:

http://localhost:8080/solr/select/?q=macrosoft&qt=spellchecker&cmd=rebuild

Depending on how large your main index is, this rebuild operation could take a while. Subsequent queries can omit '&cmd=rebuild' and will return results much faster:

http://localhost:8080/solr/select/?q=macrosoft&qt=spellchecker

The order of the suggestions returned seems to be based on the accuracy figure (i.e. how closely each one matches the original word).
It would be great to be able to sort these suggested results based on the term frequency / document frequency of the suggested word in the main index, since the most accurate suggestion may not always be the most relevant. As far as I can tell, there is currently no way of doing this using the spellchecker handler alone (you could always run separate standard queries on each word suggestion and order by numDocs, but that would be very inefficient). Has anybody else tried to achieve this?

cheers,
Tristan

On 7/7/07, Andrew Nagy <[EMAIL PROTECTED]> wrote:
>
> Hello, is there any documentation on how to use the new spell check
> module?
>
> Thanks
> Andrew
>
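
For anyone scripting the two requests described in this thread (the one-off rebuild and the normal lookup), here is a minimal Python sketch. It only assembles the URLs; the host, port, and path are copied from the example requests above and will likely differ on your install, and the helper name spellcheck_url is hypothetical, not part of Solr.

```python
from urllib.parse import urlencode

# Base URL copied from the example requests in the thread (an assumption;
# adjust host, port, and path for your own Solr install).
SOLR_SELECT = "http://localhost:8080/solr/select/"


def spellcheck_url(word, rebuild=False, base=SOLR_SELECT):
    """Build a spellchecker request URL (hypothetical helper, not part of Solr)."""
    params = [("q", word), ("qt", "spellchecker")]
    if rebuild:
        # Only needed before the first lookup, to build the spelling index.
        params.append(("cmd", "rebuild"))
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # First request: build the spelling index (may be slow on a large index).
    print(spellcheck_url("macrosoft", rebuild=True))
    # Subsequent requests: plain lookups without the rebuild command.
    print(spellcheck_url("macrosoft"))
```

Passing the parameters as a list of pairs keeps them in the same order as the hand-written URLs, and urlencode takes care of escaping if the word being checked contains spaces or other special characters.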