Hi Otis,

I have written a draft wiki entry for the spell checker:
http://wiki.apache.org/solr/SpellCheckerRequestHandler

I've learned that my initial observation about the suggestion ordering was
incorrect: it does in fact order the results by popularity (term frequency)
of the word in the termSourceField. The problem I experienced was caused by
setting termSourceField to a field of type "text", which heavily stems and
analyzes the words. I found that using the StandardTokenizer and
StandardFilter, and removing the PorterStemmer and LowerCaseFilter from the
field schema, really improved the spell checker's suggestions.
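
As a rough illustration of why a stemmed source field hurts, here is a
minimal Python sketch. It assumes a toy suffix stripper (crude_stem, a
hypothetical stand-in, not the Porter algorithm and not Solr code) and
shows how the terms available to the spell checker collapse into stems
that are not real words:

```python
from collections import Counter


def crude_stem(word):
    """Toy suffix stripper; a hypothetical stand-in, NOT the Porter algorithm."""
    for suffix in ("ing", "ers", "er", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_dictionary(tokens, analyze):
    """Count the terms that would end up in the source field after analysis."""
    return Counter(analyze(token) for token in tokens)


tokens = ["computers", "computing", "computed", "Solr"]

# Heavily analyzed field: lowercased and stemmed.
stemmed = build_dictionary(tokens, lambda t: crude_stem(t.lower()))

# Lightly analyzed field: tokens kept as written.
plain = build_dictionary(tokens, lambda t: t)

print(sorted(stemmed))  # only stems such as "comput" remain
print(sorted(plain))    # the original word forms survive
```

With the stemmed dictionary, the only candidates the spell checker could
offer are stems, which is consistent with the poor suggestions I saw.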

I haven't included this info on the wiki page yet; I'll try to update it
soon when I have a bit more time.

cheers,
Tristan



On 7/8/07, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

Tristan - good summary - want to copy that to the Solr Wiki?

Thanks,
Otis

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

----- Original Message ----
From: Tristan Vittorio <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Saturday, July 7, 2007 1:51:15 AM
Subject: Re: Spell Check Handler

I couldn't find any documentation on the spell check handler either, but I
found enough information in the solrconfig.xml file. Simply search for
"SpellCheckerRequestHandler" (online version here):

http://svn.apache.org/repos/asf/lucene/solr/trunk/example/solr/conf/solrconfig.xml

You can view the original development discussion from JIRA (not sure how
helpful that will be for you though):
https://issues.apache.org/jira/browse/SOLR-81

In a nutshell, the configuration parameters available are:

suggestionCount: determines how many spelling suggestions are returned.
accuracy: a float value between 0.0 and 1.0 that sets how closely the
suggested words should match the original word being checked.
spellcheckerIndexDir and termSourceField: check solrconfig.xml for a full
explanation.
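
To make the roles of accuracy and suggestionCount more concrete, here is a
minimal Python sketch. It assumes a similarity score of one minus the edit
distance divided by the longer word's length; that formula and the function
names are my own assumptions for illustration, not necessarily what the
handler computes internally:

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed score in [0.0, 1.0]; 1.0 means the words are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def filter_suggestions(word, candidates, accuracy, suggestion_count):
    """Keep candidates scoring at least `accuracy`, best first, capped at
    `suggestion_count` entries."""
    scored = [(similarity(word, c), c) for c in candidates]
    kept = [(s, c) for s, c in scored if s >= accuracy]
    kept.sort(key=lambda pair: -pair[0])
    return [c for _, c in kept[:suggestion_count]]


print(filter_suggestions("macrosoft", ["microsoft", "macro", "solr"], 0.7, 5))
```

Raising accuracy narrows the list to closer matches, while suggestionCount
only caps how many of the surviving candidates are returned.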

In order to use the spell checking handler for the first time, you need to
explicitly build the spelling index with a query something like this:

http://localhost:8080/solr/select/?q=macrosoft&qt=spellchecker&cmd=rebuild
Depending on how large your main index is, this rebuild operation could
take a while. Subsequent queries can omit '&cmd=rebuild' and will return
results much faster:

http://localhost:8080/solr/select/?q=macrosoft&qt=spellchecker
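
If you are scripting these requests, the two URLs above can be assembled
along these lines. This is only a sketch: the helper name spellcheck_url is
made up, and the host, port and path are simply the ones from my example
and will differ per installation:

```python
from urllib.parse import urlencode

# Base URL taken from the example above; adjust for your own installation.
BASE_URL = "http://localhost:8080/solr/select/"


def spellcheck_url(word, rebuild=False, base_url=BASE_URL):
    """Build a spell check request URL; add cmd=rebuild only when asked."""
    params = [("q", word), ("qt", "spellchecker")]
    if rebuild:
        params.append(("cmd", "rebuild"))
    return base_url + "?" + urlencode(params)


# First request: rebuilds the spelling index (may be slow on a large index).
print(spellcheck_url("macrosoft", rebuild=True))

# Later requests: query only.
print(spellcheck_url("macrosoft"))
```
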
The order of the suggestions returned seems to be based on the accuracy
figure (i.e. how closely each one matches the original word). It would be
great to be able to sort these suggestions by the term frequency / document
frequency of the suggested word in the main index, since the most accurate
suggestion may not always be the most relevant.

As far as I can tell, there is currently no way of doing this using the
spellchecker handler alone. You could always run separate standard queries
on each word suggestion and order by numDocs, but that would be very
inefficient. Has anybody else tried to achieve this?
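
For what it's worth, the client-side workaround I have in mind would look
roughly like the Python sketch below. The count_docs callable is
hypothetical: it stands for whatever code runs a standard query for one
word and returns the number of matching documents. The sample counts are
made up purely to show the ordering:

```python
def rank_by_frequency(suggestions, count_docs):
    """Re-order suggestions so the most frequent words in the index come first.

    count_docs is called once per suggestion, which is exactly why this
    approach gets expensive: one extra query per suggested word.
    """
    counts = {word: count_docs(word) for word in suggestions}
    return sorted(suggestions, key=lambda word: counts[word], reverse=True)


# Made-up document counts standing in for real query results.
fake_counts = {"microsoft": 120, "microsofts": 3, "macrosoft": 0}

ranked = rank_by_frequency(["macrosoft", "microsoft", "microsofts"],
                           lambda word: fake_counts.get(word, 0))
print(ranked)
```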

cheers,
Tristan



On 7/7/07, Andrew Nagy <[EMAIL PROTECTED] > wrote:
>
> Hello, is there any documentation on how to use the new spell check
> module?
>
> Thanks
> Andrew
>



