Matt,

Seeing the response, my guess is you have "point" in your index, and that it has a higher frequency than "rockpoint". By default the spellchecker will never try to correct something that exists in your index. Adding "spellcheck.onlyMorePopular=true" might help, but only if the correction has a higher frequency than the original.

Try using "spellcheck.alternativeTermCount=n" instead of "spellcheck.onlyMorePopular=true". See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount for more information.
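
As a rough sketch of that change, the request you posted could be rebuilt with "spellcheck.alternativeTermCount" in place of "spellcheck.onlyMorePopular". The Python helper below is only an illustration: the function name and the default value of 5 are assumptions, not recommendations, and the remaining parameters are copied from your request.

```python
from urllib.parse import urlencode

# Base URL taken from the request quoted in this thread.
BASE_URL = "http://localhost:8982/solr/development/select"


def build_spellcheck_url(query, alternative_term_count=5):
    """Build a select URL that sends spellcheck.alternativeTermCount
    and leaves out spellcheck.onlyMorePopular.

    The default of 5 is a placeholder assumption; tune it for your index.
    """
    params = [
        ("q", query),
        ("fq", "type:Company"),
        ("wt", "ruby"),
        ("indent", "true"),
        ("defType", "edismax"),
        ("qf", "name_text"),
        ("stopwords", "true"),
        ("lowercaseOperators", "true"),
        ("spellcheck", "true"),
        ("spellcheck.count", 20),
        # Replaces spellcheck.onlyMorePopular=true from the original request.
        ("spellcheck.alternativeTermCount", alternative_term_count),
        ("spellcheck.extendedResults", "true"),
        ("spellcheck.collate", "true"),
        ("spellcheck.maxCollations", 1),
        ("spellcheck.maxCollationTries", 10),
        ("spellcheck.accuracy", 0.5),
    ]
    # urlencode escapes spaces as "+" and ":" as "%3A".
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_spellcheck_url("Rock point"))
```

Comparing the "suggestions" section for a few different values should show whether terms that already exist in the index start to receive alternatives.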

James Dyer
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Matt Mongeau [mailto:halogenandto...@gmail.com]
Sent: Monday, December 15, 2014 10:23 AM
To: solr-user@lucene.apache.org
Subject: Re: WordBreakSolrSpellChecker Usage

I think you were right about maxChanges, that does seem to get rid of the ridiculous values. However I don't seem to be getting anything reasonable. Most variations look something like:

http://localhost:8982/solr/development/select?q=Rock+point&fq=type%3ACompany&wt=ruby&indent=true&defType=edismax&qf=name_text&stopwords=true&lowercaseOperators=true&spellcheck=true&spellcheck.count=20&spellcheck.onlyMorePopular=true&spellcheck.extendedResults=true&spellcheck.collate=true&spellcheck.maxCollations=1&spellcheck.maxCollationTries=10&spellcheck.accuracy=0.5

{
  'responseHeader'=>{
    'status'=>0,
    'QTime'=>20},
  'response'=>{'numFound'=>0,'start'=>0,'docs'=>[]
  },
  'spellcheck'=>{
    'suggestions'=>[
      'rock',{
        'numFound'=>5,
        'startOffset'=>0,
        'endOffset'=>4,
        'origFreq'=>3,
        'suggestion'=>[{
            'word'=>'rocky',
            'freq'=>3},
          {
            'word'=>'brook',
            'freq'=>6},
          {
            'word'=>'york',
            'freq'=>460},
          {
            'word'=>'oak',
            'freq'=>7},
          {
            'word'=>'boca',
            'freq'=>3}]},
      'correctlySpelled',false]}}

I'm going to post both my solrconfig.xml and schema.xml because maybe I'm just doing something crazy. They can both be found here:
https://gist.github.com/halogenandtoast/76fd5dcfae1c4edeba30

On Thu, Dec 11, 2014 at 1:19 PM, Dyer, James <james.d...@ingramcontent.com> wrote:

> Matt,
>
> There is no exact number here, but I would think most people would
> want "count" to be maybe 10-20. Increasing this incurs a very small
> performance penalty for each term it generates suggestions for, but
> you probably won't notice a difference. For "maxCollationTries", 5 is
> a reasonable number but you might see improved collations if this is
> also perhaps 10.
> With this one, you get a much larger performance penalty, but only
> when it needs to try more combinations to return the "maxCollations".
> In your case you have this at 5 also, right? I would reduce this to
> the maximum number of re-written queries your application or users are
> actually going to use. In a lot of cases, 1 is the right number here.
> This would improve performance for you in some cases.
>
> Possibly the reason “Rock point” -> “Rockpoint” is failing is because
> you have "maxChanges" set to 10. This tells it you are willing for it
> to break a word into 10 separate parts, or to combine up to 10
> adjacent words into 1. Having taken a quick glance at the code, I
> think what is happening is it is trying things like "r ock p oint" and
> "r o ck p o int", etc. and never getting to your intended result. In a
> typical scenario I would set "maxChanges" to 1-3, and often 1 is
> probably the most appropriate value here.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: Matt Mongeau [mailto:halogenandto...@gmail.com]
> Sent: Thursday, December 11, 2014 11:34 AM
> To: solr-user@lucene.apache.org
> Subject: Re: WordBreakSolrSpellChecker Usage
>
> Is there a suggested value for this? I bumped them up to 20 and still
> nothing has seemed to change.
>
> On Thu, Dec 11, 2014 at 9:42 AM, Dyer, James
> <james.d...@ingramcontent.com> wrote:
>
> > My first guess here, seeing it works some of the time but not
> > others, is that these values are too low:
> >
> >   <str name="spellcheck.maxCollationTries">5</str>
> >   <str name="spellcheck.count">5</str>
> >
> > You know spellcheck.count is too low if the suggestion you want is
> > not in the "suggestions" part of the response, but increasing it
> > makes it get included.
> >
> > You know that spellcheck.maxCollationTries is too low if it exists
> > in "suggestions" but it is not getting suggested in the "collation"
> > section.
> >
> > James Dyer
> > Ingram Content Group
> > (615) 213-4311
> >
> >
> > -----Original Message-----
> > From: Matt Mongeau [mailto:halogenandto...@gmail.com]
> > Sent: Wednesday, December 10, 2014 12:43 PM
> > To: solr-user@lucene.apache.org
> > Subject: Fwd: WordBreakSolrSpellChecker Usage
> >
> > If I have my search component set up like this
> > https://gist.github.com/halogenandtoast/cf9f296d01527080f18c
> > and I have an entry for “Rockpoint”, shouldn’t “Rock point” generate
> > suggestions?
> >
> > This doesn't seem to be the case, but it works for "Blackstone" with
> > "Black stone". Any ideas on what I might be doing wrong?