Hi!

I am using Solr 4.2.1.

My solrconfig.xml contains the following:

  <searchComponent name="MySpellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">text_spell</str>

    <lst name="spellchecker">
      <str name="name">MySpellchecker</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <str name="distanceMeasure">internal</str>
      <float name="accuracy">0.5</float>
      <int name="maxEdits">2</int>
      <int name="minPrefix">1</int>
      <int name="maxInspections">5</int>
      <int name="minQueryLength">3</int>
      <float name="maxQueryFrequency">0.01</float>
    </lst>
  </searchComponent>

  <requestHandler name="/select" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <int name="rows">10</int>
      <str name="df">id</str>
      <str name="spellcheck.dictionary">MySpellchecker</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">false</str>
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.alternativeTermCount">10</str>
      <str name="spellcheck.maxResultsForSuggest">35</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.collateExtendedResults">false</str>
      <str name="spellcheck.maxCollationTries">10</str>
      <str name="spellcheck.maxCollations">1</str>
      <str name="spellcheck.collateParam.q.op">AND</str>
    </lst>
    <arr name="last-components">
      <str>MySpellcheck</str>
    </arr>
  </requestHandler>

The part of my schema.xml that defines the spell field looks like this:

  <fieldType name="text_spell" class="solr.TextField"
             positionIncrementGap="100" sortMissingLast="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory" />
      <filter class="solr.LowerCaseFilterFactory" />
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="lang/stopwords_en.txt" enablePositionIncrements="true" />
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory" />
      <filter class="solr.LowerCaseFilterFactory" />
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="lang/stopwords_en.txt" enablePositionIncrements="true" />
    </analyzer>
  </fieldType>

  <field name="spell" type="text_spell" indexed="true"
         stored="false" multiValued="true" />

  <copyField source="title" dest="spell" />
  <copyField source="artist" dest="spell" />
 
My query:
http://host/solr/select?q=&spellcheck.q=chocolat%20factry&spellcheck=true&df=spell&fl=&indent=on&wt=xml&rows=10&version=2.2&echoParams=explicit

In this case, the intent is to correct "chocolat factry" to "chocolate
factory", which exists in my spell field index. The QTime for the above
query is somewhere between 350 and 400 ms.

I then run a similar query with the spellcheck terms replaced by "pursut
hapyness" ("pursuit happyness" exists in my spell field), and I see a
QTime of 15-17 ms.

Both queries produce the correct collation, but there is an order of
magnitude difference in QTime. Each term needs one edit in both cases, so
each query needs two edits in total, and the words in the two queries are
of similar length. I'd like to understand why there is such a vast
difference in QTime. I would appreciate any help with this, since I am not
sure how to get meaningful performance numbers or how to attribute the
slowness to anything in particular.
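
One way to get steadier numbers would be to repeat each query several times
and compare the QTime values reported in the response header. Below is a
minimal sketch in Python, assuming the /select handler above returns the
standard XML response with <int name="QTime"> inside the responseHeader
list. The host name, the helper names (build_url, parse_qtime, time_query)
and the number of runs are placeholders chosen for illustration, not part
of my setup:

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "http://host/solr/select"  # placeholder host, as in the query above


def build_url(spellcheck_q, base_url=BASE_URL):
    """Build a spellcheck-only request for the given phrase."""
    params = {
        "q": "",
        "spellcheck.q": spellcheck_q,
        "spellcheck": "true",
        "df": "spell",
        "wt": "xml",
        "rows": "10",
    }
    # quote_via=quote encodes spaces as %20, matching the query above.
    return base_url + "?" + urllib.parse.urlencode(
        params, quote_via=urllib.parse.quote
    )


def parse_qtime(xml_text):
    """Return QTime (ms) from a Solr XML response, or None if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='responseHeader']/int[@name='QTime']")
    return int(node.text) if node is not None else None


def time_query(spellcheck_q, runs=10):
    """Send the same request `runs` times and collect the reported QTimes."""
    url = build_url(spellcheck_q)
    qtimes = []
    for _ in range(runs):
        with urllib.request.urlopen(url) as resp:
            qtime = parse_qtime(resp.read().decode("utf-8"))
        if qtime is not None:
            qtimes.append(qtime)
    return qtimes


# Example (needs a reachable Solr instance):
#   print(time_query("chocolat factry"))
#   print(time_query("pursut hapyness"))
```

Comparing the first run with the later ones for each phrase would at least
show whether the slow timings are repeatable or only occur on the first
request.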

I also see a vast difference in QTime in another case. Replace the search
terms in the above query with "over cuckoo's nest" and then with "over
cuccoo's nst". The phrase "over cuckoo's nest" exists in my indexed spell
field, so I would expect that query to return almost immediately. Instead
it fails to produce any collation and takes 10 seconds, while the second
query, "over cuccoo's nst", corrects the phrase and returns in 24 ms.
Something does not seem right here.

I would appreciate help with these.

Thanks in advance.
Regards,
-- Sandeep



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176.html
Sent from the Solr - User mailing list archive at Nabble.com.
