This doesn't make a lot of sense to me, as in both cases the very first 
collation it tries is the one it returns.  So you're getting a very 
optimized spellcheck in both cases.  But it does have to issue both queries 2 
times:  the first time, it tries the user's main query and, finding there are 
not enough hits, it then tries the collation query to see how many hits that 
will return.  Could it be that these two queries are simply less/more expensive 
and that the difference gets magnified by running each twice?
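
To compare the two cases side by side, it may help to pull just the QTime, the collation query and its hit count out of each response. The following is a minimal sketch in Python, assuming the XML response format quoted below; the function name summarize_spellcheck and the trimmed SAMPLE string are made up for illustration and are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the "Chocolat Factry" response quoted in this thread
# (only the elements the function reads are kept).
SAMPLE = """<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">77</int>
</lst>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="collation">
      <str name="collationQuery">chocolate factory</str>
      <int name="hits">85</int>
    </lst>
  </lst>
</lst>
</response>"""


def summarize_spellcheck(xml_text):
    """Return QTime, collation query and collation hits from a Solr XML response.

    Hypothetical helper: it assumes the response layout shown in this thread.
    """
    # Parse from bytes so an XML declaration with an encoding is accepted too.
    root = ET.fromstring(xml_text.encode("utf-8"))
    qtime = int(root.find("./lst[@name='responseHeader']/int[@name='QTime']").text)

    query = hits = None
    collation = root.find(".//lst[@name='collation']")
    if collation is not None:
        query = collation.findtext("str[@name='collationQuery']")
        hits_text = collation.findtext("int[@name='hits']")
        hits = int(hits_text) if hits_text is not None else None

    return {"qtime_ms": qtime, "collation": query, "hits": hits}


if __name__ == "__main__":
    print(summarize_spellcheck(SAMPLE))
```

Running the original query and the returned collation query each on its own, and summarizing both responses this way, would show whether one of them accounts for most of the time.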

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: SandeepM [mailto:skmi...@hotmail.com] 
Sent: Monday, April 22, 2013 4:04 PM
To: solr-user@lucene.apache.org
Subject: RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

Chocolat Factry


<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">77</int>
</lst>
<result name="response" numFound="0" start="0">
</result>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="chocolat">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">8</int>
      <int name="origFreq">615</int>
      <arr name="suggestion">
        <lst>
          <str name="word">chocolate</str>
          <int name="freq">6544</int>
        </lst>
      </arr>
    </lst>
    <lst name="factry">
      <int name="numFound">5</int>
      <int name="startOffset">9</int>
      <int name="endOffset">15</int>
      <int name="origFreq">6</int>
      <arr name="suggestion">
        <lst>
          <str name="word">factory</str>
          <int name="freq">23614</int>
        </lst>
        <lst>
          <str name="word">factor</str>
          <int name="freq">5128</int>
        </lst>
        <lst>
          <str name="word">factus</str>
          <int name="freq">290</int>
        </lst>
        <lst>
          <str name="word">factum</str>
          <int name="freq">178</int>
        </lst>
        <lst>
          <str name="word">factae</str>
          <int name="freq">102</int>
        </lst>
      </arr>
    </lst>
    <bool name="correctlySpelled">false</bool>
    <lst name="collation">
      <str name="collationQuery">chocolate factory</str>
      <int name="hits">85</int>
      <lst name="misspellingsAndCorrections">
        <str name="chocolat">chocolate</str>
        <str name="factry">factory</str>
      </lst>
    </lst>
  </lst>
</lst>
</response>




Pursut Hapyness
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">16</int>
</lst>
<result name="response" numFound="0" start="0">
</result>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="pursut">
      <int name="numFound">5</int>
      <int name="startOffset">0</int>
      <int name="endOffset">6</int>
      <int name="origFreq">0</int>
      <arr name="suggestion">
        <lst>
          <str name="word">pursuit</str>
          <int name="freq">1209</int>
        </lst>
        <lst>
          <str name="word">pursue</str>
          <int name="freq">108</int>
        </lst>
        <lst>
          <str name="word">pursit</str>
          <int name="freq">1</int>
        </lst>
        <lst>
          <str name="word">perdut</str>
          <int name="freq">94</int>
        </lst>
        <lst>
          <str name="word">purdue</str>
          <int name="freq">70</int>
        </lst>
      </arr>
    </lst>
    <lst name="hapyness">
      <int name="numFound">5</int>
      <int name="startOffset">7</int>
      <int name="endOffset">15</int>
      <int name="origFreq">0</int>
      <arr name="suggestion">
        <lst>
          <str name="word">happyness</str>
          <int name="freq">175</int>
        </lst>
        <lst>
          <str name="word">hapiness</str>
          <int name="freq">62</int>
        </lst>
        <lst>
          <str name="word">hayness</str>
          <int name="freq">1</int>
        </lst>
        <lst>
          <str name="word">happiness</str>
          <int name="freq">7788</int>
        </lst>
        <lst>
          <str name="word">harkness</str>
          <int name="freq">324</int>
        </lst>
      </arr>
    </lst>
    <bool name="correctlySpelled">false</bool>
    <lst name="collation">
      <str name="collationQuery">pursuit happyness</str>
      <int name="hits">10</int>
      <lst name="misspellingsAndCorrections">
        <str name="pursut">pursuit</str>
        <str name="hapyness">happyness</str>
      </lst>
    </lst>
  </lst>
</lst>
</response>

Spellcheck is used separately and we are not using any q along with
spellcheck.

Our search query also queries other fields, not just the spellcheck field, and
therefore does not give a good representation of QTime.  We use grouping
in the search query.
For Chocolate Factory, I get a search QTime of 198 ms.
For Pursuit Happyness, I get a search QTime of 318 ms.

Would appreciate your insights.
Thanks.
-- Sandeep




--
View this message in context: 
http://lucene.472066.n3.nabble.com/DirectSolrSpellChecker-vastly-varying-spellcheck-QTime-times-tp4057176p4058086.html
Sent from the Solr - User mailing list archive at Nabble.com.

