SOLR 7.1 ClassicSimilarityFactory Problem

Hodder, Rick Fri, 20 Jul 2018 08:41:57 -0700

I am using SOLR 7.1 with ClassicSimilarityFactory.
I have data in my core in a field called CompanyName, which is copied to an 
indexed field, IDX_CompanyName:


<field name="IDX_CompanyName" type="text_general" indexed="true" 
stored="false" multiValued="true" />
<field name="CompanyName" type="string" indexed="true" stored="true"/>
<copyField source="CompanyName" dest="IDX_CompanyName"/>

Here are a few of the 900,000 rows in the core

Cityview
Citadel
CivicVentures
Clutch City Sports
Clutch City Sports & Entertainment
Clutch City Sports & Entertainment
Clutch City Sports & Entertainment


If I search for IDX_CompanyName:(clutch AND city) with fl=*,score and a max rows 
of 750 or 1500, I get the following results:

CompanyName                              Score
Cityview                                 5.874983
Citadel                                  5.3502507
CivicVentures                            4.7278214
<other rows, but no clutch city>

If I search for IDX_CompanyName:(clutch AND city) with a max rows of 5000, I get 
the following results:

CompanyName                              Score
Cityview                                 5.874983
Citadel                                  5.3502507
CivicVentures                            4.7278214
Clutch City Sports & Entertainment       3.6542892
Clutch City Sports & Entertainment       3.6542892
Clutch City Sports & Entertainment       3.6542892
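
For anyone who wants to reproduce the comparison, here is a minimal sketch of the three requests. It assumes that "max rows" maps to Solr's standard rows parameter and that the queried field is IDX_CompanyName as defined in the schema above. The host, port, and core name (companies) are hypothetical placeholders, not taken from my setup.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; substitute the real ones.
BASE_URL = "http://localhost:8983/solr/companies/select"


def build_query_url(rows):
    """Build a select URL for the query in this post with the given rows value."""
    params = {
        "q": "IDX_CompanyName:(clutch AND city)",
        "fl": "*,score",        # return all stored fields plus the score
        "rows": rows,           # assumed equivalent of "max rows"
        "debugQuery": "true",   # include the score explanation in the response
    }
    return BASE_URL + "?" + urlencode(params)


# Print one URL per rows value compared in this post.
for rows in (750, 1500, 5000):
    print(build_query_url(rows))
```

Only the rows value differs between the three URLs, so any change in ordering comes from that parameter alone.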

I've tried looking at the debug query to figure out what it's doing, and I'm 
confused by what it is saying.

The debug info for Cityview is

<str name="366640">
5.874983 = sum of:
  1.9583277 = weight(Synonym(IDX_CompanyName:c IDX_CompanyName:cl 
IDX_CompanyName:clu IDX_CompanyName:clut IDX_CompanyName:clutc 
IDX_CompanyName:clutch) in 16639) [ClassicSimilarity], result of:
    1.9583277 = fieldWeight in 16639, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      1.9583277 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        166407.0 = docFreq
        433880.0 = docCount
      1.0 = fieldNorm(doc=16639)
  3.9166553 = weight(Synonym(IDX_CompanyName:c IDX_CompanyName:ci 
IDX_CompanyName:cit IDX_CompanyName:city) in 16639) [ClassicSimilarity], result of:
    3.9166553 = fieldWeight in 16639, product of:
      2.0 = tf(freq=4.0), with freq of:
        4.0 = termFreq=4.0
      1.9583277 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        166407.0 = docFreq
        433880.0 = docCount
      1.0 = fieldNorm(doc=16639)
</str>

The debug info for Clutch City Sports & Entertainment is

<str name="409550">
3.6542892 = sum of:
  1.9583277 = weight(Synonym(IDX_CompanyName:c IDX_CompanyName:cl 
IDX_CompanyName:clu IDX_CompanyName:clut IDX_CompanyName:clutc 
IDX_CompanyName:clutch) in 9549) [ClassicSimilarity], result of:
    1.9583277 = fieldWeight in 9549, product of:
      2.828427 = tf(freq=8.0), with freq of:
        8.0 = termFreq=8.0
      1.9583277 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        166407.0 = docFreq
        433880.0 = docCount
      0.35355338 = fieldNorm(doc=9549)
  1.6959615 = weight(Synonym(IDX_CompanyName:c IDX_CompanyName:ci 
IDX_CompanyName:cit IDX_CompanyName:city) in 9549) [ClassicSimilarity], result of:
    1.6959615 = fieldWeight in 9549, product of:
      2.4494898 = tf(freq=6.0), with freq of:
        6.0 = termFreq=6.0
      1.9583277 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        166407.0 = docFreq
        433880.0 = docCount
      0.35355338 = fieldNorm(doc=9549)
</str>
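
As a sanity check on the two explain outputs above, the arithmetic can be reproduced outside Solr. This is only a sketch built from the formulas printed in the debug output (idf as log((docCount+1)/(docFreq+1)) + 1, and fieldWeight as the product of tf, idf and fieldNorm). It assumes tf is the square root of the term frequency, which matches the tf values shown. The function name is mine, not part of any Solr API.

```python
import math

DOC_FREQ = 166407    # docFreq from the explain output
DOC_COUNT = 433880   # docCount from the explain output


def classic_field_weight(freq, field_norm):
    """fieldWeight as printed in the explain output: tf * idf * fieldNorm."""
    tf = math.sqrt(freq)  # assumed: tf(freq) = sqrt(termFreq)
    idf = math.log((DOC_COUNT + 1) / (DOC_FREQ + 1)) + 1  # natural log
    return tf * idf * field_norm


# Cityview (doc 16639): termFreq 1 and 4, fieldNorm 1.0
cityview = classic_field_weight(1, 1.0) + classic_field_weight(4, 1.0)

# Clutch City Sports & Entertainment (doc 9549): termFreq 8 and 6,
# fieldNorm 0.35355338
clutch_city = (classic_field_weight(8, 0.35355338)
               + classic_field_weight(6, 0.35355338))

print(round(cityview, 4), round(clutch_city, 4))
```

In these numbers the idf is identical for both documents and the term frequencies are higher for the Clutch City row, so the lower total comes from the fieldNorm factor (1.0 versus 0.35355338, which is numerically 1/sqrt(8)).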

Why would a document that contains both search words score lower? Why does the 
max rows value influence this?

How might I fix this?

This didn't use to happen in SOLR 4.10 (I know it's an older version, but...)


Thanks,

Rick Hodder
Information Technology
Navigators Management Company, Inc.
83 Wooster Heights Road, 2nd Floor
Danbury, CT  06810
(475) 329-6251
