We are upgrading to Solr 4.8 from 3.5, and I was testing search results with
4.8.  I found that with an edismax request handler the pf param is being
treated differently.

In 3.5 it was used as a disjunction max with tiebreaker, i.e. the max score
from all the matching fields was taken and the tiebreaker*sum of other
fields was added to it.  In 4.8 it just adds all the match scores together
(from all the fields).  This is throwing some scores off quite a bit from
3.5.  The configuration for the handler is the same.

I couldn't find any mention of this change searching online, or any mention
of pf being treated differently than qf.  What changed in Solr 4 that caused
this?  Is there any way I can get the old behavior, of taking the max field
score plus the tiebreaker times the sum of the other field scores?  Maybe a
simple change to the format of my request handler?


More details:

With debug query turned on, the results are as follows.  They clearly show
that the max is used with the tiebreaker in 3.5 but not in 4.8 for pf:

query (3.5):
boost(+(((inlink_text:edg^1.2 | body:edg^0.5 | title:edg^1.2 |
meta_description:edg^0.5 | url_path:edg^1.2 | file_name:edg^1.2 |
primary_header:edg^1.2 | secondary_header:edg^0.5)~0.17
(inlink_text:detect^1.2 | body:detect^0.5 | title:detect^1.2 |
meta_description:detect^0.5 | url_path:detect^1.2 | file_name:detect^1.2 |
primary_header:detect^1.2 | secondary_header:detect^0.5)~0.17)~2)*
(inlink_text:"edg detect"~100^1.2 | body:"edg detect"~100^0.5 | title:"edg
detect"~100^1.2 | meta_description:"edg detect"~100^0.5 | url_path:"edg
detect"~100^1.2 | file_name:"edg detect"~100^1.2 | primary_header:"edg
detect"~100^1.2 | secondary_header:"edg
detect"~100^0.5)~0.17*,product(float(hier_score),pow(float(link_score),const(0.25))))

pf results for one (3.5):
<lst>
<bool name="match">true</bool>
<float name="value">1.5689207</float>
<str name="description">max plus 0.17 times others of:</str>
<arr name="details">
<lst>
<bool name="match">true</bool>
<float name="value">1.5596248</float>
<str name="description">...</str>
<arr name="details">...</arr>
</lst>
<lst>
<bool name="match">true</bool>
<float name="value">0.054681662</float>
<str name="description">...</str>
<arr name="details">...</arr>
</lst>
</arr>
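
As a sanity check, the 3.5 value above can be reproduced by hand.  This is
only a sketch of the arithmetic, not Solr code: dismax_score is a name I made
up for illustration, the two inputs are the child scores copied from the
explain output above, and 0.17 is the tie value from the handler.

```python
def dismax_score(field_scores, tie):
    """Max field score plus tie times the sum of the remaining scores."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

# Child scores from the 3.5 explain output; tie from the handler defaults.
score_35 = dismax_score([1.5596248, 0.054681662], tie=0.17)
print(round(score_35, 6))
```

The result agrees with the reported 1.5689207 to within float rounding, so
3.5 really is doing "max plus 0.17 times others" for the pf clause.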


However, in 4.8, "max" and the tie-breaker are nowhere to be seen for the pf
part of the query:
query (4.8):
boost(+(((inlink_text:edg^1.2 | body:edg^0.5 | title:edg^1.2 |
meta_description:edg^0.5 | url_path:edg^1.2 | file_name:edg^1.2 |
primary_header:edg^1.2 | secondary_header:edg^0.5)~0.17
(inlink_text:detect^1.2 | body:detect^0.5 | title:detect^1.2 |
meta_description:detect^0.5 | url_path:detect^1.2 | file_name:detect^1.2 |
primary_header:detect^1.2 | secondary_header:detect^0.5)~0.17)~2)* body:"edg
detect"~100^0.5 title:"edg detect"~100^1.2 url_path:"edg detect"~100^1.2
file_name:"edg detect"~100^1.2 primary_header:"edg detect"~100^1.2
secondary_header:"edg detect"~100^0.5 meta_description:"edg detect"~100^0.5
inlink_text:"edg
detect"~100^1.2*,product(float(hier_score),pow(float(link_score),const(0.25))))

pf results for one (4.8) (no max; both values are just listed under the "sum
of" element):
<lst>
<bool name="match">true</bool>
<float name="value">0.03554287</float>
<str name="description">...</str>
<arr name="details">...</arr>
</lst>
<lst>
<bool name="match">true</bool>
<float name="value">1.0933692</float>
<str name="description">...</str>
<arr name="details">...</arr>
</lst>
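
To put a number on the difference for this document, here is a rough sketch
(plain arithmetic again, not Solr code) comparing a straight sum of the two
4.8 child scores with what the 3.5-style formula would give for the same two
values.  The helper name is hypothetical, and the dismax figure is my own
calculation, not something Solr 4.8 reported.

```python
def dismax_score(field_scores, tie):
    """Max field score plus tie times the sum of the remaining scores."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

# Child scores from the 4.8 explain output above.
scores_48 = [0.03554287, 1.0933692]

summed = sum(scores_48)                      # what 4.8 appears to do for pf
dismax = dismax_score(scores_48, tie=0.17)   # what 3.5 would have done

print(f"sum={summed:.7f} dismax={dismax:.7f} diff={summed - dismax:.7f}")
```

With only two matching fields the gap is small (roughly 1.129 versus 1.099),
but it grows with every extra field that matches the phrase, since each one
is added in full instead of being scaled by the tiebreaker.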



The Solr 4 handler used is the following - it's also the same as the 3.5
one:
<requestHandler class="solr.SearchHandler" name="/sitewide">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.17</float>
    <str name="qf">
      body^0.5 title^1.2 url_path^1.2 file_name^1.2 primary_header^1.2
      secondary_header^0.5 meta_description^0.5 inlink_text^1.2
    </str>
    <str name="pf">
      body^0.5 title^1.2 url_path^1.2 file_name^1.2 primary_header^1.2
      secondary_header^0.5 meta_description^0.5 inlink_text^1.2
    </str>
    <int name="ps">100</int>
    <str name="boost">
      hier_score
    </str>
    <str name="boost">
      pow(link_score,0.25)
    </str>
  </lst>
  <lst name="spellchecker">
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">3</str>
    <str name="buildOnCommit">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>





