On Jul 29, 2009, at 6:55 AM, Vincent Pérès wrote:
Using the following query:
http://localhost:8983/solr/others/select/?debugQuery=true&q=anna%20lewis&rows=20&start=0&fl=*&qt=dismax
I get back around 100 results. Here are the first two:
<doc>
<str name="id">Person:151</str>
<str name="name_s">Victoria Davisson</str>
</doc>
<doc>
<str name="id">Person:37</str>
<str name="name_s">Anna Lewis</str>
</doc>
And the related debug output:
57.998047 = (MATCH) sum of:
  0.048290744 = (MATCH) sum of:
    0.024546575 = (MATCH) max plus 0.01 times others of:
      0.024546575 = (MATCH) weight(text:anna^0.5 in 64288), product of:
        0.027395602 = queryWeight(text:anna^0.5), product of:
          0.5 = boost
          5.734427 = idf(docFreq=564, numDocs=30400)
          0.009554783 = queryNorm
        0.8960042 = (MATCH) fieldWeight(text:anna in 64288), product of:
          1.0 = tf(termFreq(text:anna)=1)
          5.734427 = idf(docFreq=564, numDocs=30400)
          0.15625 = fieldNorm(field=text, doc=64288)
    0.02374417 = (MATCH) max plus 0.01 times others of:
      0.02374417 = (MATCH) weight(text:lewi^0.5 in 64288), product of:
        0.026944114 = queryWeight(text:lewi^0.5), product of:
          0.5 = boost
          5.6399217 = idf(docFreq=620, numDocs=30400)
          0.009554783 = queryNorm
        0.88123775 = (MATCH) fieldWeight(text:lewi in 64288), product of:
          1.0 = tf(termFreq(text:lewi)=1)
          5.6399217 = idf(docFreq=620, numDocs=30400)
          0.15625 = fieldNorm(field=text, doc=64288)
  57.949757 = (MATCH) FunctionQuery(ord(name_s)), product of:
    1213.0 = ord(name_s)=1213
    5.0 = boost
    0.009554783 = queryNorm
5.006892 = (MATCH) sum of:
  0.038405567 = (MATCH) sum of:
    0.021955125 = (MATCH) max plus 0.01 times others of:
      0.021955125 = (MATCH) weight(text:anna^0.5 in 62632), product of:
        0.027395602 = queryWeight(text:anna^0.5), product of:
          0.5 = boost
          5.734427 = idf(docFreq=564, numDocs=30400)
          0.009554783 = queryNorm
        0.80141056 = (MATCH) fieldWeight(text:anna in 62632), product of:
          2.236068 = tf(termFreq(text:anna)=5)
          5.734427 = idf(docFreq=564, numDocs=30400)
          0.0625 = fieldNorm(field=text, doc=62632)
    0.016450444 = (MATCH) max plus 0.01 times others of:
      0.016450444 = (MATCH) weight(text:lewi^0.5 in 62632), product of:
        0.026944114 = queryWeight(text:lewi^0.5), product of:
          0.5 = boost
          5.6399217 = idf(docFreq=620, numDocs=30400)
          0.009554783 = queryNorm
        0.61053944 = (MATCH) fieldWeight(text:lewi in 62632), product of:
          1.7320508 = tf(termFreq(text:lewi)=3)
          5.6399217 = idf(docFreq=620, numDocs=30400)
          0.0625 = fieldNorm(field=text, doc=62632)
  4.968487 = (MATCH) FunctionQuery(ord(name_s)), product of:
    104.0 = ord(name_s)=104
    5.0 = boost
    0.009554783 = queryNorm
I'm using a simple boost function:
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">
      text^0.5 name_s^5.0
    </str>
    <str name="pf">
      name_s^5.0
    </str>
    <str name="bf">
      name_s^5.0
    </str>
  </lst>
</requestHandler>
Can anyone explain to me why the first result is on top (the query is
'anna lewis') with a huge weight although nothing in it is related (it
seems the weight comes from the name_s field...)?
The ord function perhaps isn't doing what you want. It returns the
ordinal position of the document's term within the sorted list of all
terms indexed for that field, and thus it appears "Anna Lewis" is the
104th name_s value in your index lexicographically. And of course
"Victoria Davisson" is much further down, at the 1213th position. Maybe
you want rord instead? But probably not...
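To see how much the function query dominates, the arithmetic from the
debug output can be reproduced by hand. This is only a sketch: the
numbers are copied from the explain output above, and the helper name
function_score is made up for illustration, not part of Solr.

```python
# Values copied from the debugQuery output in this thread.
QUERY_NORM = 0.009554783
BOOST = 5.0  # the ^5.0 on name_s in the bf parameter


def function_score(ord_value, boost=BOOST, query_norm=QUERY_NORM):
    """Contribution of FunctionQuery(ord(name_s)): ord * boost * queryNorm.

    Hypothetical helper mirroring the 'product of' lines in the explain.
    """
    return ord_value * boost * query_norm


# Text-match part of each score, also copied from the explain output.
victoria_text = 0.048290744  # doc 64288, "Victoria Davisson"
anna_text = 0.038405567      # doc 62632, "Anna Lewis"

victoria_total = victoria_text + function_score(1213)
anna_total = anna_text + function_score(104)

print(round(victoria_total, 3))  # 57.998
print(round(anna_total, 3))      # 5.007
```

The text match contributes less than 0.05 to either document, so the
ranking is decided almost entirely by where name_s falls in sorted
order.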
A second general question... is it possible to boost a field if the
query matches the content of a field exactly?
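You can set a slop factor on dismax, which will boost documents where
the user's terms are closer together (within the number of term
positions specified). Note that qs (query slop) applies to phrases the
user typed explicitly in the query, while ps (phrase slop) applies to
the phrase boost over the pf fields.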
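As a rough sketch of what such a request might look like, the snippet
below builds a select URL that sets both slop parameters of the dismax
parser: qs (slop for phrases typed explicitly in the query) and ps
(slop for the phrase boost over the pf fields). The host and core are
taken from the query earlier in this thread; the pf value text^5.0 and
the slop of 2 are assumptions chosen for illustration, and build_url is
a made-up helper name.

```python
from urllib.parse import urlencode

# Endpoint taken from the original query in this thread.
BASE = "http://localhost:8983/solr/others/select/"


def build_url(user_query, slop):
    """Build a dismax request URL with phrase boosting and slop.

    Hypothetical helper; pf=text^5.0 is an assumed setting, not the
    poster's actual configuration.
    """
    params = {
        "q": user_query,
        "qt": "dismax",
        "pf": "text^5.0",  # assumed: phrase-boost on the tokenized field
        "ps": slop,        # slop for the pf phrase boost
        "qs": slop,        # slop for explicit phrases in the user query
        "debugQuery": "true",
    }
    return BASE + "?" + urlencode(params)


url = build_url("anna lewis", 2)
print(url)
```

Running the request with debugQuery=true, as in the original query,
should show whether the phrase boost actually contributes to the score.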
Erik