That means that, for that document, "china" occurs in the title, whereas "snowden" occurs in the content but not in the title, so there is no title match for the boost to apply to.
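
As a rough check, the numbers in the debug output quoted further down this thread can be recombined by hand. The sketch below is only an illustration: the variable and function names are made up here, and only the numeric values come from the explain output. It assumes what the "product of", "max of" and "sum of" lines show: each per-field score is queryWeight times fieldWeight, each term takes the max over the fields it matched, and the document score is the sum over terms.

```python
# Values copied from the debugQuery explain output for doc 249 in this thread.
# Each per-field score is queryWeight * fieldWeight.
china_content = 0.019873314 * 0.039825942    # content:china, no boost
china_title = 0.5836803 * 0.45842302         # title:china^10.0, queryWeight includes the boost
snowden_content = 0.03407834 * 0.024145111   # content:snowden, no boost
# There is no title:snowden entry: "snowden" did not match the title of doc 249.


def term_score(field_scores):
    """Score for one query term: the max over the fields it matched ("max of")."""
    return max(field_scores)


china = term_score([china_content, china_title])
snowden = term_score([snowden_content])
total = china + snowden                      # "sum of" over the two terms

print(f"china   = {china:.8f}")
print(f"snowden = {snowden:.8f}")
print(f"total   = {total:.8f}")
print(f"share of score from title:china = {china_title / total:.1%}")
```

The boosted title match on "china" accounts for nearly all of the score, while "snowden" contributes only its unboosted content match.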

-- Jack Krupansky

-----Original Message----- From: Joe Zhang
Sent: Tuesday, July 23, 2013 12:52 AM
To: solr-user@lucene.apache.org
Subject: Re: Question about field boost

Is my reading correct that the boost is only applied on "china" but not
"snowden"? How can that be?

My query is: q=china+snowden&qf=title^10 content


On Mon, Jul 22, 2013 at 9:43 PM, Joe Zhang <smartag...@gmail.com> wrote:

Thanks for your hint, Jack. Here are the debug results, which I'm having a
hard time deciphering (the two terms are "china" and "snowden")...

0.26839527 = (MATCH) sum of:
  0.26839527 = (MATCH) sum of:
    0.26757246 = (MATCH) max of:
      7.9147343E-4 = (MATCH) weight(content:china in 249), product of:
        0.019873314 = queryWeight(content:china), product of:
          1.6649085 = idf(docFreq=46832, maxDocs=91058)
          0.01193658 = queryNorm
        0.039825942 = (MATCH) fieldWeight(content:china in 249), product of:
          4.8989797 = tf(termFreq(content:china)=24)
          1.6649085 = idf(docFreq=46832, maxDocs=91058)
          0.0048828125 = fieldNorm(field=content, doc=249)
      0.26757246 = (MATCH) weight(title:china^10.0 in 249), product of:
        0.5836803 = queryWeight(title:china^10.0), product of:
          10.0 = boost
          4.8898454 = idf(docFreq=1861, maxDocs=91058)
          0.01193658 = queryNorm
        0.45842302 = (MATCH) fieldWeight(title:china in 249), product of:
          1.0 = tf(termFreq(title:china)=1)
          4.8898454 = idf(docFreq=1861, maxDocs=91058)
          0.09375 = fieldNorm(field=title, doc=249)
    8.2282536E-4 = (MATCH) max of:
      8.2282536E-4 = (MATCH) weight(content:snowden in 249), product of:
        0.03407834 = queryWeight(content:snowden), product of:
          2.8549502 = idf(docFreq=14246, maxDocs=91058)
          0.01193658 = queryNorm
        0.024145111 = (MATCH) fieldWeight(content:snowden in 249), product of:
          1.7320508 = tf(termFreq(content:snowden)=3)
          2.8549502 = idf(docFreq=14246, maxDocs=91058)
          0.0048828125 = fieldNorm(field=content, doc=249)


On Mon, Jul 22, 2013 at 9:27 PM, Jack Krupansky <j...@basetechnology.com> wrote:

Maybe you're not doing anything wrong - other than having an artificial
expectation of what the true relevance of your data actually is. Many
factors go into relevance scoring. You need to look at all aspects of your
data.

Maybe your terms don't occur in your titles the way you think they do.

Maybe you need a boost of 500 or more...

Lots of potential maybes.

Relevancy tuning is an art and craft, hardly a science.

Step one: Know your data, inside and out.

Use the debugQuery=true parameter on your queries and see how much of the
score is dominated by your query terms in the non-title fields.
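
For illustration, here is one way such a request URL could be put together. This is a minimal sketch: the host, port and core name ("collection1") are assumptions and need to be adjusted to the actual installation, while the query parameters are the ones discussed in this thread.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and core name are assumptions.
base_url = "http://localhost:8983/solr/collection1/select"

params = {
    "defType": "dismax",        # use the dismax query parser
    "q": "china snowden",       # the two query terms
    "qf": "title^10 content",   # search both fields, boosting title by 10
    "debugQuery": "true",       # ask Solr to include the score explanation
}

# urlencode percent-encodes the values (space becomes "+", "^" becomes "%5E").
url = base_url + "?" + urlencode(params)
print(url)
```

The score explanation for each returned document then shows how much each field and term contributed.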

-- Jack Krupansky

-----Original Message----- From: Joe Zhang
Sent: Monday, July 22, 2013 11:06 PM
To: solr-user@lucene.apache.org
Subject: Question about field boost


Dear Solr experts:

Here is my query:

defType=dismax&q=term1+term2&qf=title^100 content

My intention (or so I thought) is to boost the title field. While I'm
getting some non-trivial results, I'm surprised that the documents with
both term1 and term2 in the title (I know such docs do exist in my
repository) were not returned (or were perhaps ranked very low). The
situation does not change even when I use much larger boost factors.

What am I doing wrong?



