Hi Markus,

The idf calculation itself is correct.
What I am trying to understand is why the idf value is multiplied in twice in
the final score calculation. Essentially, tf x idf^2 is used instead of
tf x idf.
I'd like to understand the rationale behind that.
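
To make the arithmetic concrete, here is a minimal Python sketch (not Solr
code) that reproduces the numbers in the debug output quoted below. It
assumes tf = sqrt(termFreq) and idf = 1 + ln(numDocs / (docFreq + 1)), which
match the tf and idf values shown in that output; queryNorm and fieldNorm are
copied from the output rather than computed, and the function names are only
illustrative.

```python
import math


def idf(doc_freq: int, num_docs: int) -> float:
    # Assumed default-similarity idf: 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def tf(term_freq: int) -> float:
    # Assumed default-similarity tf: sqrt(termFreq)
    return math.sqrt(term_freq)


# Values copied verbatim from the debugQuery output (not recomputed here).
QUERY_NORM = 0.15502669
FIELD_NORM = 0.15625

idf_gb = idf(doc_freq=6, num_docs=26)

# idf enters once through the query side ...
query_weight = idf_gb * QUERY_NORM
# ... and once more through the field (document) side.
field_weight = tf(2) * idf_gb * FIELD_NORM

score = query_weight * field_weight

# The same score written as a single product, with idf squared.
score_expanded = tf(2) * idf_gb ** 2 * QUERY_NORM * FIELD_NORM

print(f"idf          = {idf_gb:.7f}")
print(f"queryWeight  = {query_weight:.7f}")
print(f"fieldWeight  = {field_weight:.7f}")
print(f"score        = {score:.7f}")
```

The two forms agree up to floating-point rounding, which is what I mean by
tf x idf^2: one idf factor comes from queryWeight and the other from
fieldWeight.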





On Wed, Oct 5, 2011 at 9:43 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> In Lucene's default similarity, idf = 1 + ln(numDocs / (df + 1)).
> With numDocs = 26 and df = 6: 1 + ln(26 / 7) =~ 2.3121865
>
> I don't see a problem.
>
> > Hi,
> >
> >
> > When I examine the score calculation of DisMax in Solr, it looks to me
> > that DisMax is using tf x idf^2 instead of tf x idf.
> > Does anyone have insight into why tf x idf is not used here?
> >
> > Here is the score contribution from one field:
> >
> > score(q,c) =  queryWeight x fieldWeight
> >                = tf x idf x idf x queryNorm x fieldNorm
> >
> > Here is the example that I used to derive the formula above. Clearly, idf
> > is multiplied twice in the score calculation.
> >
> > http://localhost:8983/solr/select/?q=GB&version=2.2&start=0&rows=10&indent=on&debugQuery=true&fl=id,score
> >
> >     <str name="6H500F0">
> > 0.18314168 = (MATCH) sum of:
> >   0.18314168 = (MATCH) weight(text:gb in 1), product of:
> >     0.35845062 = queryWeight(text:gb), product of:
> >       2.3121865 = idf(docFreq=6, numDocs=26)
> >       0.15502669 = queryNorm
> >     0.5109258 = (MATCH) fieldWeight(text:gb in 1), product of:
> >       1.4142135 = tf(termFreq(text:gb)=2)
> >       2.3121865 = idf(docFreq=6, numDocs=26)
> >       0.15625 = fieldNorm(field=text, doc=1)
> > </str>
> >
> >
> > Thanks!
>
