Ok, here is the calculation of the score:

0.18314168 = *2.3121865* x 0.15502669 x 1.4142135 x *2.3121865* x 0.15625

2.3121865 (the idf) is multiplied in twice here. That is what I mean when I
say tf x idf^2 is used instead of tf x idf.
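
As a quick sanity check, here is a small Python sketch that multiplies the
factors from the debug output quoted below. The variable names are my own;
only the numbers come from Solr:

```python
import math

# Factors copied from the debugQuery output quoted further down in this thread.
idf = 2.3121865         # idf(docFreq=6, numDocs=26)
query_norm = 0.15502669
tf = 1.4142135          # tf(termFreq(text:gb)=2)
field_norm = 0.15625

query_weight = idf * query_norm        # queryWeight(text:gb)
field_weight = tf * idf * field_norm   # fieldWeight(text:gb in 1)
score = query_weight * field_weight    # idf enters through both factors

# The same product, regrouped to make the idf^2 term explicit.
score_regrouped = tf * idf ** 2 * query_norm * field_norm

print(query_weight, field_weight, score)
print(math.isclose(score, score_regrouped))
```

Both groupings agree with the 0.18314168 that Solr reports, up to float
rounding.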



On Wed, Oct 5, 2011 at 10:42 AM, Markus Jelsma
<markus.jel...@openindex.io>wrote:

> Hi,
>
> I don't see 2.3121865 * 2 anywhere in your debug output or something that
> looks like that.
>
>
> > Hi Markus,
> >
> > The idf calculation itself is correct.
> > What I am trying to understand here is why the idf value is multiplied twice
> > in the final score calculation. Essentially, tf x idf^2 is used instead
> > of tf x idf.
> > I'd like to understand the rationale behind that.
> >
> > On Wed, Oct 5, 2011 at 9:43 AM, Markus Jelsma
> <markus.jel...@openindex.io>wrote:
> > > In Lucene's default similarity, idf = 1 + ln(numDocs / (df + 1)).
> > > 1 + ln(26 / 7) =~ 2.3121865
> > >
> > > I don't see a problem.
> > >
> > > > Hi,
> > > >
> > > >
> > > > When I examine the score calculation of DisMax in Solr, it looks to
> > > > me that DisMax is using tf x idf^2 instead of tf x idf.
> > > > Does anyone have insight into why tf x idf is not used here?
> > > >
> > > > Here is the score contribution from one field:
> > > >
> > > > score(q,c) =  queryWeight x fieldWeight
> > > >
> > > >                = tf x idf x idf x queryNorm x fieldNorm
> > > >
> > > > Here is the example that I used to derive the formula above. Clearly,
> > > > idf is multiplied twice in the score calculation.
> > > >
> > > > http://localhost:8983/solr/select/?q=GB&version=2.2&start=0&rows=10&indent=on&debugQuery=true&fl=id,score
> > > >
> > > >     <str name="6H500F0">
> > > >
> > > > 0.18314168 = (MATCH) sum of:
> > > >   0.18314168 = (MATCH) weight(text:gb in 1), product of:
> > > >     0.35845062 = queryWeight(text:gb), product of:
> > > >       2.3121865 = idf(docFreq=6, numDocs=26)
> > > >       0.15502669 = queryNorm
> > > >
> > > >     0.5109258 = (MATCH) fieldWeight(text:gb in 1), product of:
> > > >       1.4142135 = tf(termFreq(text:gb)=2)
> > > >       2.3121865 = idf(docFreq=6, numDocs=26)
> > > >       0.15625 = fieldNorm(field=text, doc=1)
> > > >
> > > > </str>
> > > >
> > > >
> > > > Thanks!
>
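
For reference, the idf and tf values in the quoted debug output can be
reproduced from the raw counts. This is a minimal sketch, assuming the default
similarity uses idf = 1 + ln(numDocs / (docFreq + 1)) and tf = sqrt(termFreq).
The function names are mine, not Lucene's:

```python
import math

def idf(num_docs, doc_freq):
    # Assumed default-similarity form: 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def tf(term_freq):
    # Assumed default-similarity form: sqrt(termFreq)
    return math.sqrt(term_freq)

# Counts taken from the debug output: docFreq=6, numDocs=26, termFreq=2.
idf_gb = idf(num_docs=26, doc_freq=6)
tf_gb = tf(term_freq=2)

print(idf_gb, tf_gb)
```

So the individual factors look right to me. My question is only about idf
appearing in both queryWeight and fieldWeight.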
