Hi,

I don't see 2.3121865 * 2 anywhere in your debug output or something that 
looks like that.
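
For what it's worth, here is a quick sanity check of the quoted explain output, as a rough sketch in Python. The numbers are copied from the debug output further down; the variable names are mine and are not Lucene identifiers. It suggests that idf shows up once inside queryWeight and once inside fieldWeight, so the product contains idf squared rather than idf * 2.

```python
# Values copied from the quoted debugQuery output for text:gb in doc 1.
# Variable names are illustrative only, not Lucene API names.
idf = 2.3121865
query_norm = 0.15502669
tf = 1.4142135
field_norm = 0.15625

# queryWeight and fieldWeight as the explain output breaks them down.
query_weight = idf * query_norm
field_weight = tf * idf * field_norm

# Final score for the term is the product of the two weights.
score = query_weight * field_weight

# The same product regrouped, showing idf entering twice (idf squared).
score_regrouped = tf * idf ** 2 * query_norm * field_norm

print("queryWeight:", query_weight)
print("fieldWeight:", field_weight)
print("score:", score)
print("tf * idf^2 * queryNorm * fieldNorm:", score_regrouped)
```

Up to float rounding, the printed values should agree with the 0.35845062, 0.5109258 and 0.18314168 reported in the explain output.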


> Hi Markus,
> 
> The idf calculation itself is correct.
> What I am trying to understand is why the idf value is multiplied twice
> in the final score calculation. Essentially, tf x idf^2 is used instead
> of tf x idf.
> I'd like to understand the rationale behind that.
> 
> On Wed, Oct 5, 2011 at 9:43 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > In Lucene's default similarity idf = 1 + ln(numDocs / (df + 1)).
> > 1 + ln(26 / 7) =~ 2.3121865
> > 
> > I don't see a problem.
> > 
> > > Hi,
> > > 
> > > 
> > > When I examine the score calculation of DisMax in Solr, it looks to
> > > me that DisMax is using tf x idf^2 instead of tf x idf.
> > > Does anyone have insight into why tf x idf is not used here?
> > > 
> > > Here is the score contribution from one field:
> > > 
> > > score(q,c) =  queryWeight x fieldWeight
> > > 
> > >                = tf x idf x idf x queryNorm x fieldNorm
> > > 
> > > Here is the example that I used to derive the formula above. Clearly,
> > > idf is multiplied twice in the score calculation.
> > > *http://localhost:8983/solr/select/?q=GB&version=2.2&start=0&rows=10&indent=on&debugQuery=true&fl=id,score*
> > > 
> > >     <str name="6H500F0">
> > > 
> > > 0.18314168 = (MATCH) sum of:
> > >   0.18314168 = (MATCH) weight(text:gb in 1), product of:
> > >     0.35845062 = queryWeight(text:gb), product of:
> > >       2.3121865 = idf(docFreq=6, numDocs=26)
> > >       0.15502669 = queryNorm
> > >     
> > >     0.5109258 = (MATCH) fieldWeight(text:gb in 1), product of:
> > >       1.4142135 = tf(termFreq(text:gb)=2)
> > >       2.3121865 = idf(docFreq=6, numDocs=26)
> > >       0.15625 = fieldNorm(field=text, doc=1)
> > > 
> > > </str>
> > > 
> > > 
> > > Thanks!
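
The per-term factors in the quoted explain output can also be reproduced from the raw counts. This is only a sketch: it assumes Lucene's default similarity, where idf = 1 + ln(numDocs / (docFreq + 1)) and tf = sqrt(termFreq), and the function names `default_idf` and `default_tf` are made up for illustration.

```python
import math


def default_idf(doc_freq: int, num_docs: int) -> float:
    """idf as in Lucene's default similarity (assumed): 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def default_tf(term_freq: int) -> float:
    """tf as in Lucene's default similarity (assumed): sqrt(termFreq)."""
    return math.sqrt(term_freq)


# Counts taken from the explain output: docFreq=6, numDocs=26, termFreq=2.
print("idf:", default_idf(6, 26))
print("tf:", default_tf(2))
```

Lucene computes these in single precision, so the last digit may differ slightly from the 2.3121865 and 1.4142135 shown in the explain output.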
