Hi Markus,

The idf calculation itself is correct. What I am trying to understand is why the idf value is multiplied twice in the final score calculation. Essentially, tf x idf^2 is used instead of tf x idf. I'd like to understand the rationale behind that.
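In case it helps, here is a small self-contained sketch (the class and method names are mine, purely for illustration) that reproduces the numbers from the debugQuery output quoted below, using the idf formula from your reply. It shows where idf enters the product twice: once via queryWeight and once via fieldWeight.

// Illustrative sketch only; IdfSquaredDemo is not part of Lucene.
// queryNorm and fieldNorm are copied from the debug output below.
public class IdfSquaredDemo {

    // idf = 1 + ln(numDocs / (docFreq + 1)), as in your reply
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // tf = sqrt(termFreq)
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    public static void main(String[] args) {
        double idfValue  = idf(6, 26);   // ~2.3121865
        double queryNorm = 0.15502669;   // from the debug output
        double fieldNorm = 0.15625;      // from the debug output

        double queryWeight = idfValue * queryNorm;           // ~0.35845062
        double fieldWeight = tf(2) * idfValue * fieldNorm;   // ~0.5109258
        double score       = queryWeight * fieldWeight;      // ~0.18314168

        // idf appears once in queryWeight and once in fieldWeight,
        // so their product contains tf x idf^2 x queryNorm x fieldNorm.
        System.out.printf("queryWeight=%.8f fieldWeight=%.7f score=%.8f%n",
                queryWeight, fieldWeight, score);
    }
}

Running this prints values that match the explain output below (up to float rounding), which is why I read the final score as tf x idf^2 rather than tf x idf.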
On Wed, Oct 5, 2011 at 9:43 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> In Lucene's default similarity idf = 1 + ln(numDocs / (df + 1)).
> 1 + ln(26 / 7) =~ 2.3121865
>
> I don't see a problem.
>
> > Hi,
> >
> > When I examine the score calculation of DisMax in Solr, it looks to me
> > that DisMax is using tf x idf^2 instead of tf x idf.
> > Does anyone have insight why tf x idf is not used here?
> >
> > Here is the score contribution from one field:
> >
> > score(q,c) = queryWeight x fieldWeight
> >            = tf x idf x idf x queryNorm x fieldNorm
> >
> > Here is the example that I used to derive the formula above. Clearly, idf
> > is multiplied twice in the score calculation.
> > http://localhost:8983/solr/select/?q=GB&version=2.2&start=0&rows=10&indent=on&debugQuery=true&fl=id,score
> >
> > <str name="6H500F0">
> > 0.18314168 = (MATCH) sum of:
> >   0.18314168 = (MATCH) weight(text:gb in 1), product of:
> >     0.35845062 = queryWeight(text:gb), product of:
> >       2.3121865 = idf(docFreq=6, numDocs=26)
> >       0.15502669 = queryNorm
> >     0.5109258 = (MATCH) fieldWeight(text:gb in 1), product of:
> >       1.4142135 = tf(termFreq(text:gb)=2)
> >       2.3121865 = idf(docFreq=6, numDocs=26)
> >       0.15625 = fieldNorm(field=text, doc=1)
> > </str>
> >
> > Thanks!