Hi,

I've worked around the issue by setting omitNorms=true on the title field. Now
all fieldNorm values are 1.0f and therefore no longer mess up my scores.
This, of course, is hardly a solution, even though I currently do not use
index-time boosts on any field.
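
To make the effect concrete, here is a minimal sketch in Python that redoes the
fieldWeight product from the debug output quoted below. It only uses the
numbers shown there. The function name field_weight is my own, and I am
assuming the explain output is a plain product of tf, idf and fieldNorm with
tf = sqrt(termFreq), which matches the tf values printed in the output.

```python
import math


def field_weight(term_freq, idf, field_norm):
    """fieldWeight as printed by debugQuery: tf * idf * fieldNorm.

    tf is assumed to be sqrt(termFreq), as the debug output suggests
    (termFreq=5 gives tf=2.236068).
    """
    tf = math.sqrt(term_freq)
    return tf * idf * field_norm


# Title field of doc 7, values taken from the debug output.
title_doc7_with_norms = field_weight(1, 8.791729, 0.0)      # fieldNorm as indexed
title_doc7_omit_norms = field_weight(1, 8.791729, 1.0)      # fieldNorm with omitNorms=true

# Content field of doc 1462, norms still enabled on this field.
content_doc1462 = field_weight(1, 6.6516633, 6.1035156e-5)

print(title_doc7_with_norms)
print(title_doc7_omit_norms)
print(content_doc1462)
```

So with a fieldNorm of 0.0 the whole title clause is multiplied away, and with
norms omitted the title fieldWeight for doc 7 is simply tf * idf. The content
value for doc 1462 agrees with the 4.059853E-4 in the debug output to within
float rounding.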

The question remains: why does the title field return a fieldNorm=0 for many
queries? And a follow-up question: does the Luke request handler return boost
values for documents? I know I get boost values for fields, but I haven't seen
boost values for documents.

Cheers,

On Wednesday 03 November 2010 20:44:48 Markus Jelsma wrote:
> > Regarding "Negative or zero value for fieldNorm", I don't see any
> > negative fieldNorms here... just very small positive ones?
> 
> Of course, you're right. The E-# got twisted in my mind and became
> negative. Silly me.
> 
> > Anyway the fieldNorm is the product of the lengthNorm and the
> > index-time boost of the field (which is itself the product of the
> > index time boost on the document and the index time boost of all
> > instances of that field).  Index time boosts default to "1" though, so
> > they have no effect unless something has explicitly set a boost.
> 
> I've just checked docs 7 and 1462 (first and second, respectively, in the
> debug output below) with Luke. The title and content fields have no index
> time boosts, thus defaulting to 1.0f, which is fine.
> 
> Then why does doc 7 have a fieldNorm of 0.0 on title (which gives the doc a
> score of 0.0 in the result set), and why does doc 1462 have a very small
> fieldNorm?
> 
> debugOutput for doc 7:
> 0.0 = fieldNorm(field=title, doc=7)
> 
> Luke on the title field of doc 7.
> <float name="boost">1.0</float>
> 
> Thanks for your reply!
> 
> > -Yonik
> > http://www.lucidimagination.com
> > 
> > 
> > 
> > On Wed, Nov 3, 2010 at 2:30 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> > > Hi all,
> > > 
> > > I've got a puzzling issue here. During tests I noticed a document at
> > > the bottom of the results where it should not be. I query using DisMax
> > > on the title and content fields and have a boost on title using qf. Out
> > > of 30 results, only two documents also have the term in the title.
> > > 
> > > Using debugQuery and fl=*,score I quickly noticed a large negative
> > > maxScore for the complete result set, and a portion of the result set
> > > where scores come out as zero because of a product with 0 (fieldNorm).
> > > 
> > > See below for debug output for a result with score = 0:
> > > 
> > > 0.0 = (MATCH) sum of:
> > >  0.0 = (MATCH) max of:
> > >    0.0 = (MATCH) weight(content:kunstgrasveld in 7), product of:
> > >      0.75658196 = queryWeight(content:kunstgrasveld), product of:
> > >        6.6516633 = idf(docFreq=33, maxDocs=9682)
> > >        0.113743275 = queryNorm
> > >      
> > >      0.0 = (MATCH) fieldWeight(content:kunstgrasveld in 7), product of:
> > >        2.236068 = tf(termFreq(content:kunstgrasveld)=5)
> > >        6.6516633 = idf(docFreq=33, maxDocs=9682)
> > >        0.0 = fieldNorm(field=content, doc=7)
> > >    
> > >    0.0 = (MATCH) fieldWeight(title:kunstgrasveld in 7), product of:
> > >      1.0 = tf(termFreq(title:kunstgrasveld)=1)
> > >      8.791729 = idf(docFreq=3, maxDocs=9682)
> > >      0.0 = fieldNorm(field=title, doc=7)
> > > 
> > > And one with a negative score:
> > > 
> > > 3.0716116E-4 = (MATCH) sum of:
> > >  3.0716116E-4 = (MATCH) max of:
> > >    3.0716116E-4 = (MATCH) weight(content:kunstgrasveld in 1462), product of:
> > >      0.75658196 = queryWeight(content:kunstgrasveld), product of:
> > >        6.6516633 = idf(docFreq=33, maxDocs=9682)
> > >        0.113743275 = queryNorm
> > >      
> > >      4.059853E-4 = (MATCH) fieldWeight(content:kunstgrasveld in 1462), product of:
> > >        1.0 = tf(termFreq(content:kunstgrasveld)=1)
> > >        6.6516633 = idf(docFreq=33, maxDocs=9682)
> > >        6.1035156E-5 = fieldNorm(field=content, doc=1462)
> > > 
> > > There are no funky issues with term analysis for the text fieldType; in
> > > fact, the term passes through unchanged. I don't use omitNorms, I store
> > > termVectors, etc.
> > > 
> > > Because fieldNorm = fieldBoost / sqrt(numTermsForField), I suspect my
> > > input from Nutch is messed up. A fieldNorm can never be <= 0 for a
> > > normal positive boost, and field boosts should not be zero or negative
> > > (correct me if I'm wrong). But since I can't yet figure out what field
> > > boosts Nutch sends to me, I thought I'd drop by on this mailing list
> > > first.
> > > 
> > > There are quite a few query terms that return zero or negative
> > > scores and many that behave as I expect. I also find it a bit hard to
> > > comprehend why the docs with a negative score rank higher in the result
> > > set than documents with a zero score. Sorting defaults to score DESC,
> > > but this is perhaps another issue.
> > > 
> > > Anyway, the test runs on a Solr 1.4.1 instance with Java 6 under the
> > > hood. Help or directions are appreciated =)
> > > 
> > > Cheers,
> > > 
> > > --
> > > Markus Jelsma - CTO - Openindex
> > > http://www.linkedin.com/in/markus17
> > > 050-8536600 / 06-50258350

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536600 / 06-50258350
