Hi,

I've worked around the issue by setting omitNorms=true on the title field. Now all fieldNorm values are 1.0f and therefore no longer mess up my scores. This, of course, is hardly a solution, even though I currently do not use index-time boosts on any field.
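To make the effect concrete, here is a rough Python sketch of the arithmetic in the explain output quoted further down. It only multiplies the numbers Solr reported; the function names are my own and not part of Solr or Lucene, and the term count passed to field_norm is a made-up illustration. It also ignores that Lucene stores the norm in a single lossy byte, so treat it as an approximation of the scoring, not the actual implementation.

```python
import math


def field_norm(field_boost: float, num_terms: int) -> float:
    """Unencoded norm: fieldBoost / sqrt(numTermsForField)."""
    return field_boost / math.sqrt(num_terms)


def field_weight(tf: float, idf: float, norm: float) -> float:
    """fieldWeight as shown in the explain output: tf * idf * fieldNorm."""
    return tf * idf * norm


# Values copied from the debugQuery output for content:kunstgrasveld.
idf_content = 6.6516633
query_norm = 0.113743275
query_weight = idf_content * query_norm  # reported as 0.75658196

# Doc 1462: termFreq=1, so tf = sqrt(1) = 1.0; fieldNorm is tiny but positive.
fw_1462 = field_weight(1.0, idf_content, 6.1035156e-5)
score_1462 = query_weight * fw_1462  # small, but still greater than zero

# Doc 7: termFreq=5, so tf = sqrt(5); a fieldNorm of 0.0 zeroes the product.
tf_doc7 = math.sqrt(5)
fw_7 = field_weight(tf_doc7, idf_content, 0.0)

# With omitNorms=true the norm is effectively 1.0, leaving plain tf * idf.
fw_7_no_norms = field_weight(tf_doc7, idf_content, 1.0)

# Hypothetical: boost 1.0 on a 4-term field gives a positive unencoded norm.
example_norm = field_norm(1.0, 4)

if __name__ == "__main__":
    print("doc 1462 fieldWeight:", fw_1462)
    print("doc 1462 score:", score_1462)
    print("doc 7 fieldWeight:", fw_7)
    print("doc 7 fieldWeight without norms:", fw_7_no_norms)
    print("example unencoded norm:", example_norm)
```

With a boost of 1.0 and a positive term count, the unencoded formula cannot produce 0.0, which is why the zero reported for doc 7 still puzzles me.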
The question remains: why does the title field return a fieldNorm=0 for many queries? And a subquestion: does the Luke request handler return boost values for documents? I know I get boost values for fields, but I haven't seen boost values for documents.

Cheers,

On Wednesday 03 November 2010 20:44:48 Markus Jelsma wrote:
> > Regarding "Negative or zero value for fieldNorm", I don't see any
> > negative fieldNorms here... just very small positive ones?
>
> Of course, you're right. The E-# got twisted in my mind and became
> negative. Silly me.
>
> > Anyway the fieldNorm is the product of the lengthNorm and the
> > index-time boost of the field (which is itself the product of the
> > index time boost on the document and the index time boost of all
> > instances of that field). Index time boosts default to "1" though, so
> > they have no effect unless something has explicitly set a boost.
>
> I've just checked docs 7 and 1462 (resp. first and second in debug output
> below) with Luke. The title and content fields have no index time boosts,
> thus defaulting to 1.0f which is fine.
>
> Then, why does doc 7 have a fieldNorm of 0.0 on title (and so setting a 0.0
> score on the doc in the result set) and does doc 1462 have a very very
> small fieldNorm?
>
> debugOutput for doc 7:
> 0.0 = fieldNorm(field=title, doc=7)
>
> Luke on the title field of doc 7.
> <float name="boost">1.0</float>
>
> Thanks for your reply!
>
> > -Yonik
> > http://www.lucidimagination.com
> >
> > On Wed, Nov 3, 2010 at 2:30 PM, Markus Jelsma
> > <markus.jel...@openindex.io> wrote:
> > > Hi all,
> > >
> > > I've got some puzzling issue here. During tests i noticed a document at
> > > the bottom of the results where it should not be. I query using DisMax
> > > on title and content field and have a boost on title using qf. Out of
> > > 30 results, only two documents also have the term in the title.
> > >
> > > Using debugQuery and fl=*,score i quickly noticed large negative
> > > maxScore of the complete resultset and a portion of the resultset
> > > where scores sum up to zero because of a product with 0 (fieldNorm).
> > >
> > > See below for debug output for a result with score = 0:
> > >
> > > 0.0 = (MATCH) sum of:
> > > 0.0 = (MATCH) max of:
> > > 0.0 = (MATCH) weight(content:kunstgrasveld in 7), product of:
> > > 0.75658196 = queryWeight(content:kunstgrasveld), product of:
> > > 6.6516633 = idf(docFreq=33, maxDocs=9682)
> > > 0.113743275 = queryNorm
> > >
> > > 0.0 = (MATCH) fieldWeight(content:kunstgrasveld in 7), product of:
> > > 2.236068 = tf(termFreq(content:kunstgrasveld)=5)
> > > 6.6516633 = idf(docFreq=33, maxDocs=9682)
> > > 0.0 = fieldNorm(field=content, doc=7)
> > >
> > > 0.0 = (MATCH) fieldWeight(title:kunstgrasveld in 7), product of:
> > > 1.0 = tf(termFreq(title:kunstgrasveld)=1)
> > > 8.791729 = idf(docFreq=3, maxDocs=9682)
> > > 0.0 = fieldNorm(field=title, doc=7)
> > >
> > > And one with a negative score:
> > >
> > > 3.0716116E-4 = (MATCH) sum of:
> > > 3.0716116E-4 = (MATCH) max of:
> > > 3.0716116E-4 = (MATCH) weight(content:kunstgrasveld in 1462), product of:
> > > 0.75658196 = queryWeight(content:kunstgrasveld), product of:
> > > 6.6516633 = idf(docFreq=33, maxDocs=9682)
> > > 0.113743275 = queryNorm
> > >
> > > 4.059853E-4 = (MATCH) fieldWeight(content:kunstgrasveld in 1462), product of:
> > > 1.0 = tf(termFreq(content:kunstgrasveld)=1)
> > > 6.6516633 = idf(docFreq=33, maxDocs=9682)
> > > 6.1035156E-5 = fieldNorm(field=content, doc=1462)
> > >
> > > There are no funky issues with term analysis for the text fieldType, in
> > > fact, the term passes through unchanged. I don't do omitNorms, i store
> > > termVectors etc.
> > >
> > > Because fieldNorm = fieldBoost / sqrt(numTermsForField) i suspect my
> > > input from Nutch is messed up. A fieldNorm can never be =< 0 for a
> > > normal positive boost and field boosts should not be zero or negative
> > > (correct me if i'm wrong). But, since i can't yet figure out what field
> > > boosts Nutch sends to me i thought i'd drop by on this mailing list
> > > first.
> > >
> > > There are quite a few query terms that return with zero or negative
> > > scores and many that behave as i expect. I find it also a bit hard to
> > > comprehend why the docs with negative score rank higher in the result
> > > set than documents with zero score. Sorting defaults to score DESC,
> > > but this is perhaps another issue.
> > >
> > > Anyway, the test runs on a Solr 1.4.1 instance with Java 6 under the
> > > hood. Help or directions are appreciated =)
> > >
> > > Cheers,
> > >
> > > --
> > > Markus Jelsma - CTO - Openindex
> > > http://www.linkedin.com/in/markus17
> > > 050-8536600 / 06-50258350

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536600 / 06-50258350