Re: hl.requireFieldMatch and idf

Koji Sekiguchi Sat, 29 Mar 2008 08:02:01 -0700

Mike,

Thank you for your response.

cause:
If hl.requireFieldMatch is set to true, DefaultSolrHighlight.getQueryScorer()
uses the QueryScorer(Query,IndexReader,String) constructor in the Lucene
highlighter.
The constructor then calls getIdfWeightedTerms() to get an array of
WeightedTerm.
In getIdfWeightedTerms(), idf is calculated to produce the weighted terms.
The calculated idf can be negative with an un-optimized index.
Okay, _this_ is the true bug. I don't see how Lucene can return a negative idf, optimized index or no.

I think that docFreq includes the count of deleted docs, and that this is a Lucene feature.

This feature causes a negative idf as long as the following formula is used:

// o.a.l.s.highlight.QueryTermExtractor.java
float idf=(float)(Math.log((float)totalNumDocs/(double)(docFreq+1)) + 1.0);
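
To illustrate how the formula above can go negative, here is a small standalone sketch. The class name IdfSketch and the document counts are made up for illustration, and it assumes that totalNumDocs excludes deleted documents while docFreq still counts them; it is not the Lucene code itself.

```java
public class IdfSketch {
    // Same expression as the line quoted from QueryTermExtractor.
    static float idf(int totalNumDocs, int docFreq) {
        return (float) (Math.log((float) totalNumDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        // Hypothetical un-optimized index: 10 live docs, but the term is
        // still counted in 100 docs because deleted docs are not yet purged.
        System.out.println("un-optimized: " + idf(10, 100));

        // Hypothetical optimized index: docFreq can no longer exceed the
        // number of live docs, so the ratio stays close to or above 1.
        System.out.println("optimized:    " + idf(100, 100));
    }
}
```

The result drops below zero once docFreq+1 is more than about e (roughly 2.72) times totalNumDocs, which is only possible when docFreq counts documents that totalNumDocs does not.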

Does DefaultSolrHighlight.getQueryScorer() use
QueryScorer(Query,IndexReader,String)
by design? If not, I'm happy to open a ticket.
Indeed it is by design: this is how requireFieldMatch is implemented, as the Lucene highlighter will require the field to match as well as the term. A consequence of this is that the idfs are also folded into the score, which is triggering the bug you are seeing.

Can we use QueryScorer(Query,String) instead of
QueryScorer(Query,IndexReader,String) to implement
hl.requireFieldMatch=true? I've opened SOLR-517 to follow up on this problem.

Thank you,

Koji
