On 10/16/06, bo_b <[EMAIL PROTECTED]> wrote:
I was wondering if it was possible to set a minimum score/relevance for search results? And how is the score calculated anyway?
http://lucene.apache.org/java/docs/scoring.html
Making an arbitrary cutoff mean something would be quite difficult.
I thought I read somewhere that Lucene scores were normalized between 0 and 1, but that doesn't seem to be the case for Solr?
Solr never normalizes scores, since the client can easily do it: maxScore is given in the results, so just divide all scores by maxScore. If Solr normalized scores, information would be thrown away and clients wouldn't be able to un-normalize if needed.
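To illustrate the division by maxScore, here is a minimal client-side sketch. It assumes the result section has already been parsed into a dict with a "maxScore" value and a "docs" list whose entries each carry a "score" field; that shape and the function name normalize_scores are assumptions made for the example, not something Solr provides.

```python
def normalize_scores(response):
    """Return copies of the docs with scores divided by maxScore.

    `response` is assumed to look like:
        {"maxScore": 2.0, "docs": [{"id": "a", "score": 2.0}, ...]}
    The input is left unmodified so the raw scores stay available.
    """
    max_score = response.get("maxScore") or 0.0
    docs = response.get("docs", [])
    if max_score <= 0:
        # Nothing sensible to divide by; report every score as 0.0.
        return [dict(doc, score=0.0) for doc in docs]
    return [dict(doc, score=doc["score"] / max_score) for doc in docs]


if __name__ == "__main__":
    parsed = {
        "maxScore": 2.0,
        "docs": [{"id": "a", "score": 2.0}, {"id": "b", "score": 0.5}],
    }
    for doc in normalize_scores(parsed):
        print(doc["id"], doc["score"])
```

Note that the normalized values are only comparable within one result set, since maxScore differs from query to query.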
In our case we have indexed a vBulletin database with 7 million posts. On one of our search pages, we would like a sidebar with a link to our vBulletin search that says "Found xxxx extra results in vbulletin". But searches in the vBulletin database return an awful lot of hits (100,000+ for some queries), even though perhaps only the first handful seem relevant. So ideally we would like the link to say "Found 12 extra results in vbulletin" if the first 12 results had a high score and results 13 to 100,000 had a low score.
You could try to analyze the scores yourself and see if there is a natural "break". -Yonik
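One way to look for such a break, sketched below, is to take the scores in descending order and find the largest relative drop between neighbours. This is only a heuristic for illustration: the function name count_before_break and the min_drop threshold of 0.3 are arbitrary assumptions, and whether a meaningful break exists at all depends on the query and the data.

```python
def count_before_break(scores, min_drop=0.3):
    """Count the results that come before the largest relative score drop.

    `scores` must be sorted in descending order. A drop is measured as
    (previous - current) / previous. If no drop exceeds `min_drop`,
    all results are counted, i.e. no natural break was found.
    """
    best_index = len(scores)
    best_drop = min_drop
    for i in range(1, len(scores)):
        prev, cur = scores[i - 1], scores[i]
        if prev <= 0:
            continue  # cannot compute a relative drop from a non-positive score
        drop = (prev - cur) / prev
        if drop > best_drop:
            best_index, best_drop = i, drop
    return best_index


if __name__ == "__main__":
    example = [4.1, 3.9, 3.8, 1.2, 1.1, 1.0]  # made-up scores
    print("results before the break:", count_before_break(example))
```

In practice this would only be run over the top few hundred scores, and the threshold would need tuning against real queries.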