: All I found was:
: http://search.lucidimagination.com/search/document/9d06882d97db5c59/a_question_about_solr_score
:
: where Hoss suggests normalizing depending on the maxScore.
to be clear, i do not (nor have i ever) suggested that someone normalize
based on maxScore. my point there was that when people *insist* on
providing some sort of normalization, the maxScore is always available if
they want to use it.

: I am not comfortable with that since, at the least, I want a search for
: "the wombats" in a directory of mathematical concepts to show that
: all scores are pretty bad, and not display 1.0 for matches that are
: only on the word "the".

the crux of the problem is in deciding what you want to normalize
relative to -- the "ideal" solution is to normalize relative to the
maximum *possible* score for *any* query against your corpus, but that's
not something that's generally feasible to do (and based on experiments i
tried once, it didn't seem like it would be very useful anyway).

: It seems that the strategy would be to normalize by maxScore if the
: maxScore is bigger than 1.0.
: Can you confirm that?
: Aren't there going to be similar edge cases as above?
:
: I remember a time when Lucene results' scores were always normalized.
: That doesn't seem to be the case in Solr, or is it?

once upon a time, lucene's most "beginner friendly" api did provide
normalized scores, using the approach you described (divide by max score
if max score is greater than 1.0) and it had all of the problems you
might expect -- but some people liked it because they had an irrational
dislike for scores greater than 1. Solr has never supported those
pseudo-normalized scores, and lucene's java API eventually got rid of
them.
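for concreteness, here is a minimal sketch of that pseudo-normalization
(divide every score by maxScore, but only when maxScore is greater than
1.0). the PseudoNormalize class and its hard-coded scores are purely
illustrative -- this is not code from Lucene or Solr:

    import java.util.Arrays;

    /**
     * Sketch of the "pseudo-normalization" discussed above: divide every
     * score by maxScore, but only when maxScore is greater than 1.0.
     */
    public class PseudoNormalize {

        /** Returns a copy of scores, divided by the max iff the max > 1.0. */
        static float[] normalize(float[] scores) {
            float max = 0f;
            for (float s : scores) {
                max = Math.max(max, s);
            }
            float[] out = scores.clone();
            if (max > 1.0f) {               // the "only if bigger than 1.0" rule
                for (int i = 0; i < out.length; i++) {
                    out[i] /= max;
                }
            }
            return out;
        }

        public static void main(String[] args) {
            // strong matches: the top hit is mapped to exactly 1.0
            float[] good = {8.4f, 6.1f, 2.3f};
            // uniformly weak matches (e.g. only "the" matched): max <= 1.0,
            // so scores pass through unchanged -- but had the max been, say,
            // 1.01, the top hit would still have been inflated to 1.0
            float[] weak = {0.07f, 0.05f, 0.01f};

            System.out.println(Arrays.toString(normalize(good)));  // [1.0, 0.726..., 0.273...]
            System.out.println(Arrays.toString(normalize(weak)));  // [0.07, 0.05, 0.01]
        }
    }

note how any result set whose best hit scores just above 1.0 gets
stretched so that its top match reads as a "perfect" 1.0 -- which is
exactly the "the wombats" problem described above.

-Hoss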