Hi,

I don't understand why the scorer sums the weights of the clauses in an OR group. It seems to me that this skews the score toward whichever query term has more alternatives. To me it would make more sense to take the max of the weights of a term's alternatives.

Here is an example:
In the Solr admin interface I ran: gucci (handbag OR purse OR pocketbook)

With debug enabled I can see that the parsed query is as expected:
"parsedquery":"text:gucci (text:handbag text:purse text:pocketbook)"

The explain field shows that the scorer computes (simplifying a bit):
weight(gucci) + sum(weight(handbag), weight(purse), weight(pocketbook))

The consequence is that a result containing handbag, purse and pocketbook is going to score higher than a result containing gucci and handbag. I think this is counter-intuitive. To me the OR means those terms are equivalent, not that they are more important. Besides, if I wanted some terms to count more, I could use query term boosting to do that independently.
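To make the arithmetic concrete, here is a small Python sketch comparing the sum behaviour I see with the max behaviour I would expect. The weights are made up for illustration, and the sketch ignores everything else the real scorer does (normalization, coord, term frequency), so it is only a toy model of the explain output, not Lucene's actual formula.

```python
# Hypothetical per-term weights; real values come from the index statistics.
WEIGHTS = {"gucci": 1.5, "handbag": 1.0, "purse": 1.0, "pocketbook": 1.0}
ALTERNATIVES = ("handbag", "purse", "pocketbook")


def matched_weights(doc_terms, terms):
    """Weights of the query terms that actually occur in the document."""
    return [WEIGHTS[t] for t in terms if t in doc_terms]


def score_sum(doc_terms):
    """Observed behaviour: weights inside the OR group are added up."""
    brand = sum(matched_weights(doc_terms, ("gucci",)))
    group = sum(matched_weights(doc_terms, ALTERNATIVES))
    return brand + group


def score_max(doc_terms):
    """Expected behaviour: the OR group contributes only its best match."""
    brand = sum(matched_weights(doc_terms, ("gucci",)))
    group = max(matched_weights(doc_terms, ALTERNATIVES), default=0.0)
    return brand + group


doc_a = {"gucci", "handbag"}                # matches the brand and one alternative
doc_b = {"handbag", "purse", "pocketbook"}  # matches all alternatives, no brand

print(score_sum(doc_a), score_sum(doc_b))   # sum: doc_b outranks doc_a
print(score_max(doc_a), score_max(doc_b))   # max: doc_a outranks doc_b
```

With these assumed weights, summing ranks the document without gucci first, while taking the max ranks the gucci handbag first, which is the ordering I would expect.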

I experimented with edismax and it shows similar behaviour.

My questions are: am I missing something? Is there a way to write an OR clause that preserves the relative "importance" of the query terms? (Note that playing with mm in edismax does not solve the issue.)

Thanks!
