Re: How to boost the score higher in case user query matches entire field value than just some words within a field

Sean Timm Thu, 21 Aug 2008 12:33:51 -0700

https://issues.apache.org/jira/browse/LUCENE-1360


Simon Hu wrote:

I am definitely interested in trying your Similarity class. Can you please
post the patch in jira?

thanks
-Simon
Sean Timm wrote:
In the example below, Doc1 and Doc2 will both have the same score for the query "chevrolet tahoe." We would prefer Doc2 to score higher than Doc1. The score length norm for each is also 0.5f. I presume which one appears first now falls to the order they were placed in the index? By using our score length norm function, Doc2's score will be multiplied by 1.0f and Doc1's by 0.875f, resulting in the desired behavior.
Doc1: Chevrolet Tahoe Hybrid 2008
Doc2: Chevrolet Tahoe 2008

-Sean

Mark Miller wrote:
Sean Timm wrote:
To solve this, we wrote our own Similarity class which extends DefaultSimilarity and maps numTerms 1-10 to precalculated values between 1.5f and 0.3125f. For numTerms >10, we use the standard formula above. If anyone else is interested in this, I can post the code as a patch in Jira.
Does this actually have a good measurable effect for you? Wouldn't it make more sense to just turn off norms for short fields?
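
For readers following the thread, here is a minimal sketch of the length-norm mapping Sean describes. It is not the code from the Jira patch. The class name, method name, and table are hypothetical: the thread only gives the endpoints (1.5f and 0.3125f) and the values for 3 and 4 terms (1.0f and 0.875f), so the other entries are an assumption chosen to be evenly representable steps between them. The fallback for numTerms > 10 assumes the "standard formula" is DefaultSimilarity's 1/sqrt(numTerms). The sketch deliberately avoids a Lucene dependency; in a real setup the same logic would presumably live in a DefaultSimilarity subclass that overrides lengthNorm(String fieldName, int numTerms).

```java
// Hypothetical stand-alone sketch; not the class attached to LUCENE-1360.
class ShortFieldLengthNorm {

    // Assumed table for numTerms 1..10. Only 1.5f (1 term), 1.0f (3 terms),
    // 0.875f (4 terms) and 0.3125f (10 terms) are taken from the thread;
    // the remaining entries are illustrative guesses.
    private static final float[] SHORT_FIELD_NORMS = {
        1.5f, 1.25f, 1.0f, 0.875f, 0.75f,
        0.625f, 0.5f, 0.4375f, 0.375f, 0.3125f
    };

    // Returns the table value for fields of 1..10 terms, otherwise falls
    // back to 1/sqrt(numTerms), assumed here to be the standard formula.
    static float lengthNorm(int numTerms) {
        if (numTerms >= 1 && numTerms <= SHORT_FIELD_NORMS.length) {
            return SHORT_FIELD_NORMS[numTerms - 1];
        }
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // Doc2: "Chevrolet Tahoe 2008" has 3 terms.
        System.out.println(lengthNorm(3));
        // Doc1: "Chevrolet Tahoe Hybrid 2008" has 4 terms.
        System.out.println(lengthNorm(4));
    }
}
```

With this table the two example documents no longer share a norm: the 3-term Doc2 gets 1.0f and the 4-term Doc1 gets 0.875f, so Doc2 ranks first for "chevrolet tahoe."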
