Re: How to boost the score higher in case user query matches entire field value than just some words within a field

Sean Timm Thu, 21 Aug 2008 06:31:21 -0700

Length normalization in the Similarity class will generally favor shorter fields. For example, with the DefaultSimilarity, the length norm for a two-term field is 0.625. For a three-term field it is 0.5. The score is multiplied by the norm.

I say "generally will favor" because the length norm value, which is calculated as

   (float)(1.0 / Math.sqrt(numTerms))

is stored in the index as a single byte (instead of four bytes), thus losing precision. This works fine for searching larger documents such as web pages or news articles, but it can cause some problems when you are simply searching on short fields such as product names or article titles.
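The precision loss can be reproduced with a small standalone sketch. It assumes the length norm is 1/sqrt(numTerms) and that the single-byte encoding keeps 3 mantissa bits and 5 exponent bits, modeled on Lucene's SmallFloat.floatToByte315. The class and method names (NormPrecisionDemo, encodeNorm, decodeNorm, storedNorm) are made up for illustration and are not part of Lucene.

```java
// Standalone illustration; no Lucene dependency.
// Assumption: the byte encoding mirrors SmallFloat.floatToByte315
// (3 mantissa bits, 5 exponent bits, zero exponent point 15).
final class NormPrecisionDemo {

    // Squeeze a float into one byte by truncating the mantissa to 3 bits.
    static byte encodeNorm(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        if (small <= ((63 - 15) << 3)) {
            // Too small to represent: zero/negative -> 0, tiny positive -> 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ((63 - 15) << 3) + 0x100) {
            return -1; // Too large: saturate at the biggest representable value.
        }
        return (byte) (small - ((63 - 15) << 3));
    }

    // Expand the stored byte back into a float.
    static float decodeNorm(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Assumed DefaultSimilarity formula.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // The value a search would actually see after the index round trip.
    static float storedNorm(int numTerms) {
        return decodeNorm(encodeNorm(lengthNorm(numTerms)));
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + " terms: exact=" + lengthNorm(n)
                    + " stored=" + storedNorm(n));
        }
    }
}
```

Under these assumptions a two-term field is stored as 0.625 and a three-term field as 0.5, matching the values quoted above. A four-term field is also stored as 0.5, so three-term and four-term fields become indistinguishable by length, which is the kind of collision that hurts on short fields.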

To solve this, we wrote our own Similarity class which extends DefaultSimilarity and maps numTerms 1-10 to precalculated values between 1.5f and 0.3125f. For numTerms > 10, we use the standard formula above. If anyone else is interested in this, I can post the code as a patch in Jira.
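The class itself is not included in this message. The following is only a hypothetical sketch of the idea, kept free of any Lucene dependency so it can be run on its own. The table values are an assumption: they are the ten consecutive values between 1.5f and 0.3125f that survive a 3-bit-mantissa byte encoding unchanged, which may or may not be the values actually used. The name ShortFieldLengthNorm is made up, and the fallback assumes the standard formula is 1/sqrt(numTerms).

```java
// Hypothetical sketch; not the code referred to in the message.
final class ShortFieldLengthNorm {

    // Assumed table for numTerms 1..10 (index 0 holds the value for 1 term).
    // Each value is exactly representable with a 3-bit mantissa, so each
    // short field length keeps a distinct norm after being stored as a byte.
    private static final float[] NORMS = {
        1.5f, 1.25f, 1.0f, 0.875f, 0.75f,
        0.625f, 0.5f, 0.4375f, 0.375f, 0.3125f
    };

    static float lengthNorm(int numTerms) {
        if (numTerms >= 1 && numTerms <= NORMS.length) {
            return NORMS[numTerms - 1];
        }
        // Longer fields fall back to the assumed standard formula.
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 12; n++) {
            System.out.println(n + " terms -> " + lengthNorm(n));
        }
    }
}
```

In a real setup the same lookup would presumably live in a subclass of DefaultSimilarity that overrides lengthNorm(String fieldName, int numTerms). Because norms are written at index time, documents would need to be reindexed for new values to take effect.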


-Sean

Simon Hu wrote:

Hi

I have a text field named prodname in the Solr index. Let's say there are 3
documents in the index, and here are the values of the prodname field:

Doc1: cordless drill
Doc2: cordless drill battery
Doc3: cordless drill charger

Searching for prodname:"cordless drill" will hit all three documents. So
how can I make Doc1 score higher than the other two?

BTW, I am using Solr 1.2.

Thanks!
-Simon
