Re: dismax: limiting term match to one field

Jan Kurella Fri, 10 Dec 2010 00:23:33 -0800

On 09.12.2010 21:26, ext Chris Hostetter wrote:

: doc1 is name=A B category=B
: doc2 is name=A category=B
:
: when searching for the terms "A" and "B" I want doc2 to get a higher score.
: to be more specific, I don't want the term "B" to influence doc1's score in
: both <name> and <category>, only in one of them.


if you set the boost value of category to something very high, and set
tie=0 you should get the exact behavior you describe.

with tie=0, each clause (ie: "B") will only get a score contribution from
the highest scoring field -- if the qf boost value for category is
significantly higher than the boost value for "name" this should work
fine.
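
A minimal sketch of that per-clause combination, assuming the usual
disjunction-max rule (best field score plus tie times the sum of the other
field scores). The function name and the per-field scores are made up for
illustration; real scores come out of Lucene and already include the qf
boosts.

```python
def dismax_clause_score(field_scores, tie=0.0):
    """Combine one term's per-field scores: max + tie * (sum of the rest)."""
    if not field_scores:
        return 0.0
    best = max(field_scores.values())
    others = sum(field_scores.values()) - best
    return best + tie * others


# Hypothetical scores for the term "B" (category boosted far above name).
doc1_b = {"name": 1.0, "category": 10.0}   # "B" matches in both fields
doc2_b = {"category": 10.0}                # "B" matches in category only

# tie=0: only the best field counts, so the extra name match adds nothing.
print(dismax_clause_score(doc1_b, tie=0.0), dismax_clause_score(doc2_b, tie=0.0))
# tie=0.5: the weaker field leaks back into doc1's score.
print(dismax_clause_score(doc1_b, tie=0.5), dismax_clause_score(doc2_b, tie=0.5))
```

With tie=0 both documents get the same contribution from "B", so the
ordering is left to the other clause ("A").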

this is one of the prime use cases for dismax: a "category" or "doc_type"
field that has a very small finite set of values in it which frequently
doesn't match anything users type, but you configure it with a high boost
value so when it *does* match something the user types, it causes
documents in that category (or having that document_type) to dominate over
other documents.

-Hoss

Yup, you actually formulated the main use case for the dismax query handler ;)

What you are probably stumbling over is, going by your example from the beginning, the basic scoring and therefore the weights to set.

 query: qf=name^5 category&q=pulp fiction&mm=2

Given a category field, I assume that the number of distinct tokens in it is rather small compared to your title field, so each term matches many documents and its idf is low. The idf for a token in the title field is probably rather high. Thus, by default, a hit in the title would score higher. Boosting the category field would be the easy solution.

Second, you might have more than one category word in the category field? If so, the field length normalization would also rate a hit there down. I think it could help to deactivate norms for this field (omitNorms=true in the field type configuration).

If this is not enough, you can go into the Similarity and change the implementation depending on the field given:


search time: idfExplain(...)/idf()
index time: lengthNorm(...)/computeNorm(...)
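
To make those two hooks a bit more concrete, here is a rough Python sketch of
the formulas that, as far as I know, Lucene's DefaultSimilarity used at the
time: idf = 1 + ln(numDocs / (docFreq + 1)) and lengthNorm = 1 / sqrt(numTerms).
The function names, the flat_fields parameter and the document counts are
assumptions for illustration only; a real change would be a Java Similarity
subclass registered in the schema.

```python
import math


def idf(doc_freq, num_docs):
    """Classic idf: terms that occur in many documents get a low value."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def length_norm(field, num_terms, flat_fields=("category",)):
    """Classic length norm, switched off for the fields in flat_fields."""
    if field in flat_fields:
        return 1.0  # similar in effect to omitNorms=true for this field
    return 1.0 / math.sqrt(num_terms)


# Hypothetical index of 1000 docs: a category term is in 200 of them,
# a title term in only 5, so the title term gets the higher idf.
print(idf(200, 1000), idf(5, 1000))

# A four-token field is normalized down unless it is listed in flat_fields.
print(length_norm("name", 4), length_norm("category", 4))
```

This only illustrates why a category hit tends to score lower by default;
it leaves out the boosts and the lossy byte encoding Lucene applies to norms.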
