Hi All

We've got an index in which we have a multiValued field per document.

Assume the values of the multiValued field in each document are:

Doc1:
bar lifters

Doc2:
truck tires
back drops
bar lifters

Doc3:
iron bar lifters

Doc4:
brass bar lifters
iron bar lifters
tire something
truck something
oil gas

Now when we search for 'bar lifters', the expectation (based on the
requirements) is that we get results in the order Doc1, Doc2, Doc4,
Doc3:
Doc1 - since there's an exact match (and it is the only value) for the search terms
Doc2 - since there's an exact match amongst the values
Doc4 - since there's only a partial match on the values, but the number of
matches is greater than in Doc3
Doc3 - since there's only a partial match

However, the results come out as Doc1, Doc3, Doc2, Doc4. Looking at the
explanation of the result, it appears that Doc2 and Doc4 are both losing to
Doc3 because of length normalisation.

We think we can see the reason for that: the field length in Doc2 is
greater than in Doc3, and the field length in Doc4 is greater than in Doc3
as well.
However, is there any mechanism by which we can force Doc2 and Doc4 to
beat Doc3 with this structure?
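
To illustrate what we think is happening, here is a rough Python sketch of the scoring as we understand it. It is not Lucene's actual code. It assumes the classic TF-IDF similarity (tf = sqrt(freq), length norm = 1/sqrt(number of terms)), plain whitespace tokenisation, and that all values of the multiValued field count towards one field length. It ignores idf, queryNorm and coord, which should be the same for every document here, and it does not model the rounding applied to stored norms. The name simplified_score is made up for this example.

```python
import math

# Values of the multiValued field per document, as listed above.
docs = {
    "Doc1": ["bar lifters"],
    "Doc2": ["truck tires", "back drops", "bar lifters"],
    "Doc3": ["iron bar lifters"],
    "Doc4": ["brass bar lifters", "iron bar lifters", "tire something",
             "truck something", "oil gas"],
}
query = ["bar", "lifters"]


def simplified_score(values, query_terms):
    """Sum of sqrt(tf) over the query terms, times 1/sqrt(field length).

    Hypothetical helper: an approximation of classic TF-IDF scoring,
    with the per-query constants (idf, queryNorm, coord) left out.
    """
    # All values are concatenated, so the field length is the total token count.
    tokens = [tok for value in values for tok in value.split()]
    tf_part = sum(math.sqrt(tokens.count(term)) for term in query_terms)
    length_norm = 1.0 / math.sqrt(len(tokens))
    return tf_part * length_norm


scores = {name: simplified_score(values, query) for name, values in docs.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 3))
```

Under these assumptions Doc1 scores highest, Doc3 comes second, and Doc2 and Doc4 end up roughly level below it: Doc2's six tokens and Doc4's twelve tokens shrink the norm by more than Doc4's extra term matches can make up.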

We did look at using omitNorms=true, but that messes up the scores for all
docs. The result comes out as Doc4, Doc1, Doc2, Doc3 (where Doc1, Doc2 and
Doc3 get the same score).
This is because the fieldNorm is no longer taken into account (as
expected), leaving the term frequency as the only contributing factor. So
trying to avoid length normalisation through omitNorms is not helping.
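
The same kind of rough sketch shows why omitNorms gives that ordering. Again this is only our reading of the classic TF-IDF formula, not Lucene code: with the norm dropped, each matching query term contributes sqrt(tf), and idf, queryNorm and coord are left out because they should be the same for every document. The term counts below are taken from the example values above.

```python
import math

# Occurrences of each query term ('bar', 'lifters') in each document's field.
term_freqs = {
    "Doc1": {"bar": 1, "lifters": 1},
    "Doc2": {"bar": 1, "lifters": 1},
    "Doc3": {"bar": 1, "lifters": 1},
    "Doc4": {"bar": 2, "lifters": 2},
}

# Without norms, only sqrt(tf) per matching term is left in this model.
no_norm_scores = {
    name: sum(math.sqrt(tf) for tf in freqs.values())
    for name, freqs in term_freqs.items()
}

for name, score in no_norm_scores.items():
    print(name, round(score, 3))
```

Doc1, Doc2 and Doc3 each contain both terms once and so tie, while Doc4 wins purely because each term occurs twice in it.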

Is there any way we can make an exact match on a value in a
multiValued field add to the overall score whilst keeping the length
normalisation?

Hope that makes sense.

Cheers
-- Imran
