: Interesting. Even though omitTf=true would give strict enforcement,
: wouldn't it affect the relevancy? Like, I am wondering if the ordering
: amongst the three word matches would be not as good as it would be when we
: have omitNorms=true&omitTf=true. Do you have an idea?

It will *absolutely* affect the ranking ... that's the entire point.

If the complaint is "docA containing only two of the clauses 
scores higher than docB matching all 3 clauses", the reason for that is 
(usually) that the tf/idf scoring for docA is a *REALLY* good match for 
those two clauses (ie: they occur many, many times), whereas docB might 
match all three but only match each of them once.  You can't 
guarantee a strict ordering based on the number of clauses that match 
unless you eliminate term freq and norms from the equation.
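
To make that concrete, here is a rough sketch of the arithmetic. It is not Lucene's actual scoring code: it assumes a simplified per-clause score of sqrt(tf) * idf, a coord of matched/total, and no norms, boosts, or queryNorm. The class name, the term frequencies, and the `omitTf` flag (which just treats every tf as 1, standing in for the omitTf=true setting discussed in this thread) are all made up for illustration.

```java
// Simplified illustration only; NOT Lucene's real scoring formula.
final class ClauseScoreDemo {

    // tfs[i] is the term frequency of query clause i in the doc (0 = no match).
    // When omitTf is true, every matching clause is treated as tf == 1.
    static double docScore(int[] tfs, double idf, boolean omitTf) {
        int matched = 0;
        double sum = 0.0;
        for (int tf : tfs) {
            if (tf > 0) {
                matched++;
                sum += Math.sqrt(omitTf ? 1 : tf) * idf;
            }
        }
        // coord: fraction of the query clauses that matched
        double coord = (double) matched / tfs.length;
        return coord * sum;
    }

    public static void main(String[] args) {
        int[] docA = {25, 25, 0}; // two clauses, each occurring many times
        int[] docB = {1, 1, 1};   // all three clauses, once each

        System.out.println("with tf:    A=" + docScore(docA, 1.0, false)
                + " B=" + docScore(docB, 1.0, false));
        System.out.println("without tf: A=" + docScore(docA, 1.0, true)
                + " B=" + docScore(docB, 1.0, true));
    }
}
```

With these made-up numbers docA outscores docB while tf is in play, and the order flips once tf is taken out, which is the effect described above.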

That said, I realize now that I forgot to finish my previous message with 
the "However..." comment...

However... if you still want the tf/idf and length norm to be a factor, 
but you just want to make the "penalty" for not matching all terms 
much higher (which doesn't guarantee a strict ordering, but biases things 
so much that it's unlikely to ever be a factor), you could also play around 
with a custom implementation of the coord factor in the Similarity...

http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/Similarity.html#coord%28int,%20int%29
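
As a rough sketch of what a steeper coord could look like: the default coord is overlap/maxOverlap, and raising that fraction to a power makes partial matches much more expensive. The code below is plain Java so it stands alone; the class name, the method names, and the exponent of 8 are hypothetical choices, not anything from Lucene. In a real setup the same arithmetic would presumably go into an override of coord(int overlap, int maxOverlap) in a DefaultSimilarity subclass, and Solr would need to be configured to use that class.

```java
// Hypothetical sketch: the arithmetic for a steeper coord penalty.
final class SteepCoordSketch {

    // Assumed tuning knob; larger values punish partial matches harder.
    static final int EXPONENT = 8;

    // The default behavior: fraction of query clauses that matched.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Steeper variant: the same fraction raised to EXPONENT.
    // A full match still gets 1.0; 2 of 3 drops to roughly 0.039.
    static float steepCoord(int overlap, int maxOverlap) {
        return (float) Math.pow(overlap / (double) maxOverlap, EXPONENT);
    }

    public static void main(String[] args) {
        // Made-up pre-coord scores: docA matches 2 of 3 clauses very
        // strongly, docB matches all 3 clauses weakly.
        float rawA = 10.0f;
        float rawB = 3.0f;

        System.out.println("default: A=" + rawA * defaultCoord(2, 3)
                + " B=" + rawB * defaultCoord(3, 3));
        System.out.println("steep:   A=" + rawA * steepCoord(2, 3)
                + " B=" + rawB * steepCoord(3, 3));
    }
}
```

This still isn't a guarantee: a doc whose pre-coord score is large enough can overcome any fixed exponent, it just becomes very unlikely.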

: > : I want to have a strict enforcement that in case of a 3 word search,
: > : those results that match all 3 terms should be presented ahead of
: > : those that match 2 terms when I set mm=2.
: > :
: > : I have seen quite some cases where those results that match 2 out of 3
: > : words appear ahead of those matching all 3 words.
: >
: > which can happen because of tf/idf and length normalization.
: >
: > if you disable all of those things for the fields you
: > search on (omitNorms=true omitTf=true) you should see a strict ordering
: > based on the number of matching clauses.


-Hoss
