: Hmmm, tricky.  I think you've uncovered an algorithmic flaw in DisMax.

I would call it a deficiency, not a flaw :)

: more restrictive than the first query.  It appears that dismax is a
: bit broken when some of the fields have stopwords and some don't.
: Offhand, I don't see an easy fix for this problem.

you are correct, this has been discussed in the past...

http://www.nabble.com/Making-stop-words-optional-with-DisMax--to16307924.html#a16307924
http://www.nabble.com/DisMax-request-handler-doesn%27t-work-with-stopwords--to11015905.html#a11015905

It's not a bug in the implementation, it's a side effect of the basic 
tenet of how dismax works: it inverts the input and creates a 
DisjunctionMaxQuery for each "word" in the input, so any word that is valid 
in at least one of the "qf" fields generates a "should" clause that 
contributes to the MM count.  When MM requires every clause, a stopword 
that only some fields keep must then match in one of those fields.
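
To make that concrete, here is a toy model of the behavior in plain 
Python.  It is only a sketch, not Solr code: the field names, the 
stopword list, and the helpers (analyze, build_clauses, matches) are all 
made up for illustration, and scoring is ignored entirely.

```python
# Toy model of dismax clause building; nothing here is a Solr API.
STOPWORDS = {"the", "of", "a"}

# Hypothetical schema: True means the field's analyzer strips stopwords.
STRIPS_STOPWORDS = {"title": True, "author": False}


def analyze(field, text):
    """Lowercase, split on whitespace, drop stopwords where configured."""
    tokens = text.lower().split()
    if STRIPS_STOPWORDS[field]:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def build_clauses(query, qf):
    """One 'should' clause per word that survives analysis in any qf field.

    Each clause is a list of (field, term) alternatives, standing in for
    a DisjunctionMaxQuery.
    """
    clauses = []
    for word in query.lower().split():
        alternatives = [(f, word) for f in qf if analyze(f, word)]
        if alternatives:
            clauses.append(alternatives)
    return clauses


def matches(doc, clauses, mm):
    """True if at least mm clauses have an alternative matching the doc."""
    indexed = {f: set(analyze(f, text)) for f, text in doc.items()}
    hits = sum(
        any(term in indexed.get(f, set()) for f, term in alts)
        for alts in clauses
    )
    return hits >= mm


doc = {"title": "The President", "author": "Jane Doe"}

# With only the stopword-stripping field, "the" produces no clause.
only_title = build_clauses("the president", ["title"])
print(matches(doc, only_title, mm=len(only_title)))  # True

# Adding a field that keeps stopwords adds a clause for "the" that can
# only be satisfied by that field, so requiring all clauses now fails.
both = build_clauses("the president", ["title", "author"])
print(matches(doc, both, mm=len(both)))  # False
```

Adding a field to "qf" makes the query stricter here, which is the 
surprising part: the extra clause for "the" counts toward MM even though 
the title field never indexed that word.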

One idea that was suggested at one point (possibly by me) is to make "qf" 
a multivalued param and use each value to construct a separate boolean 
query, then wrap those in another boolean (or dismax) query so that only 
one is required ... then you could put all of your fields w/o stopwords in 
one qf with a bunch of big boosts, and all of your fields w/stopwords in 
another qf with smaller boosts.
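
A rough sketch of how the matching side of that might behave, again as a 
toy model in plain Python rather than anything that exists in Solr.  The 
field names, stopword list, and helper functions are hypothetical, each 
group is assumed to require all of its surviving words, and boosts are 
left out completely.

```python
# Toy model of the multi-"qf" idea; nothing here is a Solr API.
STOPWORDS = {"the", "of", "a"}

# Hypothetical schema: True means the field's analyzer strips stopwords.
STRIPS_STOPWORDS = {"title": True, "author": False}


def terms(field, text):
    """Lowercase, split on whitespace, drop stopwords where configured."""
    words = text.lower().split()
    if STRIPS_STOPWORDS[field]:
        words = [w for w in words if w not in STOPWORDS]
    return words


def group_matches(doc, query, qf):
    """One boolean query per qf group; every surviving word is required."""
    required = [w for w in query.lower().split()
                if any(terms(f, w) for f in qf)]
    return bool(required) and all(
        any(w in terms(f, doc.get(f, "")) for f in qf) for w in required
    )


def multi_qf_matches(doc, query, qf_groups):
    """Outer query: the document matches if any one group matches."""
    return any(group_matches(doc, query, qf) for qf in qf_groups)


doc = {"title": "The President", "author": "Jane Doe"}

# All fields in a single qf: the clause for "the" cannot be satisfied.
print(multi_qf_matches(doc, "the president", [["title", "author"]]))  # False

# Fields split into two qf groups by stopword handling: the title group
# never sees "the", so it matches on "president" alone.
print(multi_qf_matches(doc, "the president", [["title"], ["author"]]))  # True
```

This only shows which documents would match; whether the scores come out 
sensibly once boosts are involved is a separate question.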

But i don't think anyone ever submitted a patch, and i haven't thought it 
through enough to be confident it would work out well.


-Hoss
