Thanks. I'm inclined not to bother with stopwords at all: in my case I've got
a fairly small dataset, and leaving them in doesn't seem to have a noticeable
effect on performance.

On Thursday 12 February 2009 15:41:05 Jeff Newburn wrote:
> Unfortunately, the stopword filter acts funny (depending on who you ask) in
> dismax.  The short version is that the stopword filter has to be on all
> fields being queried for minimum match (mm) to work.  We have the same
> issue with one of our brands.  We require all words to match, so "The North
> Face" would never return any results: "the" is filtered out, but mm
> still requires all 3 words to match.
>
> So basically all fields MUST be put through the stopword filter for dismax
> to not have the issue.
>
> The long version of the answer can be found here:
> http://www.nabble.com/Dismax-Minimum-Match-Stopwords-Bug-td20960507.html
>
>
> On 2/11/09 8:49 AM, "Steven Hentschel" <steven.hentsc...@googlemail.com>
> wrote:
> > If a naive user enters a string that contains typical stopwords like
> > "and" and "the", these seem to be included in the word count for the must
> > match criteria of the dismax query.
> >
> > So, if for example the mm parameter is the default "2<-1 5<-2
> > 6<90%" and the user enters something like "Jason and the Argonauts",
> > this won't match a document with that title because the word count is
> > treated as 4 and only 2 words match. As the dismax query is
> > recommended for naive users, wouldn't it be more logical to apply the
> > mm criteria after applying the stopword filter on the query?
> >
> > Steven H.
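
For anyone following the arithmetic in the thread, here is a rough sketch
in Python of how a conditional mm spec turns a clause count into a required
match count. It is not Solr's code: the function names, the stopword list
and the tiny "index" are all made up for illustration, and the spec is
written with "<", which is how Solr documents mm conditions.

```python
def min_should_match(clause_count, spec):
    """Sketch of the mm rule: how many of clause_count optional clauses
    must match under spec. Hypothetical helper, not Solr's implementation."""
    spec = spec.strip()
    if "<" in spec:
        # Conditional spec, e.g. "2<-1 5<-2 6<90%".
        # At or below the first threshold, every clause is required.
        required = clause_count
        for part in spec.split():
            threshold, sub_spec = part.split("<", 1)
            if clause_count <= int(threshold):
                break
            required = min_should_match(clause_count, sub_spec)
        return required
    if spec.endswith("%"):
        # Percentages are rounded toward zero; negative means "all but".
        amount = int(clause_count * int(spec[:-1]) / 100)
        required = clause_count + amount if amount < 0 else amount
    else:
        value = int(spec)
        required = clause_count + value if value < 0 else value
    return max(0, min(clause_count, required))


STOPWORDS = {"and", "the"}  # illustrative list only
DEFAULT_MM = "2<-1 5<-2 6<90%"


def is_match(query_terms, indexed_terms, mm=DEFAULT_MM):
    """True if enough query terms are present in the indexed field."""
    required = min_should_match(len(query_terms), mm)
    matched = sum(1 for term in query_terms if term in indexed_terms)
    return matched >= required


title = "jason and the argonauts".split()
# The field is indexed through a stopword filter, so "and" and "the"
# never reach the index.
indexed = {term for term in title if term not in STOPWORDS}

# Query side without stopword removal: 4 clauses, 3 required, 2 can match.
raw_query = "Jason and the Argonauts".lower().split()
print(min_should_match(len(raw_query), DEFAULT_MM))  # 3
print(is_match(raw_query, indexed))                  # False

# Query side with the same stopword removal: 2 clauses, 2 required.
filtered_query = [t for t in raw_query if t not in STOPWORDS]
print(is_match(filtered_query, indexed))             # True
```

The same function gives 3 for a three-word query under mm=100%, which is
the "The North Face" case: the indexed field only holds two of the words, so
the query can never satisfy mm unless "the" is dropped from the query too.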
