Thanks. I'm inclined not to bother with stopwords at all: my dataset is fairly small, and leaving them in doesn't seem to have a noticeable effect on performance.

On Thursday 12 February 2009 15:41:05 Jeff Newburn wrote:
> Unfortunately, the stopword filter acts funny (depending on who you ask) in
> dismax. The short version is that the stopwords filter has to be on all
> fields being queried on for minimum matches to work. We have the same
> issue with one of our brands. We require all word matching so "The North
> Face" would never return any results because "the" is filtered out but mm
> still requires all 3 words to match.
>
> So basically all fields MUST be put through the stopword filter for dismax
> to not have the issue.
>
> The long version of the answer can be found here:
> http://www.nabble.com/Dismax-Minimum-Match-Stopwords-Bug-td20960507.html
>
>
> On 2/11/09 8:49 AM, "Steven Hentschel" <steven.hentsc...@googlemail.com> wrote:
> > If a naive user enters a string that contains typical stopwords like
> > "and" and "the", these seem to be included in the word count for the must
> > match criteria of the dismax query.
> >
> > So, if for example the mm parameter is the default "2<-1 5<-2 6<90%"
> > and the user enters something like "Jason and the Argonauts",
> > this won't match a document with that title because the word count is
> > treated as 4 and only 2 words match. As the dismax query is
> > recommended for naive users, wouldn't it be more logical to apply the
> > mm criteria after applying the stopword filter on the query?
> >
> > Steven H.
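
For anyone following the arithmetic in the thread, here is a rough sketch of how I understand an mm spec to be evaluated. It is written with "<" in the conditional parts, which is the form the Solr documentation uses. The function name min_should_match is made up for illustration; this is not Solr's own code, and the rounding of percentages (truncation) is my assumption based on the documented behaviour.

```python
def min_should_match(clause_count: int, spec: str) -> int:
    """Return how many optional clauses must match for a given mm spec.

    Hypothetical helper, a sketch of the documented mm semantics only.
    """
    spec = spec.strip()
    if "<" in spec:
        # Conditional spec: "N<X" means "if there are more than N clauses, use X".
        result = clause_count  # at or below the first bound, all clauses are required
        for cond in spec.split():
            bound, sub_spec = cond.split("<", 1)
            if clause_count <= int(bound):
                return result
            result = min_should_match(clause_count, sub_spec)
        return result
    if spec.endswith("%"):
        percent = int(spec[:-1])
        calc = int(clause_count * percent / 100)  # int() truncates toward zero
        result = clause_count + calc if calc < 0 else calc
    else:
        n = int(spec)
        # A negative value means "all but this many".
        result = clause_count + n if n < 0 else n
    # Clamp to the range [0, clause_count].
    return max(0, min(clause_count, result))


spec = "2<-1 5<-2 6<90%"
# "Jason and the Argonauts" counted as 4 clauses: 3 must match.
print(min_should_match(4, spec))
# With "and" and "the" removed before counting: 2 clauses, both must match.
print(min_should_match(2, spec))
```

With four clauses counted, three must match, but a stopword-filtered field can only ever match "Jason" and "Argonauts", so the document is rejected. If the stopwords were dropped before counting, both remaining clauses would be required and both could match.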