: Hmm, that makes sense to me - however I still think that even if we have mm
: set to "2" and we have "the 7449078" it should still match 7449078 in a
: productId field (it does not:
: http://zeta.zappos.com/search?department=&term=the+7449078). This seems like
: it works against the way one would reasonably expect it to - that stopwords
: shouldn't impact the counts for mm (so, "the 7449078" would count as 1 term
: for mm since "the" is a stopword).

this is back to the original "problem"... "stopwords" is an analyzer
concept; "minShouldMatch" is a BooleanQuery/DisMaxQueryParser concept ...
if all of the analyzers for all of your fields agree on the list of
stopwords, then q=the+7449078 will result in "the" getting thrown out and
you'll only have one clause. but if one of the fields has an analyzer that
says "the" is a valid term, then it's a valid term and it gets a clause in
the query. if it gets a clause in the query, then it factors into the
minShouldMatch calculation.

in that particular situation i believe the solution you want is to use the
same stopwords on your productId field that you use on your other fields,
so "the" doesn't get a query clause at all ... unless you want
q=the+7449078 to return product#7449078 if and only if it also has "the"
in its productId field.

: We have people asking for "the north" to return results from a brand called
: "the north face" - but it doesn't, and can't, because of this mm issue.

it may not work for you right now, but that doesn't mean it can't :) ...
i'm not sure why it wouldn't actually. consider a query like this...

   q=the north&qf=manu^2 prodName^1 desc^0.5&pf=...&mm=66%

let's say that "desc" uses stop words, but prodName and manu don't (because
we know we have manufacturer and product names like "the north face").
we're going to get one DisjunctionMaxQuery for "the" (on the manu and
prodName fields) and one DisjunctionMaxQuery for "north" (on manu,
prodName, and desc), and that's 2 clauses on a BooleanQuery whose
minShouldMatch is going to be 2 (because 66% of 2 rounded up is 2).

so now all products with "the" and "north" in their manufacturer name *OR*
product name will match -- even if it's "the" in manu and "north" in
prodName. products will even match if the only place they contain "north"
is in the description -- but only if they also contain "the" in manu or
prodName.

if you think "that's silly, why is 'the' required? i want it to be a
stopword!" then the solution is to make it a stopword *everywhere*
(including manu and prodName) ... since it's not a stopword, it's
considered significant, so it needs to match.

-Hoss
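
to make the clause counting above concrete, here is a minimal sketch in plain python. it is not Solr or Lucene code: build_clauses and matches are hypothetical helpers, each field's "analyzer" is reduced to a set of stopwords, and mm is passed as an absolute count (percentage specs and their rounding are not modeled). capping mm at the number of clauses is also an assumption of this sketch.

```python
def build_clauses(query, field_stopwords):
    """Return one (term, fields) clause per query term that survives
    analysis in at least one field. A term that is a stopword in every
    field produces no clause at all."""
    clauses = []
    for term in query.lower().split():
        fields = [f for f, stops in field_stopwords.items() if term not in stops]
        if fields:
            clauses.append((term, fields))
    return clauses


def matches(doc, clauses, mm):
    """A clause is satisfied if its term appears in any of its fields.
    The document matches when at least mm clauses are satisfied
    (mm is capped at the clause count; an assumption of this sketch)."""
    hits = 0
    for term, fields in clauses:
        if any(term in doc.get(f, "").lower().split() for f in fields):
            hits += 1
    return hits >= min(mm, len(clauses))


# "the" is a stopword only in desc, as in the example query above.
mixed = {"manu": set(), "prodName": set(), "desc": {"the"}}
# "the" is a stopword in every field.
uniform = {"manu": {"the"}, "prodName": {"the"}, "desc": {"the"}}

brand_doc = {"manu": "the north face", "prodName": "jacket", "desc": "warm"}
split_doc = {"manu": "the company", "prodName": "north jacket", "desc": ""}
desc_only = {"manu": "acme", "prodName": "parka", "desc": "made for the north"}

mixed_clauses = build_clauses("the north", mixed)      # two clauses
uniform_clauses = build_clauses("the north", uniform)  # one clause
```

with the mixed stopword lists, matches(desc_only, mixed_clauses, 2) is False because "the" still needs to match in manu or prodName; with the uniform lists, "the" never becomes a clause and the same document matches on "north" alone.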