Thank you! That makes sense.

--Casey

>>> Mike Klaas <[EMAIL PROTECTED]> 6/7/2007 2:35 PM >>>
On 7-Jun-07, at 1:41 PM, Casey Durfee wrote:
> It appears that if your search terms include stopwords and you use
> the DisMax request handler, you get no results whereas the same
> search with the standard request handler does give you results. Is
> this a bug or by design?

There is a subtlety with stopwords and dismax. Imagine a search
"what's in python", using a typical analyzer with stopwords for fields
such as title, inlinks, rawText, but a more restrictive analyzer for
fields such as url, which have no stopwords.

The above search, using the weight function

    title^1.2 inlinks^1.4 rawText^1.0

produces the following parsed query string:

    +( ( (rawText:what | inlinks:what^1.4 | title:what^1.2)~0.01
         (rawText:python | inlinks:python^1.4 | title:python^1.2)~0.01
       )~2 )
    (rawText:"what python"~5 | inlinks:"what python"~5^1.4 | title:"what python"~5^1.2)~0.01

while the same query with a weight function of

    title^1.2 inlinks^1.4 rawText^1.0 url^1.0

produces this query string:

    +( ( (rawText:what | url:what | inlinks:what^1.4 | title:what^1.2)~0.01
         (url:in)~0.01
         (rawText:python | url:python | inlinks:python^1.4 | title:python^1.2)~0.01
       )~3 )
    (rawText:"what python"~5 | url:"what in python"~5 | inlinks:"what python"~5^1.4 | title:"what python"~5^1.2)~0.01

Note that the latter includes the clause (url:in)~0.01 on its own. This
interacts poorly with a high mm (minimum number of clauses that must
match) setting in dismax, as it effectively requires 'in' to be in the
url field, which was probably not the intent of the query.

-Mike
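The interaction described above can be sketched with a toy model in Python. This is not Solr's code. It assumes a one-word stopword list ("in"), treats each field's analyzer as plain stopword removal, and uses hypothetical helper names (build_clauses, matches). It only illustrates why adding a field without stopwords creates a clause that just that field can satisfy, so an mm equal to the number of clauses rejects a document that matched before.

```python
# Toy model of per-field analysis in a dismax-style query.
# Assumption: an analyzer is modelled as nothing more than stopword removal.
STOPWORDS = {"in"}  # illustrative stopword list, not Solr's default

# Field name -> stopwords removed by that field's analyzer.
FIELDS_WITHOUT_URL = {"rawText": STOPWORDS, "inlinks": STOPWORDS, "title": STOPWORDS}
FIELDS_WITH_URL = {"rawText": STOPWORDS, "url": set(), "inlinks": STOPWORDS, "title": STOPWORDS}


def build_clauses(terms, fields):
    """Build one clause per query term, listing the fields whose analyzer
    keeps that term. A term removed by every field yields no clause."""
    clauses = []
    for term in terms:
        surviving = [name for name, stop in fields.items() if term not in stop]
        if surviving:
            clauses.append((term, surviving))
    return clauses


def matches(doc, clauses, mm):
    """doc maps field name -> set of indexed tokens. A clause is satisfied
    when any of its fields contains the term; the document matches when at
    least mm clauses are satisfied."""
    satisfied = sum(
        any(term in doc.get(field, set()) for field in clause_fields)
        for term, clause_fields in clauses
    )
    return satisfied >= mm


terms = ["what", "in", "python"]
clauses_without_url = build_clauses(terms, FIELDS_WITHOUT_URL)
clauses_with_url = build_clauses(terms, FIELDS_WITH_URL)

# A hypothetical document: stopwords were removed at index time for the
# text fields, and the url does not contain the token "in".
doc = {
    "title": {"what", "python"},
    "rawText": {"what", "python", "tutorial"},
    "url": {"example", "com", "python"},
}

# Require every clause to match, as a high mm setting would.
print(matches(doc, clauses_without_url, mm=len(clauses_without_url)))
print(matches(doc, clauses_with_url, mm=len(clauses_with_url)))
```

In this model the "in" clause exists only when url is among the queried fields, and only url can satisfy it, which mirrors the lone (url:in)~0.01 clause in the second parsed query.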