Here is what debug says each of these queries parses to:

1. q=life&defType=edismax&qf=Title ... returns 277,635 results
2. q=the life&defType=edismax&qf=Title ... returns 277,635 results
3. q=life&defType=edismax&qf=Title Contributor ... returns 277,635 results
4. q=the life&defType=edismax&qf=Title Contributor ... returns 0 results

1. +DisjunctionMaxQuery((Title:life))
2. +((DisjunctionMaxQuery((Title:life)))~1)
3. +DisjunctionMaxQuery((CTBR_SEARCH:life | Title:life))
4. +((DisjunctionMaxQuery((Contributor:the)) DisjunctionMaxQuery((Contributor:life | Title:life)))~2)

I see what's going on here. Because "the" is a stop word for Title, it gets
removed from the Title side of the first clause, which leaves only
Contributor:the. Both clauses have to match (the ~2), so "Contributor" is
required to contain "the". dismax does the same thing too. I guess I should
have run debug before asking the mailing list!

It looks like the only workarounds I have are to either filter out the
stopwords in the client when this happens, or enable stop words for all the
fields that are used in "qf" alongside stopword-enabled fields.
Unless... someone has a better idea?

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Wednesday, January 12, 2011 4:44 PM
To: solr-user@lucene.apache.org
Cc: Jayendra Patil
Subject: Re: StopFilterFactory and "qf" containing some fields that use it and some that do not

> Have used edismax and Stopword filters as well. But usually use the fq
> parameter e.g. fq=title:the life and never had any issues.

That is because filter queries are not relevant for the mm parameter which is
being used for the main query.

>
> Can you turn on the debugQuery and check whats the Query formed for all the
> combinations you mentioned.
>
> Regards,
> Jayendra
>
> On Wed, Jan 12, 2011 at 5:19 PM, Dyer, James <james.d...@ingrambook.com> wrote:
> > I'm running into a problem with StopFilterFactory in conjunction with
> > (e)dismax queries that have a mix of fields, only some of which use
> > StopFilterFactory. It seems that if even 1 field on the "qf" parameter
> > does not use StopFilterFactory, then stop words are not removed when
> > searching any fields.
> > Here's an example of what I mean:
> >
> > - I have 2 fields indexed:
> >   - Title is "textStemmed", which includes StopFilterFactory (see below).
> >   - Contributor is "textSimple", which does not include StopFilterFactory
> >     (see below).
> > - "The" is a stop word in stopwords.txt
> > - q=life&defType=edismax&qf=Title ... returns 277,635 results
> > - q=the life&defType=edismax&qf=Title ... returns 277,635 results
> > - q=life&defType=edismax&qf=Title Contributor ... returns 277,635 results
> > - q=the life&defType=edismax&qf=Title Contributor ... returns 0 results
> >
> > It seems as if the stop words are not being stripped from the query
> > because "qf" contains a field that doesn't use StopFilterFactory. I did
> > testing with combining Stemmed fields with not Stemmed fields in "qf"
> > and it seems as if stemming gets applied regardless. But stop words do
> > not.
> >
> > Does anyone have ideas on what is going on? Is this a feature or
> > possibly a bug? Any known workarounds? Any advice is appreciated.
> >
> > James Dyer
> > E-Commerce Systems
> > Ingram Content Group
> > (615) 213-4311
> > ________________________________
> > <fieldType name="textSimple" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > <fieldType name="textStemmed" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt" enablePositionIncrements="true" />
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >             generateNumberParts="0" catenateWords="0" catenateNumbers="0"
> >             catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0"
> >             stemEnglishPossessive="1" />
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.PorterStemFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >             ignoreCase="true" expand="true"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt" enablePositionIncrements="true" />
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >             generateNumberParts="0" catenateWords="0" catenateNumbers="0"
> >             catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0"
> >             stemEnglishPossessive="1" />
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.PorterStemFilterFactory"/>
> >   </analyzer>
> > </fieldType>
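
For anyone who finds this thread later, the parsed queries at the top of this
message can be modeled with a rough Python sketch. This is not Solr code. It
is a toy model that assumes a one-word stop list and ignores stemming,
synonyms and word delimiting; the names STOPWORDS, ANALYZERS, build_clauses
and strip_stopwords are all made up for illustration. It shows why "the"
survives as a Contributor-only clause, and what the client-side workaround
could look like.

```python
# Toy model of per-field query analysis in (e)dismax. Assumptions: the stop
# list holds only "the", and stemming/synonyms are left out entirely.
STOPWORDS = {"the"}  # hypothetical subset of stopwords.txt


def analyze_stemmed(term):
    """Stand-in for textStemmed at query time: lowercase, drop stop words."""
    token = term.lower()
    return None if token in STOPWORDS else token


def analyze_simple(term):
    """Stand-in for textSimple at query time: lowercase only."""
    return term.lower()


# Field name -> analyzer, mirroring the two field types in the quoted schema.
ANALYZERS = {"Title": analyze_stemmed, "Contributor": analyze_simple}


def build_clauses(q, qf):
    """Return one clause per query term that survives in at least one field.

    Each clause is a list of (field, token) alternatives, like the contents
    of one DisjunctionMaxQuery in the debug output.
    """
    clauses = []
    for term in q.split():
        alternatives = []
        for field in qf:
            token = ANALYZERS[field](term)
            if token is not None:
                alternatives.append((field, token))
        if alternatives:  # a term dropped by every field yields no clause
            clauses.append(alternatives)
    return clauses


def strip_stopwords(q):
    """Client-side workaround: drop stop words before sending q to Solr.

    Falls back to the original string if every term is a stop word, so the
    query is never sent empty.
    """
    kept = [t for t in q.split() if t.lower() not in STOPWORDS]
    return " ".join(kept) if kept else q
```

With qf limited to Title, build_clauses("the life", ["Title"]) gives a single
clause for "life", which matches case 2. With both fields in qf, the same
query gives two clauses, and the first has Contributor:the as its only
alternative, which matches case 4. Since both clauses are required there, no
document matches unless Contributor contains "the". Running the user input
through strip_stopwords first turns case 4 back into case 3.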