Use the Solr Admin UI analysis page to see how the text is analyzed at both index and query time.
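
The analysis page is backed by the field analysis request handler, so the same check can be scripted. A minimal sketch, assuming a Solr 4.x solrconfig.xml in which the handler is declared explicitly (check your own config; the handler name used here is an assumption):

```xml
<!-- Sketch only: exposes field analysis over HTTP at /analysis/field. -->
<!-- Verify against your own solrconfig.xml before adding; it may already be declared. -->
<requestHandler name="/analysis/field"
                class="solr.FieldAnalysisRequestHandler"
                startup="lazy"/>
```

With that in place, a request along the lines of /solr/<core>/analysis/field?analysis.fieldname=deal_description&analysis.fieldvalue=This+is+the+my+description&analysis.query=is+th should list the tokens produced at each stage of the index and query analyzers. The core name is a placeholder.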
My e-book does have more narrative and examples for stop word processing:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html

-- Jack Krupansky

On Tue, Mar 31, 2015 at 5:41 PM, Alex Sylka <sylkaa...@gmail.com> wrote:

> My stopwords don't work as expected.
>
> Here is part of my schema:
>
> <fieldType name="text_general" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <fieldType class="solr.TextField" name="text_auto">
>   <analyzer type="index">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="false"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.ShingleFilterFactory" maxShingleSize="3"
>             outputUnigrams="true" outputUnigramsIfNoShingles="false"/>
>   </analyzer>
>   <analyzer type="query">
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="false"/>
>   </analyzer>
> </fieldType>
>
> <field name="deal_title_terms" type="text_auto" indexed="true"
>        stored="false" required="false" multiValued="true"/>
> <field name="deal_description" type="text_general" indexed="true"
>        stored="true" required="false" multiValued="false"/>
>
> In stopwords.txt I have the following words: the, is, a
>
> I also have the following data in my fields:
>
> deal_description - This is the my description
> deal_title_terms - This is the deal title a terms (will be split into
> terms)
>
> When I search deal_description:
> Example 1: "deal_description: *his is the m*" - I expect the document with
> deal_description "This is the my description" to be returned.
> Example 2: "deal_description: *is th*" - I expect nothing to be found,
> because "is" and "the" are stopwords.
>
> When I search deal_title_terms:
> Example 1: "deal_title_terms: *is*" - I expect nothing to be found,
> because "is" is a stopword.
> Example 2: "deal_title_terms: *is the deal*" - I expect "is" and "the"
> to be ignored and the term "deal" to be found.
> Example 3: "deal_title_terms: *title a terms*" - I expect "a" to be
> ignored and the term "title terms" to be found.
>
> Question 1: Why don't stopwords work for the "deal_description" field?
> Question 2: Why aren't stopwords removed from my query on the
> "deal_title_terms" field? (When I try to find *title a terms*, it does
> not find the "title terms" term.)
> Question 3: Is there any way to show stopwords in search results but
> prevent them from being searched? Example:
>
> data: This is cool search engine
> search query : "*is coo*" -> return "This is cool search engine"
> search query : "*is*" -> return nothing
> search query : "*This coll*" -> return "This is cool search engine"
>
> Question 4: *Where can I find a detailed description (maybe with examples)
> of how stopwords work in Solr? Because it looks like magic.*
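
On Questions 1 and 2 in the quoted message, one likely cause worth confirming on the analysis page: KeywordTokenizerFactory emits the entire field value as a single token, so StopFilterFactory never sees "is" or "the" as separate tokens to remove. Separately, wildcard query terms are, as far as I know, only passed through multi-term-aware components such as lowercasing, so stop filtering is not applied to them. A minimal sketch of a tokenized alternative follows. The type name text_general_tokenized is hypothetical, and whether this tokenization suits the intended matching is an assumption to verify.

```xml
<!-- Sketch only: "text_general_tokenized" is a hypothetical name, not part of the original schema. -->
<fieldType name="text_general_tokenized" class="solr.TextField">
  <analyzer>
    <!-- StandardTokenizerFactory splits "This is the my description" into separate
         word tokens, so the stop filter can match "is" and "the" individually. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Drops tokens listed in stopwords.txt (one word per line), ignoring case. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A single analyzer element without a type attribute applies to both index and query time, which keeps the two sides consistent. Changing a field's type requires reindexing before the effect shows up in search results.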