My stopwords don't work as expected.
Here is part of my schema:
<fieldType name="text_general" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
<fieldType class="solr.TextField" name="text_auto">
    <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true" outputUnigramsIfNoShingles="false"/>
    </analyzer>
    <analyzer type="query">
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
    </analyzer>
</fieldType>
<field name="deal_title_terms" type="text_auto" indexed="true" stored="false" required="false" multiValued="true"/>
<field name="deal_description" type="text_general" indexed="true" stored="true" required="false" multiValued="false"/>
In stopwords.txt I have the following words: the, is, a.
I also have the following data in my fields:

deal_description - This is the my description
deal_title_terms - This is the deal title a terms (will be split into
terms)

When I search deal_description:
Example 1: "deal_description: *his is the m*" - I expect the document with
deal_description "This is the my description" to be returned.
Example 2: "deal_description: *is th*" - I expect nothing to be found,
because "is" and "the" are stopwords.

When I search deal_title_terms:
Example 1: "deal_title_terms: *is*" - I expect nothing to be found, because
"is" is a stopword.
Example 2: "deal_title_terms: *is the deal*" - I expect "is" and "the" to be
ignored and the term "deal" to be found.
Example 3: "deal_title_terms: *title a terms*" - I expect "a" to be ignored
and the term "title terms" to be found.

Question 1: Why don't stopwords work for the "deal_description" field?
Question 2: Why are stopwords not removed from my query on the
"deal_title_terms" field? (When I search for *title a terms*, the term
"title terms" is not found.)
Question 3: Is there any way to show stopwords in the search results but
prevent them from being searched? Example:

data: This is cool search engine
search query : "*is coo*" -> return "This is cool search engine"
search query : "*is*" -> return nothing
search query : "*This cool*" -> return "This is cool search engine"
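
To make the behaviour I am hoping for more concrete, here is a rough Python sketch. It is not Solr code, only an illustration of the matching I expect. The names STOPWORDS, strip_stopwords and search are made up for this example, and it assumes case-insensitive matching, whitespace tokenization, and that the asterisks around the query are wildcards.

```python
STOPWORDS = {"the", "is", "a"}  # same words as in my stopwords.txt


def strip_stopwords(text):
    """Lowercase the text, split it on whitespace, and drop stopwords."""
    return [token for token in text.lower().split() if token not in STOPWORDS]


def search(query, documents):
    """Return the original documents whose stopword-free text contains
    the stopword-free query as a substring."""
    # Remove the leading/trailing wildcard asterisks used in my queries.
    terms = strip_stopwords(query.strip("*"))
    if not terms:
        # The query consisted only of stopwords, so nothing should match.
        return []
    needle = " ".join(terms)
    return [
        doc for doc in documents
        if needle in " ".join(strip_stopwords(doc))
    ]


docs = ["This is cool search engine"]
print(search("*is coo*", docs))     # matches on "coo" after "is" is dropped
print(search("*is*", docs))         # only a stopword, so no results
print(search("*This cool*", docs))  # matches even though "is" sits in between
```

The point of the sketch is that the stored text is returned unchanged, stopwords included, while stopwords are ignored on both the query side and the indexed side during matching.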

Question 4: *Where can I find a detailed description (maybe with examples)
of how stopwords work in Solr? It looks like magic.*
