I am trying to filter Russian stopwords, but so far without success.
I am using the following schema entry:

.....
<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" generateNumberParts="0"
            catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
......

Interestingly, Russian synonyms work fine. Both English and Russian
synonyms are searched correctly.

Also, if I add an English word to stopwords.txt, it gets filtered
correctly. It's only the Russian words that are not being filtered as stopwords.
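
One way to narrow this down would be to test the stop filter in isolation. The following is only a sketch: the field type name "text_stoptest" is hypothetical and not part of the schema above, while the tokenizer, the filter class, and stopwords.txt are taken from it. If Russian words are still not removed with this reduced chain, the other filters can probably be ruled out as the cause.

```xml
<!-- Hypothetical test-only field type: same tokenizer and stop filter as
     the "text" type above, with all other filters removed. -->
<fieldType name="text_stoptest" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Same stopword file and ignoreCase setting as in the original chain -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
```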

Can someone explain this behaviour?

Thanks,
Tushar.
