I am using Solr 1.4.1. I am trying to index a Spanish field using the
following tokenizer/filters:
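
The markup for this chain did not come through in the message. Going by the factory names and parameters that the field analysis output reports further down, it was probably equivalent to something like the sketch below. The field type name `text_es` and the `solr.TextField` class are assumptions, as is the use of a single `<analyzer>` for both index and query; only the tokenizer, the four filters, and their attributes come from the analysis output.

```xml
<!-- Sketch only: the name "text_es" and the TextField class are assumed,
     not taken from the original message. -->
<fieldType name="text_es" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Runs before lowercasing, so ignoreCase="true" is what makes "La" match "la" -->
    <filter class="solr.StopFilterFactory" words="stopwords_es.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```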

    
      
        
        
        
        
        

Using the field analysis page in the Solr admin, I can tell that
StopFilterFactory and SnowballPorterFilterFactory with Spanish are not
working as expected:

1. After StopFilterFactory, "la" should be gone, but it is not.
2. After SnowballPorterFilterFactory (language=Spanish), "cöcktäils" should
become "cöcktäil", but I still see the token "cöcktäils" coming out.

I configured a Spanish stopword list for the StopFilterFactory.

Field name: title_name
Field value: la Cöcktäils


Index Analyzer
=========================================================================
org.apache.solr.analysis.WhitespaceTokenizerFactory {}
term position   1       2
term text       la      Cöcktäils
term type       word    word
source start,end        0,2     3,12
payload                 

=============================================================================
org.apache.solr.analysis.StopFilterFactory {words=stopwords_es.txt,
ignoreCase=true}
term position   1       2
term text       la      Cöcktäils
term type       word    word
source start,end        0,2     3,12
payload 
==============================================================================  
org.apache.solr.analysis.LowerCaseFilterFactory {}
term position   1       2
term text       la      cöcktäils
term type       word    word
source start,end        0,2     3,12
payload
=============================================================================== 
                
org.apache.solr.analysis.SnowballPorterFilterFactory {language=Spanish}
term position   1       2
term text       la      cöcktäils
term type       word    word
source start,end        0,2     3,12
payload                 
===============================================================================
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
term position   1       2
term text       la      cöcktäils
term type       word    word
source start,end        0,2     3,12
payload                 
==============================================================================



I just copied the text from this URL to form my stopwords_es.txt:

http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/common/src/resources/org/apache/lucene/analysis/snowball/spanish_stop.txt
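
One thing that may be worth ruling out is the file format. That list is in the Snowball layout, where `|` starts a comment and the stop word sits at the start of the line, followed by padding and an English gloss. A minimal Python sketch of why that could matter, assuming a loader that treats every non-`#` line as one whole entry (an assumption about how the 1.4.1 factory reads the file, not something verified here). The sample lines are illustrative, written in the layout of that list rather than copied from it, and both function names are hypothetical:

```python
def load_plain(lines):
    """One entry per line; only lines starting with '#' are comments (assumed loader behaviour)."""
    words = set()
    for line in lines:
        if line.startswith("#"):
            continue
        word = line.strip()
        if word:
            words.add(word.lower())
    return words


def load_snowball(lines):
    """Snowball list layout: '|' starts a comment; what precedes it is whitespace-separated words."""
    words = set()
    for line in lines:
        text = line.split("|", 1)[0]
        words.update(w.lower() for w in text.split())
    return words


# Illustrative sample in the layout of the Snowball list (not the full file).
SAMPLE = [
    " | A Spanish stop word list. Comments begin with vertical bar.",
    "",
    "de             |  from, of",
    "la             |  the, her",
    "que            |  who, that",
]

plain = load_plain(SAMPLE)
snowball = load_snowball(SAMPLE)

print("la" in plain)     # False: the whole line, gloss included, became one entry
print("la" in snowball)  # True
```

If the file is being read the first way, stripping everything from `|` onward in stopwords_es.txt, so that each line holds only the bare word, would be a quick experiment.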



Look forward to your help...

--
View this message in context: 
http://lucene.472066.n3.nabble.com/stopFilterFactor-and-SnowballPorterFilterFactory-not-work-for-Spanish-tp2684322p2684322.html
Sent from the Solr - User mailing list archive at Nabble.com.