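It's not the ASCII folding filter but the stemmer (the SnowballPorterFilterFactory with language="English" in your field type) that's removing the trailing characters. You can easily spot this on the analysis page, which shows the tokens after each filter in the chain.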
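If you want to see the idea outside of Solr, here is a minimal sketch of the same isolation approach: run each stage on its own and check which one shortens the term. Everything in it is an assumption for illustration only. `foldToAscii` is a stand-in built on `java.text.Normalizer`, not the real ASCIIFoldingFilter, and `toyStem` is a deliberately crude suffix stripper, not the Snowball algorithm, so its exact output will differ from what Solr produces.

```java
import java.text.Normalizer;

// Illustration only: hypothetical stand-ins, not Lucene/Solr classes.
class FilterIsolationSketch {

    // Stand-in for ASCII folding: decompose accented letters (NFD) and drop
    // the combining marks. The real filter handles many more characters.
    static String foldToAscii(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Toy stand-in for a stemmer: strips a single trailing "s" from terms
    // longer than three characters. NOT the Snowball English algorithm.
    static String toyStem(String term) {
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of file encoding:
        // "después" and "imágenes", already lowercased as in the filter chain.
        String[] terms = {"despu\u00e9s", "im\u00e1genes"};
        for (String term : terms) {
            System.out.println(term
                + " | fold only: " + foldToAscii(term)
                + " | stem only: " + toyStem(term)
                + " | stem then fold: " + foldToAscii(toyStem(term)));
        }
    }
}
```

Folding on its own leaves the term length unchanged; only the stemming stage drops trailing characters, which is the same pattern the analysis page shows for the real filters.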
> Here is the field type definition for ‘text’ field which is what I am using
> for the indexed fields. Can you guys notice any obvious filter that could
> be the issue?
>
> ---------------------------------------------------------------------------
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <!-- in this example, we will only use synonyms at query time
>     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>     -->
>     <!-- Case insensitive stop word removal.
>       add enablePositionIncrements=true in both the index and query
>       analyzers to leave a 'gap' for more accurate phrase queries.
>     -->
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"
>             />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"
>             />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
>   </analyzer>
> </fieldType>
>
> From: Steven A Rowe [mailto:sar...@syr.edu]
> Sent: Tuesday, April 05, 2011 12:28 PM
> To: solr-user@lucene.apache.org
> Subject: RE: question on solr.ASCIIFoldingFilterFactory
>
> I added this test method locally to TestASCIIFoldingFilter.java in the
> Lucene/Solr 3.1.0 source tree, and it passed, so the filter is not the
> problem (and the Solr factory certainly isn't either - it's just a
> wrapper) - I second Ludovic's question - you must have other filters
> configured:
>
> public void testPluralNotTrimmed() throws Exception {
>   TokenStream stream = new WhitespaceTokenizer(TEST_VERSION_CURRENT, new StringReader("después Imágenes"));
>   ASCIIFoldingFilter filter = new ASCIIFoldingFilter(stream);
>   CharTermAttribute termAtt = filter.getAttribute(CharTermAttribute.class);
>
>   assertTermEquals("despues", filter, termAtt);
>   assertTermEquals("Imagenes", filter, termAtt);
> }
>
> Steve