On Thu, 2007-09-20 at 13:33 +0200, Thierry Collogne wrote:
> We are using this schema definition
>
Thierry, try moving solr.ISOLatin1AccentFilterFactory up the filter chain, like:

...
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.ISOLatin1AccentFilterFactory"/>
...

for both indexing and query. This way you make sure that all accents are gone before any further filtering happens. You may need to reindex all documents so that the old index is no longer used.

HTH

salu2

> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <!-- in this example, we will only use synonyms at query time
>     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>     -->
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> I will take a look at the analyzer tool.
>
> Thank you both for the quick response.
>
> On 20/09/2007, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote:
> >
> > On 9/20/07, Thierry Collogne <[EMAIL PROTECTED]> wrote:
> >
> > > ..when we search for "matthé" or for "matthe", we get two totally
> > > different results....
> >
> > The analyzer admin tool should help you find out what's happening, see
> >
> > http://wiki.apache.org/solr/FAQ#head-b25df8c8393bbcca28f1f344c432975002e29ca9
> >
> > -Bertrand
> >
-- 
Thorsten Scherler thorsten.at.apache.org
Open Source Java consulting, training and solutions
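
P.S. For reference, here is roughly what the quoted index analyzer might look like with the accent filter moved directly after the tokenizer. This is only a sketch: it assumes nothing else in the chain changes, it leaves out the commented-out synonym filter, and the query analyzer would need the same move. One likely reason the order matters is that, in the quoted chain, the stemmer runs before the accents are stripped, so "matthé" and "matthe" may be stemmed differently and end up as different terms.

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- moved up: strip accents before any other filter sees the tokens -->
  <filter class="solr.ISOLatin1AccentFilterFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
```

After changing the schema and reindexing, the analyzer admin tool mentioned in the thread can be used to check that both spellings produce the same tokens at index and query time.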