> When I request with title:econ* I can have the correct answers, but if I
> request with title:écon* I have no answers.
> If I request with title:économ (the exact word of the index) it works, so
> there might be something wrong with the wildcard.
> As far as I can understand the analyser should be used exactly the same at
> both index and query time.

Wildcard queries are not analyzed, hence the "inconsistent" behaviour: the prefix écon is compared as-is against the indexed terms (e.g. econom), so it matches nothing, whereas econ* does.
The easiest way out is to define one more field, "title_orginal", as an untokenized field. While querying, you can use both fields at the same time, e.g. q=(title:écon* title_orginal:écon*). In either case, you would get the desired matches.
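
A minimal sketch of what the extra field could look like in schema.xml. It assumes the stock "string" field type (solr.StrField) is defined and that title_orginal is populated from title with a copyField; the attribute values are illustrative assumptions, not taken from the original schema:

```xml
<!-- Assumption: a "string" fieldType (solr.StrField, neither tokenized nor analyzed) exists in the schema -->
<field name="title_orginal" type="string" indexed="true" stored="false"/>

<!-- Copy the raw title into the untokenized field at index time -->
<copyField source="title" dest="title_orginal"/>
```

Note that a string field is indexed as one unanalyzed term, so title_orginal:écon* only matches titles that begin with "écon" in exactly that case and accentuation.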

Cheers
Avlesh

On Fri, Oct 30, 2009 at 9:19 PM, Nicolas Leconte <nicolas.ai...@aidel.com> wrote:

> Hi all,
>
> I have a field that contains accented characters, and what I want is to be
> able to search while ignoring accents.
> I have set up that field with:
> <analyzer>
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.StandardFilterFactory"/>
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="0" splitOnCaseChange="1" />
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" />
> <filter class="solr.SnowballPorterFilterFactory" language="French"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.ISOLatin1AccentFilterFactory"/>
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> </analyzer>
>
> In the index the word "économie" is translated to "econom": the accent is
> removed thanks to the ISOLatin1AccentFilterFactory and the end of the word
> is removed thanks to the SnowballPorterFilterFactory.
>
> When I request with title:econ* I can have the correct answers, but if I
> request with title:écon* I have no answers.
> If I request with title:économ (the exact word of the index) it works, so
> there might be something wrong with the wildcard.
> As far as I can understand the analyser should be used exactly the same at
> both index and query time.
>
> I have tested changing the order of the filters (putting the
> ISOLatin1AccentFilterFactory on top) without any result.
>
> Could anybody help me with that and point out what may be wrong with my
> schema?