Thanks for the explanation. Now I clearly understand why it doesn't work the way I expected :)

jfmel...@free.fr wrote:
If the request contains any wildcard, the filters are not called:
no ISOLatin1AccentFilterFactory and no SnowballPorterFilterFactory!

"économie" is indexed to "econom"

Solr does not find:
 - term starts with "éco"     (éco*)
 - term starts with "economi" (economi*)

If you index manger, mangé and mangue, the indexed terms will be mang and mangu.

requests  ->  results

manger   ->   manger, mangé
mangé    ->   manger, mangé
mang     ->   manger, mangé
mangu    ->   mangue
mang*    ->   manger, mangé, mangue
mang?    ->   mangue  (and not mangé)
mangé*   ->   nothing
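
To make the table above easier to reproduce, here is a small Python sketch of the behaviour as I understand it. It is not Solr code: `fold_accents` only approximates ISOLatin1AccentFilterFactory, `toy_stem` is a hypothetical stand-in for the French Snowball stemmer (it just drops one of a few suffixes), and the filter order is simplified. The only point it illustrates is that a plain query goes through the analysis chain while a wildcard query is compared raw against the indexed terms.

```python
import unicodedata
from fnmatch import fnmatchcase


def fold_accents(text):
    """Strip diacritics (rough approximation of the accent filter)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(term):
    """Hypothetical stemmer: drops one suffix. NOT the real Snowball rules."""
    for suffix in ("ie", "er", "e"):
        if term.endswith(suffix) and len(term) > len(suffix) + 2:
            return term[: -len(suffix)]
    return term


def analyze(word):
    """Analysis chain applied at index time and to non-wildcard queries."""
    return toy_stem(fold_accents(word.lower()))


def build_index(words):
    """Map each indexed term to the original words that produced it."""
    index = {}
    for word in words:
        index.setdefault(analyze(word), []).append(word)
    return index


def search(index, query):
    """Wildcard queries skip analyze(); plain queries go through it."""
    if "*" in query or "?" in query:
        hits = [
            word
            for term, words in index.items()
            if fnmatchcase(term, query)  # raw pattern vs. indexed term
            for word in words
        ]
    else:
        hits = index.get(analyze(query), [])
    return sorted(hits)


if __name__ == "__main__":
    index = build_index(["manger", "mangé", "mangue", "économie"])
    for query in ["manger", "mangé", "mang*", "mang?", "mangé*", "éco*", "econ*"]:
        print(query, "->", search(index, query) or "nothing")
```

In this sketch "mangé*" and "éco*" return nothing because the accented pattern is never folded, so it cannot match the accent-free terms "mang" and "econom".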

Jean-François


----- "Nicolas Leconte" <nicolas.ai...@aidel.com> wrote:

| Hi all,
|
| I have a field that contains accented characters, and what I want is to
| be able to search while ignoring accents.
| I have set up that field with:
| <analyzer>
| <tokenizer class="solr.StandardTokenizerFactory"/>
| <filter class="solr.StandardFilterFactory"/>
| <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
|         generateNumberParts="1" catenateWords="1" catenateNumbers="1"
|         catenateAll="0" splitOnCaseChange="1" />
| <filter class="solr.LowerCaseFilterFactory"/>
| <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
| <filter class="solr.SnowballPorterFilterFactory" language="French"/>
| <filter class="solr.LowerCaseFilterFactory"/>
| <filter class="solr.ISOLatin1AccentFilterFactory"/>
| <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
| </analyzer>
|
| In the index the word "économie" is turned into "econom": the accent is
| removed by the ISOLatin1AccentFilterFactory and the end of the word is
| removed by the SnowballPorterFilterFactory.
|
| When I query with title:econ* I get the correct answers, but if I query
| with title:écon* I get no answers.
| If I query with title:économ (the exact word of the index) it works, so
| there might be something wrong with the wildcard.
| As far as I understand, the analyzer should be used in exactly the same
| way at both index and query time.
|
| I have tested changing the order of the filters (putting the
| ISOLatin1AccentFilterFactory on top) without any result.
|
| Could anybody help me with that and point out what may be wrong with my
| schema?


