Hi,

Up to now, the best solution I have found for implementing a multi-word suggester was to use the "ShingleFilterFactory" filter at index time together with the TermsComponent. The index-time analyzer was:

      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="4" outputUnigrams="true"/>
      </analyzer>


With the "ASCIIFoldingFilter" filter, it works fine as long as the user does not type accents in the query terms, and all suggestions come back without accents. Without the "ASCIIFoldingFilter" filter, it works fine as long as the user does not forget any accent in the query terms, and all suggestions come back with accents.

Note: I use the StopFilter to avoid suggestions that include stop words, particularly suggestions that start or end with a stop word.


What I need is a suggester where the user may type the query terms with or without accents, and the suggestions are always returned with their accents.

For example, if the user types "éco" or "eco", the suggester should return:

école
école primaire
école publique
école privée
école primaire privée
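
To make the expected behaviour concrete, here is a minimal Python sketch outside Solr. It is only an illustration of what I am after, not a Solr solution: the function names (fold, build_index, suggest) are made up for this example, and fold() only roughly approximates what ASCIIFoldingFilter does. Matching is done on the folded form, while the original accented form is what gets returned.

```python
import unicodedata
from collections import defaultdict


def fold(text: str) -> str:
    """Lowercase and strip combining accents (rough stand-in for ASCIIFoldingFilter)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(phrases):
    """Map each folded phrase to the set of original (accented) phrases."""
    index = defaultdict(set)
    for phrase in phrases:
        index[fold(phrase)].add(phrase)
    return index


def suggest(index, prefix):
    """Return original phrases whose folded form starts with the folded prefix."""
    key = fold(prefix)
    hits = []
    for folded in sorted(index):
        if folded.startswith(key):
            hits.extend(sorted(index[folded]))
    return hits


# Hypothetical data: the phrases from the example above.
PHRASES = [
    "école",
    "école primaire",
    "école publique",
    "école privée",
    "école primaire privée",
]
INDEX = build_index(PHRASES)

if __name__ == "__main__":
    # Both spellings of the prefix should yield the same accented suggestions.
    print(suggest(INDEX, "éco"))
    print(suggest(INDEX, "eco"))
```

The point is that the lookup key and the returned value are two different strings, which is what I do not see how to get from a single indexed term.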


I think it is impossible to achieve this with the TermsComponent and that I should use the SpellCheckComponent instead. However, I don't see how to make the suggester accent-insensitive while still returning the suggestions with accents.

Has anybody already achieved this?

Thank you.

Dominique
