Hi,
Up to now, the best solution I have found for implementing a multi-word
suggester was to use the "ShingleFilterFactory" filter at index time together
with the TermsComponent. At index time, the analyzer was:
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.ElisionFilterFactory" ignoreCase="true"
          articles="lang/contractions_fr.txt"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
          words="stopwords.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.ShingleFilterFactory" maxShingleSize="4"
          outputUnigrams="true"/>
</analyzer>
With the "ASCIIFoldingFilter" filter, it works fine as long as the user does
not use accents in the query terms, but all suggestions are returned without
accents.
Without the "ASCIIFoldingFilter" filter, it works fine as long as the user
does not forget the accents in the query terms, and all suggestions are
returned with accents.
Note: I use the StopFilter to avoid suggestions that include stop words,
particularly suggestions that start or end with a stop word.
What I need is a suggester where the user can type the query terms with or
without accents, and the suggestions are always returned with accents.
For example, if the user types "éco" or "eco", the suggester should return:
école
école primaire
école publique
école privée
école primaire privée
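To make the requirement concrete, here is a minimal Python sketch of the behaviour I am after. It is not Solr code and not a proposed implementation: the function names (fold, build_index, suggest) are hypothetical, and the accent stripping via Unicode decomposition is only a rough stand-in for what ASCIIFoldingFilter does. The idea is to match on a folded key while keeping the original accented phrase as the value that is returned.

```python
import unicodedata
from collections import defaultdict


def fold(text):
    """Lowercase and strip combining accents (rough stand-in for ASCII folding)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(phrases):
    """Map each folded phrase to the original (accented) phrases it came from."""
    index = defaultdict(list)
    for phrase in phrases:
        index[fold(phrase)].append(phrase)
    return index


def suggest(index, prefix, limit=10):
    """Return original phrases whose folded form starts with the folded prefix."""
    key = fold(prefix)
    hits = []
    for folded in sorted(index):
        if folded.startswith(key):
            hits.extend(index[folded])
    return hits[:limit]


phrases = [
    "école",
    "école primaire",
    "école publique",
    "école privée",
    "école primaire privée",
]
index = build_index(phrases)
with_accent = suggest(index, "éco")
without_accent = suggest(index, "eco")
```

In this sketch both queries are folded to the same key, so they return the same accented suggestions. What I am missing is the equivalent in Solr: matching on the folded form while returning the unfolded one.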
I think it is impossible to achieve this with the TermsComponent, and that I
should use the SpellCheckComponent instead. However, I don't see how to
make the suggester accent-insensitive while still returning the suggestions
with accents.
Has anybody already achieved this?
Thank you.
Dominique