Hi,
Scenario:
Users who perform a search often forget punctuation such as an apostrophe. For
example, a user who wants to search for the value INT'L just keys in INTL
(with no punctuation). In this scenario, I want to return both values, INTL
and INT'L, which are currently indexed on the Solr instance. At present, a
search for INTL won't return the row with the value INT'L.
Schema Configuration entry for the field type:
<fieldType name="customStr" class="solr.TextField"
           positionIncrementGap="100" sortMissingLast="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s*[,.]\s*" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s+" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[';]" replacement="" replace="all"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s*[,.]\s*" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s+" replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[';]" replacement="" replace="all"/>
  </analyzer>
</fieldType>
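To make the intent of the regexes concrete, here is a rough approximation of the filter chain outside Solr. It is only a sketch: it assumes Python's `re` module treats these three patterns the same way Java's regex engine does, it stands in for `LowerCaseFilterFactory` and `TrimFilterFactory` with `str.lower()` and `str.strip()`, and the function name `normalize` is made up for illustration. It does not model the query parser or anything else Solr does before analysis.

```python
import re

# (pattern, replacement) pairs copied from the PatternReplaceFilterFactory
# entries of the "customStr" field type, in schema order.
REPLACEMENTS = [
    (r"\s*[,.]\s*", " "),  # commas/periods (plus surrounding spaces) -> one space
    (r"\s+", " "),         # collapse runs of whitespace
    (r"[';]", ""),         # drop apostrophes and semicolons
]


def normalize(value, trim=True):
    """Approximate the analyzer chain on a whole field value.

    KeywordTokenizerFactory emits the entire input as a single token, so
    the value is processed as one string. `trim=True` mimics the index
    analyzer (which has a TrimFilterFactory); `trim=False` mimics the
    query analyzer (which does not). Hypothetical helper, not a Solr API.
    """
    token = value.lower()
    if trim:
        token = token.strip()
    for pattern, replacement in REPLACEMENTS:
        token = re.sub(pattern, replacement, token)
    return token


if __name__ == "__main__":
    for raw in ("INTL", "INT'L"):
        print(repr(raw), "index:", repr(normalize(raw)),
              "query:", repr(normalize(raw, trim=False)))
```

In this approximation INTL and INT'L reduce to the same token on both the index and the query side. Because the keyword tokenizer keeps the whole field value as one token, this only models matching on the complete value, not on a single word inside a longer value.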
Please suggest what mechanism I should use to fetch both values, INTL and
INT'L, when the search is performed for INTL. Also, do the regular
expressions in the analyzers look correct? Which other filters or tokenizers
could be used to overcome this issue?
Thanks!