Making a String field case-insensitive

Zheng Lin Edwin Yeo Wed, 01 Nov 2017 01:50:38 -0700

Hi,

I would like to find out the best way to lowercase a String field at index
time in Solr, so that it becomes case-insensitive while the structure of the
string is preserved (i.e. it should not be split into separate tokens at
spaces, and no characters or symbols should be removed).


I found that solr.StrField cannot apply a lowercase filter. But if I change
it to solr.TextField and use the Standard Tokenizer, the field values get
broken up.

Eg:

For this configuration,

<fieldType name="string_lower" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

The string "*SYStem 500 **" gets broken down into this:

*system | 500*

"system" and "500" are separated into two tokens, which is not what we want.
Also, the * is removed.


We would like to get something like the following instead, which keeps the
value as a single string and only lowercases it:

*system 500 **
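
One configuration that might produce this (a sketch only, not verified
against this setup; the field type name "string_ci" is hypothetical) keeps
solr.TextField but swaps the Standard Tokenizer for
solr.KeywordTokenizerFactory, which emits the entire input as a single token.
The lowercase filter is then applied to that one token, so spaces and symbols
should be left in place:

<!-- "string_ci" is an illustrative name, not part of the original schema -->
<fieldType name="string_ci" class="solr.TextField"
           positionIncrementGap="100">
  <!-- A single analyzer applies to both index and query time -->
  <analyzer>
    <!-- Emits the whole field value as one token; no splitting on spaces -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercases that one token; other characters are untouched -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Existing documents would need to be reindexed after changing the field type,
and the Analysis screen in the Solr admin UI can be used to check the token
output for a sample value.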
