Hi,
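You can use solr.TextField with KeywordTokenizerFactory and LowerCaseFilterFactory. The keyword tokenizer emits the whole input as a single token, and the filter then lowercases it.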
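
Something along these lines should do it. This is an untested sketch: the name string_lower and positionIncrementGap are copied from your example, and a single analyzer is used here so that it applies at both index and query time.

    <fieldType name="string_lower" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <!-- keeps the whole value as one token: no splitting on spaces, no symbols dropped -->
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <!-- lowercases that single token -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

At query time you will probably need to quote the value (or escape the space and the *), otherwise the query parser may split it or treat the * as a wildcard before the analyzer sees it. You will also need to reindex after changing the field type.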

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 1 Nov 2017, at 09:50, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> 
> Hi,
> 
> I would like to find out the best way to lower-case a string field
> in Solr, to make it case-insensitive while preserving the structure of the
> string (i.e. it should not be split into separate tokens at spaces, and no
> characters or symbols should be removed).
> 
> I found that solr.StrField cannot use a lower-case filter. But if I change
> it to solr.TextField and use the Standard Tokenizer, the field values get broken up.
> 
> Eg:
> 
> For this configuration,
> 
> <fieldType name="string_lower" class="solr.TextField"
> positionIncrementGap="100" autoGeneratePhraseQueries="false">
> <analyzer type="index">
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> </analyzer>
>   </fieldType>
> 
> The string "*SYStem 500 **" gets broken down into this
> 
> *system | 500*
> 
> The system and 500 are separated into 2 tokens, which is not what we want.
> Also, the * is being removed.
> 
> 
> We would like to have something like this, which preserves the value as
> a single string and only lowercases it.
> 
> *system 500 **
