Hi Emir,

Thanks for your advice. This works.
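
For the archives, here is a minimal sketch of the kind of field type the suggestion leads to. It reuses the string_lower name from the question quoted below. The attribute values and the single shared analyzer are assumptions for illustration, not necessarily the exact configuration in use:

```xml
<!-- Sketch only. KeywordTokenizerFactory emits the whole field value as one token;
     LowerCaseFilterFactory then lowercases that token. -->
<fieldType name="string_lower" class="solr.TextField" positionIncrementGap="100">
  <!-- An analyzer without a type attribute is applied at both index and query time. -->
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a definition along these lines, a value such as "SYStem 500 *" should be indexed as the single token "system 500 *". Queries that contain spaces or characters like * may still need quoting or escaping, because the query parser can interpret them before the analyzer sees the text.
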
Regards,
Edwin

On 1 November 2017 at 18:08, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Hi,
> You can use KeywordTokenizer and LowerCaseFilterFactory.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> On 1 Nov 2017, at 09:50, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> >
> > Hi,
> >
> > I would like to find out the best way to lower-case a String index
> > in Solr, to make it case insensitive, while preserving the structure of the
> > string (i.e. it should not break into different tokens at spaces, and should
> > not remove any characters or symbols).
> >
> > I found that solr.StrField does not use a lower-case filter. But if I change
> > it to solr.TextField and use the Standard Tokenizer, the field values get
> > broken up.
> >
> > Eg:
> >
> > For this configuration,
> >
> > <fieldType name="string_lower" class="solr.TextField"
> >     positionIncrementGap="100" autoGeneratePhraseQueries="false">
> >   <analyzer type="index">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > The string "*SYStem 500 **" gets broken down into this:
> >
> > *system | 500*
> >
> > "system" and "500" are separated into 2 tokens, which is not what we want.
> > Also, the * is being removed.
> >
> > We would like to have something like this, which preserves the value as a
> > single string but just lowercases it:
> >
> > *system 500 **