Re: Making a String field case-insensitive

Zheng Lin Edwin Yeo Wed, 01 Nov 2017 19:09:13 -0700

Hi Emir,

Thanks for your advice. This works.
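
For anyone finding this thread later, here is a minimal sketch of the kind of field type the suggestion leads to. It reuses the name string_lower and the positionIncrementGap value from the configuration quoted below. Using a single <analyzer> for both index and query time is an assumption, not something Emir specified. The filter's class name is solr.LowerCaseFilterFactory, as in the quoted configuration.

```xml
<!-- Sketch only: KeywordTokenizerFactory emits the entire field value as a
     single token, and LowerCaseFilterFactory then lower-cases that token.
     An <analyzer> without a type attribute is used for both indexing and
     querying. -->
<fieldType name="string_lower" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Documents indexed under the old field type need to be reindexed before the change takes effect.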


Regards,
Edwin


On 1 November 2017 at 18:08, Emir Arnautović <emir.arnauto...@sematext.com>
wrote:

> Hi,
> You can use KeywordTokenizerFactory together with LowerCaseFilterFactory.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 1 Nov 2017, at 09:50, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> >
> > Hi,
> >
> > I would like to find out the best way to lower-case a String field in
> > Solr, to make it case-insensitive while preserving the structure of the
> > string (i.e. it should not be split into separate tokens at spaces, and
> > no characters or symbols should be removed).
> >
> > I found that solr.StrField cannot use a lower-case filter. But if I
> > change it to solr.TextField and use the Standard Tokenizer, the field
> > values get broken up.
> >
> > Eg:
> >
> > For this configuration,
> >
> > <fieldType name="string_lower" class="solr.TextField"
> >            positionIncrementGap="100" autoGeneratePhraseQueries="false">
> >   <analyzer type="index">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > The string "*SYStem 500 **" gets broken down into this
> >
> > *system | 500*
> >
> > "system" and "500" are separated into 2 tokens, which is not what we
> > want. Also, the * is being removed.
> >
> >
> > We would like to have something like this, which preserves the string
> > as it is and just lower-cases it.
> >
> > *system 500 **
>
>
