Like this:

    <!-- Like a string class, but lower cased -->
    <fieldType name="string_lower" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
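
A variant sketch, assuming the optional TrimFilter Ahmet mentions is wanted as well. The type name string_lower_trim and the field name title_exact are made up for illustration:

    <!-- Hypothetical variant: lower cased, with leading and trailing whitespace trimmed -->
    <fieldType name="string_lower_trim" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
      </analyzer>
    </fieldType>

    <!-- Hypothetical field using the type -->
    <field name="title_exact" type="string_lower_trim" indexed="true" stored="true"/>

Since the keyword tokenizer keeps the whole value as a single token, a quoted query such as title_exact:"One" should match a value of 'One' but not 'One Way'.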

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Oct 11, 2016, at 7:43 AM, Ahmet Arslan <iori...@yahoo.com.INVALID> wrote:
> 
> Hi,
> 
> KeywordTokenizer and LowerCaseFilter should suffice. Optionally you can add 
> TrimFilter too.
> 
> Ahmet
> 
> 
> On Tuesday, October 11, 2016 5:24 PM, Zheng Lin Edwin Yeo 
> <edwinye...@gmail.com> wrote:
> Hi,
> 
> I would like to find out the best way to lowercase all the text
> while preserving all the tokens.
> 
> As I need to preserve every character of the text (including symbols and
> white space), I'm using String. However, I can't put the
> LowerCaseFilterFactory in String.
> 
> I found that we can use WhitespaceTokenizerFactory, followed by
> LowerCaseFilterFactory. Although WhitespaceTokenizerFactory can preserve
> the symbols, it will still split on Whitespace, which is what we do not
> want. This is because we may have words like 'One' and 'One Way'. If we use
> the WhitespaceTokenizerFactory and search for 'One', it will return records
> with 'One Way' too, which is what we do not want.
> 
> Is there another way we can achieve this?
> 
> I'm using Solr 6.2.1.
> 
> Regards,
> Edwin
