Re: truncate string field type

Erick Erickson Sun, 08 Jul 2018 11:05:00 -0700

Why do you want to add such long strings to your index in the first
place? They are almost useless for search; you want a tokenized type
(text_general is a good place to start) if you want to search for
words within the string.
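
Something along these lines would be the usual starting point. This is
only a sketch: the field name "description" is made up, and it assumes
your schema still has the stock text_general type from the default
configset.

```xml
<!-- Hypothetical field name; text_general is assumed to be defined
     as in the default configset. -->
<field name="description" type="text_general" indexed="true" stored="true"/>
```

Because text_general splits the value into word tokens, each indexed
term is a single word rather than the whole string, so the per-term
size limit should not be a practical concern.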


"The number of bytes limit" is 32K or so, right? What do you want to
do with the data going in there?

There may be good reasons, but I've seen confusion around strings in the past.
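
If it turns out you really do need a truncated string, the processor
Alex mentions below is, I believe, TruncateFieldUpdateProcessorFactory.
A rough sketch for solrconfig.xml follows. The chain name and the
maxLength value are placeholders I picked, not anything from your setup.
Note that maxLength counts characters rather than bytes; 10000
characters encode to at most 30000 bytes of UTF-8, which stays under
the roughly 32K limit.

```xml
<!-- Sketch only: chain name and maxLength are illustrative placeholders. -->
<updateRequestProcessorChain name="truncate-long-strings" default="true">
  <!-- Truncate values destined for any field whose type is StrField. -->
  <processor class="solr.TruncateFieldUpdateProcessorFactory">
    <str name="typeClass">solr.StrField</str>
    <int name="maxLength">10000</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- RunUpdateProcessorFactory must come last so the document is indexed. -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The truncation happens before the document reaches the index, so the
field can stay a plain string type with no analyzer.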

Best,
Erick

On Sat, Jul 7, 2018 at 11:12 PM, Alexandre Rafalovitch
<arafa...@gmail.com> wrote:
> Did you look into UpdateRequestProcessors?
>
> There is a truncate one there.
>
> Regards,
>     Alex
>
> On Sun, Jul 8, 2018, 12:44 AM Zahra Aminolroaya, <z.aminolro...@gmail.com>
> wrote:
>
>> I want to truncate my string field type because of its limit on the number
>> of bytes. I wrote the following in my schema:
>>
>>
>> <fieldType name="string" class="solr.StrField" sortMissingLast="true">
>>   <analyzer type="index">
>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>     <filter class="solr.TruncateTokenFilterFactory" prefixLength="32700"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>     <filter class="solr.TruncateTokenFilterFactory" prefixLength="32700"/>
>>   </analyzer>
>> </fieldType>
>>
>> However, I found that StrField (string) does not support specifying an
>> analyzer. Besides, prefixLength in TruncateTokenFilterFactory cannot be
>> more than 1000.
>>
>> I want the same behavior as the string type. Do you think it is reasonable
>> to use the "text_general" field type with solr.KeywordTokenizerFactory as
>> the tokenizer to get that behavior? Would I lose any features?
>>
>> If I use text_general, truncation is not needed.
