On 5/15/2015 8:49 AM, Charles Sanders wrote:
> I'm seeing a problem with the LengthFilter. It appears to work fine until I
> increase the max value above 254. At the point it stops removing the very
> large token from the stream. As a result I get the error:
> java.lang.IllegalArgumentException: Document contains at least one immense
> term...... UTF8 encoding is longer than the max length 32766
>
> I'm certain I'm doing this wrong. Can someone please show me the light. :)
>
> <fieldType name="text_std" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LengthFilterFactory" min="1" max="254" />
>   </analyzer>
> </fieldType>

So with max="254", you don't get the error?

Looking at the code for LengthFilter, I can't see any way for it to behave differently with a max of 254 vs. a max of 255 or higher. All of the interfaces and classes involved use "int" for length, which means it should work perfectly with numbers above 254.

Thanks,
Shawn
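
To illustrate the point about "int" lengths, here is a minimal sketch of the kind of range check a length filter performs. It is not the actual Lucene source: the class name LengthCheckSketch and its accept() and repeat() methods are hypothetical stand-ins, and it assumes the length is counted in chars. With an int comparison, nothing changes at 254 vs. 255.

```java
// Hypothetical stand-in for a length filter's check; NOT the Lucene source.
class LengthCheckSketch {

    // Keep a token only if its length (assumed: in chars) is within [min, max].
    static boolean accept(CharSequence term, int min, int max) {
        int len = term.length();
        return len >= min && len <= max;
    }

    // Helper that builds a token consisting of 'count' copies of 'c'.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A token far larger than any of the max values tried below.
        String huge = repeat('x', 40000);
        int[] maxValues = {254, 255, 1000};
        for (int max : maxValues) {
            System.out.println("max=" + max + " keeps huge token: "
                    + accept(huge, 1, max));
        }
    }
}
```

In this sketch the 40000-char token is rejected for every max value tried, which is the behavior I would expect from the real filter as well.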