<fieldType name="text_shingle4" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="4"
            maxShingleSize="4" outputUnigrams="false"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>


I'm using the field type above for a field, indexing, then looking at the
output of the terms component. I'm seeing shingles that consist of only
2 terms, whereas I'm expecting every shingle to have at least 4 terms...
What's up?  Thanks.
