To answer my own question (and this sucks :) ): the minShingleSize
attribute isn't honored in at least 1.4.2, so the filter never sees it.
I'm guessing a later version supports it, though?
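
For anyone hitting the same thing, here is a rough sketch of the difference
between what I expected and what I'm seeing. It is plain Python, not Solr
code: the shingles() helper and the sample tokens are made up for
illustration, and the "minimum falls back to 2" behavior is my assumption
about what happens when minShingleSize is ignored, not something I have
confirmed in the source.

```python
def shingles(tokens, min_size, max_size, sep=" "):
    """Return word n-grams of every size from min_size to max_size."""
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            # Only emit shingles that fit entirely inside the token list.
            if start + size <= len(tokens):
                out.append(sep.join(tokens[start:start + size]))
    return out


# Hypothetical token stream after lowercasing and stop-word removal.
tokens = ["please", "divide", "this", "sentence", "up"]

# What the config asks for: only 4-term shingles.
expected = shingles(tokens, min_size=4, max_size=4)

# Assumed behavior when minShingleSize is ignored and the minimum
# falls back to 2: shingles of 2, 3 and 4 terms all appear.
observed = shingles(tokens, min_size=2, max_size=4)

print(expected)
print(observed)
```

With the minimum at 4 only the two 4-term shingles come out; with the
assumed fallback of 2, the 2-term and 3-term shingles show up as well,
which matches what the terms component is showing me.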

On Tue, Sep 14, 2010 at 5:49 PM, Jason Rutherglen
<jason.rutherg...@gmail.com> wrote:
> <fieldType name="text_shingle4" class="solr.TextField"
> positionIncrementGap="100">
> <analyzer>
> <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true" 
> words="stopwords.txt"/>
> <filter class="solr.ShingleFilterFactory" minShingleSize="4"
> maxShingleSize="4" outputUnigrams="false"/>
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> </analyzer>
> </fieldType>
>
>
> I'm using this for a field, indexing, then looking at the terms
> component. I'm seeing shingles that consist of only 2 terms, whereas
> I'm expecting every shingle to have at least 4 terms... What's up?
> Thanks.
>
