And here's the issue... https://issues.apache.org/jira/browse/SOLR-1740
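
For anyone wondering why 2-term shingles show up, here is a rough sketch in plain Python (not Solr or Lucene code). It assumes the factory in that version silently ignores minShingleSize, so the minimum stays at 2 while maxShingleSize="4" is honored. The `shingles` helper and the sample tokens are made up for illustration.

```python
def shingles(tokens, min_size=2, max_size=2, sep=" "):
    """Return word n-grams (shingles) of every size from min_size to max_size."""
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            # Only emit a shingle if enough tokens remain from this position.
            if start + size <= len(tokens):
                out.append(sep.join(tokens[start:start + size]))
    return out


# Hypothetical token stream after tokenizing, lowercasing and stop-word removal.
tokens = ["please", "divide", "this", "sentence", "now"]

# What the schema asks for: only 4-term shingles.
expected = shingles(tokens, min_size=4, max_size=4)

# What comes out if minShingleSize is ignored and the minimum stays at 2
# (an assumption about the behaviour, not taken from the Solr source).
observed = shingles(tokens, min_size=2, max_size=4)

print(expected)
print(observed)
```

Under that assumption the observed list contains 2-, 3- and 4-term shingles, which matches what the terms component is showing.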

On Tue, Sep 14, 2010 at 6:08 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> To answer my own question, and this sucks :) the minShingleSize isn't
> set in at least 1.4.2. I'm guessing a later version though?
>
> On Tue, Sep 14, 2010 at 5:49 PM, Jason Rutherglen
> <jason.rutherg...@gmail.com> wrote:
>> <fieldType name="text_shingle4" class="solr.TextField"
>> positionIncrementGap="100">
>>   <analyzer>
>>     <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>     words="stopwords.txt"/>
>>     <filter class="solr.ShingleFilterFactory" minShingleSize="4"
>>     maxShingleSize="4" outputUnigrams="false"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>>
>> I'm using for a field, indexing, then looking at the terms component.
>> I'm seeing shingles that consist of only 2 terms, whereas I'm
>> expecting all the terms to be at least 4 terms... What's up? Thanks.