Steve,

Yes, I want only "one", "one two", and "one two three", and nothing else. If this can be achieved without Java code, even better; I'll check that filter.
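
To make the desired output concrete, here is a minimal sketch in plain Java. It does not use any Lucene API: it splits on whitespace instead of running an analyzer, and the class name EdgeShingles and method edgeShingles are made up for illustration only. It just shows which shingles should survive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates "edge" shingles only.
final class EdgeShingles {

    /**
     * Returns the shingles that start at the first token of the text,
     * from size 1 up to maxSize tokens (or fewer if the text is shorter).
     * Tokenization here is a plain whitespace split, which is an assumption;
     * a real analysis chain would use a Tokenizer.
     */
    static List<String> edgeShingles(String text, int maxSize) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty() || maxSize <= 0) {
            return result;
        }
        String[] tokens = trimmed.split("\\s+");
        int limit = Math.min(maxSize, tokens.length);
        StringBuilder shingle = new StringBuilder();
        for (int i = 0; i < limit; i++) {
            if (i > 0) {
                shingle.append(' ');
            }
            shingle.append(tokens[i]);
            // Every prefix of the token sequence is one edge shingle.
            result.add(shingle.toString());
        }
        return result;
    }
}
```

Calling EdgeShingles.edgeShingles("one two three four", 3) returns "one", "one two", and "one two three"; shingles such as "two three" never appear because they do not start at the edge.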
I need this to build a field used for suggestions; the user specifically wants matches only from the edge.

Thanks!

On Sat, Mar 16, 2013 at 10:22 PM, Steve Rowe <sar...@gmail.com> wrote:

> Hi xavier,
>
> It's not clear to me what you want. Is the "edge" you're referring to the
> beginning of a field? E.g. raw text "one two three four" with
> EdgeShingleFilter configured to produce unigrams, bigrams and trigrams would
> produce "one", "one two", and "one two three", but nothing else?
>
> If so, I suspect writing a LimitTokenPositionFilter (which would stop
> emitting tokens after the token position exceeds a specified limit) would
> be better, rather than subclassing ShingleFilter. You could use
> LimitTokenCountFilter as a model, especially its "consumeAllTokens" option.
> I think this would make a nice addition to Lucene.
>
> Also, what do you plan to use this for?
>
> Steve
>
> On Mar 16, 2013, at 5:02 PM, xavier jmlucjav <jmluc...@gmail.com> wrote:
>
> > Hi,
> >
> > I need to use shingles but only keep the ones that start from the edge.
> >
> > I want to confirm there is no way to get this feature without subclassing
> > ShingleFilter, cause I thought someone would have already encountered this
> > use case....
> >
> > thanks
> > xavier