The shingle filter should respect positions. If it doesn't, that is worth filing a bug so we know about it.
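To make "respect positions" concrete, here is a rough Python sketch of the behavior I would expect. It is not Lucene's ShingleFilter code. The function names and the (term, position increment) tuple layout are made up for illustration, and it treats any increment greater than 1 as 1, so position gaps are ignored. Tokens with a position increment of 0 are stacked on the previous token, and a shingle takes at most one token from each position.

```python
from itertools import product


def group_by_position(tokens):
    """Group (term, position_increment) pairs into one list of terms per position.

    Hypothetical helper: an increment of 0 stacks the term on the previous
    position; any other increment starts a new position (gaps are ignored).
    """
    positions = []
    for term, increment in tokens:
        if increment == 0 and positions:
            positions[-1].append(term)
        else:
            positions.append([term])
    return positions


def shingles(tokens, size=2):
    """Build word n-grams that use exactly one term from each consecutive position."""
    positions = group_by_position(tokens)
    result = []
    for start in range(len(positions) - size + 1):
        window = positions[start:start + size]
        # product() picks one term per position, so two terms stacked at the
        # same position are never joined into a single shingle.
        for combo in product(*window):
            result.append(" ".join(combo))
    return result


# "womens" is stacked on "women's" (increment 0), as with preserveoriginal=1.
example = [("women's", 1), ("womens", 0), ("shoes", 1)]
print(shingles(example))
```

With that input the sketch yields "women's shoes" and "womens shoes", and never "women's womens".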
wunder

On May 3, 2013, at 10:50 AM, Jack Krupansky wrote:

> In short, no. I don't think you want to use the shingle filter on a token
> stream that has multiple tokens at the same position, otherwise, you will get
> confused "suggestions", as you've encountered.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Rounak Jain
> Sent: Friday, May 03, 2013 7:34 AM
> To: solr-user@lucene.apache.org
> Subject: Configure Shingle Filter to ignore ngrams made of tokens with same
> start and end
>
> Hello,
>
> I was using Shingle Filter with Suggester to implement an autosuggest
> dropdown. The field I'm using with shingle filter has a worddelimiter with
> preserveoriginal=1 to tokenize "women's" as "women's" and "womens."
>
> Because of this, when shingle filter is generating word ngrams, apart from
> the expected tokens, there's also a "women's womens" token. I wanted to
> know if there's any way to configure ShingleFilter so that it ignores
> tokens with same start and end values.
>
> Thanks,
> Rounak