An issue exists for this problem: https://issues.apache.org/jira/browse/LUCENE-3475
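
To make the behaviour described in the quoted message concrete, here is a small Python sketch. It is not Lucene code, and the token list and the function names (`naive_shingles`, `group_by_position`, `position_aware_shingles`) are hypothetical. It assumes the word delimiter filter emits "womens" with a position increment of 0, i.e. at the same position as "women's", and it treats any increment greater than 0 as a step to the next position.

```python
from itertools import product

# Each token is (term, position_increment). An increment of 0 means the token
# sits at the same position as the previous one (assumed here for "womens").
tokens = [("women's", 1), ("womens", 0), ("shoes", 1)]


def naive_shingles(tokens, size=2):
    """Shingle over the raw token order, ignoring position increments."""
    terms = [term for term, _ in tokens]
    return [" ".join(terms[i:i + size]) for i in range(len(terms) - size + 1)]


def group_by_position(tokens):
    """Collect terms that share a position into one list per position."""
    positions = []
    for term, increment in tokens:
        if increment == 0 and positions:
            positions[-1].append(term)
        else:
            positions.append([term])
    return positions


def position_aware_shingles(tokens, size=2):
    """Build shingles only from terms at consecutive, distinct positions."""
    positions = group_by_position(tokens)
    shingles = []
    for i in range(len(positions) - size + 1):
        # One shingle per combination of alternatives at each position.
        for combo in product(*positions[i:i + size]):
            shingles.append(" ".join(combo))
    return shingles


print(naive_shingles(tokens))           # includes "women's womens"
print(position_aware_shingles(tokens))  # never pairs same-position tokens
```

The naive version pairs the two same-position variants with each other, which is the unwanted "women's womens" shingle. The position-aware version only combines terms from adjacent positions.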

On May 3, 2013, at 11:00 AM, Walter Underwood <wun...@wunderwood.org> wrote:

> The shingle filter should respect positions. If it doesn't, that is worth
> filing a bug so we know about it.
>
> wunder
>
> On May 3, 2013, at 10:50 AM, Jack Krupansky wrote:
>
>> In short, no. I don't think you want to use the shingle filter on a token
>> stream that has multiple tokens at the same position; otherwise, you will
>> get confused "suggestions", as you've encountered.
>>
>> -- Jack Krupansky
>>
>> -----Original Message-----
>> From: Rounak Jain
>> Sent: Friday, May 03, 2013 7:34 AM
>> To: solr-user@lucene.apache.org
>> Subject: Configure Shingle Filter to ignore ngrams made of tokens with same
>> start and end
>>
>> Hello,
>>
>> I was using Shingle Filter with Suggester to implement an autosuggest
>> dropdown. The field I'm using with shingle filter has a worddelimiter with
>> preserveoriginal=1 to tokenize "women's" as "women's" and "womens."
>>
>> Because of this, when shingle filter is generating word ngrams, apart from
>> the expected tokens, there's also a "women's womens" token. I wanted to
>> know if there's any way to configure ShingleFilter so that it ignores
>> tokens with same start and end values.
>>
>> Thanks,
>> Rounak