The shingle filter should respect positions. If it doesn't, that is worth 
filing a bug so we know about it.

wunder

On May 3, 2013, at 10:50 AM, Jack Krupansky wrote:

> In short, no. I don't think you want to use the shingle filter on a token 
> stream that has multiple tokens at the same position; otherwise, you will get 
> confused "suggestions", as you've encountered.
> 
> -- Jack Krupansky
> 
> -----Original Message----- From: Rounak Jain
> Sent: Friday, May 03, 2013 7:34 AM
> To: solr-user@lucene.apache.org
> Subject: Configure Shingle Filter to ignore ngrams made of tokens with same 
> start and end
> 
> Hello,
> 
> I was using Shingle Filter with Suggester to implement an autosuggest
> dropdown. The field I'm using with the shingle filter has a WordDelimiterFilter
> with preserveOriginal=1 to tokenize "women's" as "women's" and "womens".
> 
> Because of this, when the shingle filter is generating word ngrams, apart from
> the expected tokens, there's also a "women's womens" token. I wanted to
> know if there's any way to configure ShingleFilter so that it ignores
> ngrams made of tokens with the same start and end values.
> 
> Thanks,
> Rounak 
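
To make the behavior discussed above concrete, here is a small Python sketch that
models a token stream as (term, position increment) pairs, where an increment of 0
means "same position as the previous token". It is only an illustration of the
idea, not Lucene's ShingleFilter code; the function names and the sample word
"shoes" are made up for the example, and position gaps larger than 1 are ignored.

```python
from itertools import product


def group_by_position(tokens):
    """Group (term, position_increment) pairs into one list of terms per position."""
    positions = []
    for term, pos_inc in tokens:
        if pos_inc > 0 or not positions:
            positions.append([term])      # token starts a new position
        else:
            positions[-1].append(term)    # increment 0: stacked on the previous token
    return positions


def naive_bigrams(tokens):
    """Pair adjacent tokens in stream order, ignoring positions entirely."""
    terms = [term for term, _ in tokens]
    return [f"{a} {b}" for a, b in zip(terms, terms[1:])]


def position_aware_bigrams(tokens):
    """Pair only tokens that sit at consecutive positions."""
    positions = group_by_position(tokens)
    shingles = []
    for left, right in zip(positions, positions[1:]):
        for a, b in product(left, right):
            shingles.append(f"{a} {b}")
    return shingles


# Hypothetical stream for the text "women's shoes" after a word delimiter
# step that keeps the original token and stacks "womens" on top of it.
tokens = [("women's", 1), ("womens", 0), ("shoes", 1)]

print(naive_bigrams(tokens))
print(position_aware_bigrams(tokens))
```

The naive version pairs the two stacked variants with each other, which is the
unwanted "women's womens" shingle; the position-aware version only pairs each
variant with the token at the next position.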