> Are you saying that WiFi Wi-Fi and Wi Fi should not match each other?

I am using WhitespaceTokenizer in my analysis chain, so Wi Fi becomes two separate tokens. Please refer to the examples in my previous mail for the issues I am facing. Wi and Fi are two terms, and they will match. But what happens when content containing *WiFi device* is searched with the phrase query *"WiFi device"*? It does not match, because WordDelimiterFilter adds a position increment for WiFi. "WiFi device"~1 does match, which is confusing: there is no gap in the content, so why is a slop required?
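To make the slop question concrete, here is a toy model in plain Python. It is not Lucene code and does not call any Lucene API. The token positions for WiFi follow the four-token pattern reported in this thread; placing device at position 3, and treating the query phrase as the two adjacent terms WiFi and device, are assumptions made only for illustration.

```python
from itertools import product

# Index-side tokens for the content "WiFi device" as (term, position).
# The WiFi tokens follow the pattern reported in the thread (three tokens at
# position 1, one at position 2). "device" at position 3 is an ASSUMPTION:
# it is taken to follow the last WiFi sub-token.
INDEXED = [
    ("WiFi", 1),    # original token
    ("Wi", 1),      # first word part
    ("WiFi", 1),    # catenated parts
    ("Fi", 2),      # second word part, position incremented
    ("device", 3),  # assumed position of the next word
]


def phrase_matches(tokens, phrase, slop=0):
    """Toy in-order phrase check: True if the phrase terms occur at
    positions whose relative offsets differ by at most `slop`."""
    if not phrase:
        return False
    index = {}
    for term, pos in tokens:
        index.setdefault(term, set()).add(pos)
    # One sorted position list per phrase term; an unknown term yields an
    # empty list, so product() produces no combinations at all.
    candidates = [sorted(index.get(term, ())) for term in phrase]
    for combo in product(*candidates):
        # Offset of each term relative to its place in the phrase.
        offsets = [pos - i for i, pos in enumerate(combo)]
        if max(offsets) - min(offsets) <= slop:
            return True
    return False


print(phrase_matches(INDEXED, ["WiFi", "device"]))          # False
print(phrase_matches(INDEXED, ["WiFi", "device"], slop=1))  # True
print(phrase_matches(INDEXED, ["Fi", "device"]))            # True
```

In this model WiFi sits at position 1 and device at position 3, so an exact phrase sees a gap of one position even though the original text has none, and a slop of 1 is enough to bridge it.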
> Why do you use WordDelimiterFilter? Can you give us few examples where it is useful?

It is useful when content such as *lucene-search documentation* is indexed with WordDelimiterFilter and split into the two terms lucene and search. The document can then be found by queries like lucene documentation or search documentation.

Best,
Modassar

On Fri, Jan 15, 2016 at 2:14 PM, Emir Arnautovic <emir.arnauto...@sematext.com> wrote:

> Modassar,
> Are you saying that WiFi Wi-Fi and Wi Fi should not match each other? Why
> do you use WordDelimiterFilter? Can you give us few examples where it is
> useful?
>
> Thanks,
> Emir
>
> On 15.01.2016 05:13, Modassar Ather wrote:
>
>> Thanks for your responses.
>>
>> It seems to me that you don't want to split on numbers.
>> It is not with number only. Even if you try to analyze WiFi it will create
>> 4 token one of which will be at position 2. So basically the issue is with
>> position increment which causes few of the queries behave unexpectedly.
>>
>> Which release of Solr are you using?
>> I am using Lucene/Solr-5.4.0.
>>
>> Best,
>> Modassar
>>
>> On Thu, Jan 14, 2016 at 9:44 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>
>>> Which release of Solr are you using? Last year (or so) there was a Lucene
>>> change that had the effect of keeping all terms for WDF at the same
>>> position. There was also some discussion about whether this was either a
>>> bug or a bug fix, but I don't recall any resolution.
>>>
>>> -- Jack Krupansky
>>>
>>> On Thu, Jan 14, 2016 at 4:15 AM, Modassar Ather <modather1...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have following definition for WordDelimiterFilter.
>>>>
>>>> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>>>> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>>>> catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>
>>>>
>>>> The analysis of 3d shows following four tokens and their positions.
>>>>
>>>> token    position
>>>> 3d       1
>>>> 3        1
>>>> 3d       1
>>>> d        2
>>>>
>>>> Please help me understand why d is at 2? Should not it also be at position 1.
>>>> Is it a bug and if not is there any attribute which I can use to restrict
>>>> the position increment?
>>>>
>>>> Thanks,
>>>> Modassar
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/