Thanks for your response Cario.

On Wed, Aug 12, 2015 at 10:20 PM, Cario, Elaine <elaine.ca...@wolterskluwer.com> wrote:

> Modassar,
>
> There are additional settings in WDFF that you can experiment with (google
> around for the javadocs for the filter). Specific to your question, there
> is a splitOnNumerics param, which might be defaulting to true ("1") causing
> terms like "3d" to get tokenized as "3" and "d". If you set it to 0 it may
> correct the behavior you're seeing. (You'll need to re-index your content
> to see the effect).
>
> Also, the standard practice that I've seen is that settings which create
> additional tokens are usually only applied at index time, and not applied
> during query time analysis (on the theory that you've indexed all the
> different ways the user can search for a term, so there's no need to
> actually modify the query to get a match).
>
> -----Original Message-----
> From: Modassar Ather [mailto:modather1...@gmail.com]
> Sent: Friday, August 07, 2015 12:21 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Clarification on WordDelimiterFilter.
>
> Hi,
>
> Any suggestion will be really helpful. Kindly provide your inputs.
>
> Thanks,
> Modassar
>
> On Thu, Aug 6, 2015 at 2:06 PM, Modassar Ather <modather1...@gmail.com>
> wrote:
>
> > I am using WordDelimiterFilter while indexing and searching both with
> > the following attributes. Parser used is edismax. Solr version is 5.2.1.
> >
> > *<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> > generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> > catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>*
> >
> > During search some of the results returned are not wanted. Following
> > is the example.
> >
> > Search query: "3d image"
> > Search results with 3-d image/3 d image/1d image are also returned. As
> > per analysis page this is happening because of position increment in
> > the token as explained below.
> >
> > On the analysis page it shows following four tokens for 3d and their
> > positions.
> >
> > token    position
> > 3d       1
> > 3        1
> > 3d       1
> > d        2
> >
> > image    3
> >
> > Another example is "1d obj*" returning results containing "d-object"
> > related result. This can bring a completely different search item.
> >
> > Here the token d is at position 2 which is causing the above matches.
> > Please help me understand why this position increment is done?
> > The position increment will also cause the "3d image" search fail on a
> > document containing "3d image" as the "d" comes at position 2.
> >
> > Kindly help me understand the best practices of using
> > WordDelimiterFilter or provide your inputs how we can resolve the issue.
> >
> > Thanks,
> > Modassar
> >
>
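
For reference, here is a minimal sketch of how the two suggestions quoted above might look in schema.xml. The field type name text_wdf and the whitespace tokenizer are assumptions for illustration only; neither appears in this thread. The sketch sets splitOnNumerics="0" on both analyzers and keeps the token-generating options (catenate*, preserveOriginal) at index time only. It has not been tested against the 5.2.1 setup described above, and the content would need to be re-indexed after the change.

```xml
<!-- Hypothetical field type; the name and tokenizer are assumptions. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">

  <!-- Index time: generate the extra variants, but do not split at
       letter/digit boundaries, so "3d" should stay a single token. -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="1"
            splitOnCaseChange="1" splitOnNumerics="0"
            preserveOriginal="1"/>
  </analyzer>

  <!-- Query time: same splitting rules, without the options that
       create additional tokens. -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="1" splitOnNumerics="0"
            preserveOriginal="0"/>
  </analyzer>
</fieldType>
```

Checking "3d image" and "3-d image" again on the analysis page for both the index and query analyzers should show whether the stray "d" token at position 2 is gone.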