Re: Position increment in WordDelimiterFilter.

Binoy Dalal Thu, 14 Jan 2016 05:19:38 -0800

> Irrespective of that, what I want to understand is why there is an
> increment in position. Should not all the terms be at the same position,
> as they are yielded from the same term/token?


No, they should not.
The positions are incremented because these splits are typically used in
phrase queries, which Solr might autogenerate or which you might have enabled.
For a phrase query to work in a case such as 3d, Solr needs to know that 3
comes before d and not the other way around.
If all the positions were the same, Solr would not be able to tell that
"3 d" could be a phrase, and hence would not be able to query it as such.
I hope that makes the reasoning clear.
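
To make the point concrete, here is a rough sketch in plain Python. It is not
Solr or Lucene code, only a simplified model of exact (slop 0) phrase matching
over token positions. The helper names build_index and phrase_matches are made
up for illustration; the first token list uses the positions I got from the
analysis page (quoted below), and the second is a hypothetical case where
every part sits at position 1.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, int]  # (term, position)


def build_index(tokens: List[Token]) -> Dict[str, Set[int]]:
    """Map each term to the set of positions at which it occurs."""
    index: Dict[str, Set[int]] = {}
    for term, pos in tokens:
        index.setdefault(term, set()).add(pos)
    return index


def phrase_matches(index: Dict[str, Set[int]], phrase: List[str]) -> bool:
    """Return True if the phrase terms occur at consecutive positions, in order."""
    if not phrase:
        return False
    starts = index.get(phrase[0], set())
    return any(
        all(start + i in index.get(term, set()) for i, term in enumerate(phrase))
        for start in starts
    )


# Positions reported by the analysis page for "3d" (d is one position after 3).
with_increment = build_index([("3d", 1), ("3", 1), ("d", 2), ("3d", 2)])

# Hypothetical alternative: all parts stacked at position 1.
same_position = build_index([("3d", 1), ("3", 1), ("d", 1)])

print(phrase_matches(with_increment, ["3", "d"]))  # True: 3 at 1, d at 2
print(phrase_matches(with_increment, ["d", "3"]))  # False: wrong order
print(phrase_matches(same_position, ["3", "d"]))   # False: no ordering information
```

In this model the phrase "3 d" only matches when d is recorded one position
after 3, which is the behaviour the position increment gives you.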

On Thu, 14 Jan 2016, 18:12 Modassar Ather <modather1...@gmail.com> wrote:

> Thanks for your responses.
>
> Why do you think it should be at position 1? In that case searching for "3
> d" would not find anything. Is it what you expect?
> During search, some of the results returned are not wanted. The following
> is an example.
> Search query: "3d image"
> Results containing 3-d image, 3 d image, and 1d image are also returned.
> This is happening because of the position increment.
> Another example is "1d obj*", which returns results containing "d-object".
> This can bring back a completely different search item. Here the token d
> matches the d of d-object, as that term is split in the same way.
> The position increment will also cause the "3d image" search to fail on a
> document containing "3d image", as the "d" comes at position 2.
>
> 1) can you confirm if you've made a typo while typing out your results?
> I have confirmed the position attribute displayed on analysis page and I
> found there is no typo.
> 2 ) you'll get the d and 3d as 2 since they're the 2nd token once 3d is
> split.
> Irrespective of that, what I want to understand is why there is an
> increment in position. Should not all the terms be at the same position,
> as they are yielded from the same term/token?
>
> Best,
> Modassar
>
> On Thu, Jan 14, 2016 at 3:25 PM, Binoy Dalal <binoydala...@gmail.com>
> wrote:
>
> > I've tried out your settings and here's what I get:
> > 3d  1
> > 3   1
> > d   2
> > 3d  2
> >
> > 1) can you confirm if you've made a typo while typing out your results?
> > 2 ) you'll get the d and 3d as 2 since they're the 2nd token once 3d is
> > split.
> > Try the same thing with d3 and you'll get 3 and d3 at position 2
> >
> > On Thu, 14 Jan 2016, 15:11 Emir Arnautovic <emir.arnauto...@sematext.com>
> > wrote:
> >
> > > Hi Modassar,
> > > Why do you think it should be at position 1? In that case searching for
> > > "3 d" would not find anything. Is it what you expect?
> > >
> > > Thanks,
> > > Emir
> > >
> > > On 14.01.2016 10:15, Modassar Ather wrote:
> > > > Hi,
> > > >
> > > > I have following definition for WordDelimiterFilter.
> > > >
> > > > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> > > > generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> > > > catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>
> > > >
> > > > The analysis of 3d shows following four tokens and their positions.
> > > >
> > > > token         position
> > > > 3d             1
> > > > 3               1
> > > > 3d             1
> > > > d               2
> > > >
> > > > Please help me understand why d is at 2. Should it not also be at
> > > > position 1?
> > > > Is it a bug and, if not, is there any attribute which I can use to
> > > > restrict the position increment?
> > > >
> > > > Thanks,
> > > > Modassar
> > > >
> > >
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> > > --
> > Regards,
> > Binoy Dalal
> >
>
-- 
Regards,
Binoy Dalal
