Hi Shubham,

In other words, *you specify a large positionIncrementGap to make sure that
your queries don't match across multiple values of a field*.

For example, for a query like title:"paper plate making machine", you don't
want it to match a doc that has two separate values for title: "paper plate"
and "making machine". A positionIncrementGap of 100 will make sure that "plate"
and "making" are roughly 100 positions apart. To see why, look at the positions
of the query terms (in the format token -> position) and remember that term
positions matter when Lucene matches a phrase query:


   - paper -> 1
   - plate -> 2
   - making -> 3
   - machine -> 4

Now with positionIncrementGap of 0, the doc will have these positions for
title:

   - paper -> 1
   - plate -> 2
   - making -> 3 (0+3)
   - machine -> 4 (0+4)

which will match the query. But if we have a positionIncrementGap of 100,
the doc will have these positions for title:

   - paper -> 1
   - plate -> 2
   - making -> 103 (100+3)
   - machine -> 104 (100+4)

which will not match the *exact* phrase query, because "plate" and "making"
are no longer at adjacent positions.
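
To make the arithmetic above concrete, here is a minimal Python sketch. It is
not Lucene code: assign_positions and matches_exact_phrase are hypothetical
helper names, tokens are simply split on whitespace, and the numbering follows
my simplified calculation (count tokens from 1 and add the gap at each value
boundary), so the exact positions Lucene stores may differ.

```python
def assign_positions(values, gap):
    """Return (token, position) pairs for one multi-valued field.

    Assumption: positions keep counting across values, and `gap` is
    added once at every boundary between two values.
    """
    positions = []
    position = 0
    for index, value in enumerate(values):
        if index > 0:
            position += gap  # the positionIncrementGap between values
        for token in value.split():
            position += 1
            positions.append((token, position))
    return positions


def matches_exact_phrase(positions, phrase):
    """True if the phrase terms occur at consecutive positions (no slop)."""
    terms = phrase.split()
    if not terms:
        return False
    by_term = {}
    for token, pos in positions:
        by_term.setdefault(token, set()).add(pos)
    # Try every position of the first term as a possible phrase start.
    return any(
        all(start + offset in by_term.get(term, set())
            for offset, term in enumerate(terms))
        for start in by_term.get(terms[0], set())
    )


title_values = ["paper plate", "making machine"]
query = "paper plate making machine"

for gap in (0, 100):
    positions = assign_positions(title_values, gap)
    print(gap, positions, matches_exact_phrase(positions, query))
```

With a gap of 0 the four tokens sit at consecutive positions, so the phrase
matches across the two values; with a gap of 100 it does not, while a phrase
that stays inside one value (such as "making machine") still matches.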

Hope this helps. My position calculation is a bit different from the one by
@erickerick...@gmail.com <erickerick...@gmail.com>, as I tried to replicate
the maths I could understand from the source code. Please feel free to
correct me if it is wrong. Anyway, the idea remains the same. :)

On Fri, 18 Oct 2019 at 18:36, Erick Erickson <erickerick...@gmail.com>
wrote:

> I really don’t understand the question. The field has to be multiValued,
> but there’s no other restriction. It’s all about whether a document you
> input has the same field name specified more than once, i.e. is
> multiValued. That’s why the example I gave has <field name="blah"…. twice.
>
> Imagine you’re indexing a document. The client side breaks up the doc on
> sentence boundaries and enters them as multiple mentions of the same field,
> i.e.
> <doc>
>   <field name="content">sentence one</field>
>   <field name="content">sentence two</field>
>   <field name="content">sentence three</field>
>   <field name="content">sentence four</field>
>   <field name="content">sentence five</field>
> </doc>
>
> I think you’re missing the implication that the incoming document
> _already_ has the multiple fields put there by the time it gets to Solr.
>
> Best,
> Erick
>
>
> > On Oct 18, 2019, at 2:28 AM, Shubham Goswami <shubham.gosw...@hotwax.co> wrote:
> >
> > Hi Erick
> >
> > Thanks for the reply; your example is very helpful.
> > But I think we can only use this attribute if we are getting data from a
> > single field which has a copy of all the data from every field.
> > Please correct me if I am wrong.
> > Thanks for your great support.
> >
> > Shubham
> >
> > On Thu, Oct 17, 2019 at 5:56 PM Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> First, it only counts if you add multiple entries for the field. Consider
> >> the following:
> >> <doc>
> >>   <field name="blah">a b c</field>
> >>   <field name="blah">def</field>
> >> </doc>
> >>
> >> where the field has a positionIncrementGap of 100. The term positions of
> >> the entries are
> >> a:1
> >> b:2
> >> c:3
> >> d:103
> >> e:104
> >> f:105
> >>
> >> Now consider the doc where there’s only one field:
> >> <doc>
> >>   <field name="blah">a b c d e f</field>
> >> </doc>
> >>
> >> The term positions are
> >> a:1
> >> b:2
> >> c:3
> >> d:4
> >> e:5
> >> f:6
> >>
> >> The use-case is if you, say, index individual sentences and want to match
> >> two or more words in the _same_ sentence. You can specify a phrase query
> >> where the slop is < the positionIncrementGap. So in the first case, if I
> >> search for "a b"~99 I’d get a match. But if I searched for "a d"~99 I
> >> wouldn’t.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Oct 17, 2019, at 2:09 AM, Shubham Goswami <shubham.gosw...@hotwax.co> wrote:
> >>>
> >>> Hi Community
> >>>
> >>> I am a beginner in Solr and I am trying to understand the working of
> >>> positionIncrementGap, but I am still not clear how exactly it works for
> >>> phrase queries and general queries.
> >>> Can somebody please help me understand this with the help of an
> >>> example?
> >>> Any help will be appreciated. Thanks in advance.
> >>>
> >>
> >>
> >
> > --
> > *Thanks & Regards*
> > Shubham Goswami
> > Enterprise Software Engineer
> > *HotWax Systems*
> > *Enterprise open source experts*
> > cell: +91-7803886288
> > office: 0731-409-3684
> > http://www.hotwaxsystems.com
>
>

-- 
Regards,

*Paras Lehana* [65871]
Software Programmer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*
