Read the Lucene analysis package summary section entitled "Field Section
Boundaries":
http://lucene.apache.org/core/4_10_0/core/org/apache/lucene/analysis/package-summary.html
TL;DR - if you leave it at the default (0), then the last word of one
section and the first word of the next section sit at adjacent positions, so
together they would be an exact phrase match. You might ask why Lucene chose
that default - I don't know, but Solr "best practice" is the opposite. I
suspect that Solr chose a large number like 100 so that a phrase query could
use a significant slop like 10 and still not match across sections.
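To make the effect concrete, here is a minimal Python sketch. It is a
simplified model, not Lucene's actual code: the helper names
(assign_positions, phrase_matches) are made up for illustration, tokenization
is plain whitespace splitting, every token advances the position by 1, and
the slop check only handles two terms in order.

```python
def assign_positions(values, gap):
    """Return {term: [positions]} for one multi-valued field.

    Simplified model (an assumption, not Lucene's code): whitespace
    tokens, each token advances the position by 1, and `gap` extra
    positions are added between consecutive values.
    """
    positions = {}
    next_pos = 0
    for i, value in enumerate(values):
        if i > 0:
            next_pos += gap  # the position increment gap between values
        for token in value.lower().split():
            positions.setdefault(token, []).append(next_pos)
            next_pos += 1
    return positions


def phrase_matches(positions, first, second, slop=0):
    """True if `second` follows `first` with at most `slop` positions between."""
    return any(
        0 <= p2 - p1 - 1 <= slop
        for p1 in positions.get(first, [])
        for p2 in positions.get(second, [])
    )


# Two values ("sections") of the same multi-valued field.
values = ["quick brown fox", "lazy dog"]

no_gap = assign_positions(values, gap=0)
big_gap = assign_positions(values, gap=100)

# gap=0: "fox" (end of value 1) and "lazy" (start of value 2) are adjacent.
print(phrase_matches(no_gap, "fox", "lazy"))            # True
# gap=100: even a slop of 10 cannot bridge the two values.
print(phrase_matches(big_gap, "fox", "lazy", slop=10))  # False
# Phrases inside a single value are unaffected by the gap.
print(phrase_matches(big_gap, "brown", "fox"))          # True
```

In this model a slop of 100 or more would still bridge the gap, so the gap
only needs to exceed the largest slop you expect to use in phrase queries.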
In my e-book I have a section entitled "Position Increment Gap" in Chapter 2
"Analyzers Overview" that details the reasoning as well. There is also
another section with the same title in the Term Vector Component chapter
that runs through an example in more detail.
See:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
-- Jack Krupansky
-----Original Message-----
From: Alexandre Rafalovitch
Sent: Sunday, October 12, 2014 7:40 PM
To: solr-user
Subject: What happens if you don't set positionIncrementGap
Hello,
I am working on yet another minimal schema, which involves working out
which settings match the defaults (or do no harm if the defaults are
used). The one I am trying to figure out now is positionIncrementGap.
We set it to 100 in all text field definitions. Does that mean the
default is NOT some reasonable number?
I tried to trace it and all I can find is a default value in
SolrAnalyzer, which is 0.
But if it is 0 (zero), then why do we explicitly define it to be 0 in all
non-text fields? That seems redundant and, frankly, confusing.
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853