I am trying to use shingles and a position filter so that a query for
"foot print", for example, matches either "foot print" or "footprint".
From the docs: using the PositionFilter
<http://wiki.apache.org/solr/PositionFilter> in combination makes it
possible to make all shingles synonyms of each other.
I've configured my analyzer like this:
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.ShingleFilterFactory" minShingleSize="2"
          maxShingleSize="2" outputUnigrams="true"
          outputUnigramsIfNoShingles="false" tokenSeparator=""/>
  <filter class="solr.PositionFilterFactory"/>
</analyzer>
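
For reference, this is how I understand the two filters to assign
positions. It is a rough Python sketch, not Lucene's code: the function
names are mine, and it assumes that ShingleFilter emits each shingle with
a position increment of 0 after its first word, and that PositionFilter
keeps the first token's increment and sets every later one to 0.

```python
from typing import List, Tuple

Token = Tuple[str, int]  # (text, position increment)


def shingle(words: List[str], size: int = 2, sep: str = "") -> List[Token]:
    """Emit each unigram (increment 1), then the shingle that starts at it (increment 0)."""
    out: List[Token] = []
    for i, word in enumerate(words):
        out.append((word, 1))
        if i + size <= len(words):
            out.append((sep.join(words[i:i + size]), 0))
    return out


def flatten_positions(tokens: List[Token]) -> List[Token]:
    """Assumed PositionFilter effect: first token keeps its increment, the rest get 0."""
    return [(text, inc if i == 0 else 0) for i, (text, inc) in enumerate(tokens)]


def absolute_positions(tokens: List[Token]) -> List[Tuple[str, int]]:
    """Turn position increments into absolute positions, as the analysis page shows them."""
    pos = 0
    out = []
    for text, inc in tokens:
        pos += inc
        out.append((text, pos))
    return out


shingled = shingle(["foot", "print"])
print(absolute_positions(shingled))
print(absolute_positions(flatten_positions(shingled)))
```

Under those assumptions the shingle step puts "foot" and "footprint" at
position 1 and "print" at position 2, and the position step then moves all
three tokens to position 1, which is what I would expect "synonyms of each
other" to mean.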

User query: "foot print"

Without PositionFilterFactory, parsed query:
+(((title:foot) (title:print))~2) (title:"(foot footprint) print")

With PositionFilterFactory, parsed query:
+(((title:foot) (title:print))~2) ()

Why, when I add PositionFilterFactory into the mix, is the "footprint"
shingle omitted?

Output of analysis:

WT (WhitespaceTokenizer)
text   raw_bytes         start  end  position  type
foot   [66 6f 6f 74]     0      4    1         word
print  [70 72 69 6e 74]  5      10   2         word

SF (ShingleFilter)
text       raw_bytes                     start  end  positionLength  type     position
foot       [66 6f 6f 74]                 0      4    1               word     1
footprint  [66 6f 6f 74 70 72 69 6e 74]  0      10   2               shingle  1
print      [70 72 69 6e 74]              5      10   1               word     2

Thanks,
Carrie Coy