I am trying to use shingles and a position filter so that a query for "foot print", for example, matches either "foot print" or "footprint". From the docs: using the PositionFilter <http://wiki.apache.org/solr/PositionFilter> in combination makes it possible to make all shingles synonyms of each other.

I've configured my analyzer like this:
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="2"
          outputUnigrams="true" outputUnigramsIfNoShingles="false" tokenSeparator=""/>
  <filter class="solr.PositionFilterFactory"/>
</analyzer>

User query: "foot print"

Without PositionFilterFactory, parsed query: +(((title:foot) (title:print))~2) (title:"(foot footprint) print")

With PositionFilterFactory, parsed query: +(((title:foot) (title:print))~2) ()

Why is the "footprint" shingle omitted when I add PositionFilterFactory to the mix?

Output of analysis:

WT (WhitespaceTokenizer)
text   raw_bytes         start  end  position  type
foot   [66 6f 6f 74]     0      4    1         word
print  [70 72 69 6e 74]  5      10   2         word

SF (ShingleFilter)
text       raw_bytes                     start  end  positionLength  type     position
foot       [66 6f 6f 74]                 0      4    1               word     1
footprint  [66 6f 6f 74 70 72 69 6e 74]  0      10   2               shingle  1
print      [70 72 69 6e 74]              5      10   1               word     2
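
For reference, here is a minimal Python sketch of what I expect the filter chain to do to token positions. It is only a toy model under my own assumptions, not Solr's implementation: the function names (shingle, position_filter, positions) are made up for illustration, and it assumes that the shingle filter stacks each shingle on the position of its first unigram and that the position filter sets the position increment of every token after the first to 0.

```python
def shingle(tokens, size=2, sep=""):
    """Toy model of a shingle filter with outputUnigrams=true.

    Returns (text, position_increment) pairs. Each unigram advances the
    position by 1; the shingle that starts at that unigram is stacked on
    the same position (increment 0).
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append((tok, 1))
        if i + size <= len(tokens):
            out.append((sep.join(tokens[i:i + size]), 0))
    return out


def position_filter(stream, increment=0):
    """Toy model of a position filter: every token after the first gets `increment`."""
    return [
        (text, inc if i == 0 else increment)
        for i, (text, inc) in enumerate(stream)
    ]


def positions(stream):
    """Convert position increments into absolute positions."""
    pos, out = 0, []
    for text, inc in stream:
        pos += inc
        out.append((text, pos))
    return out


shingled = shingle("foot print".split())
print(positions(shingled))                   # shingle filter only
print(positions(position_filter(shingled)))  # shingle filter + position filter
```

In this model the first line reproduces the positions shown in the SF output above (foot and footprint at position 1, print at position 2), and the second line puts all three tokens at position 1, which is what I understood "synonyms of each other" to mean.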



Thanks,
Carrie Coy






