> Looks to me like MultiPhraseQuery is getting in the way.  Shingles
> that begin at the same word are given the same position by
> ShingleFilter, and Solr's FieldQParserPlugin creates a
> MultiPhraseQuery when it encounters tokens in a query with the same
> position.  I think what you want is to convert queries into shingle
> disjunctions (*any* matching shingle results in a hit),  right?

Yes, you're right, Steve. Thank you.

One way, I see now, to get the behaviour I want is to set the unigrams'
positionIncrement to zero instead of one.

For example, in ShingleFilter.fillOutputBuffer(..), replacing the two
occurrences of
> .setPositionIncrement(1);
with
> .setPositionIncrement(0);

Then I end up with a MultiPhraseQuery with
        termArrays[0] = { list_entry_shingles:abcd
                          list_entry_shingles:abcd efgh
                          list_entry_shingles:abcd efgh ijkl 
                          list_entry_shingles:efgh
                          list_entry_shingles:efgh ijkl 
                          list_entry_shingles:ijkl }

and it works perfectly :-)
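
To illustrate why the change has this effect, here is a minimal sketch in
plain Java. It is not the Lucene or Solr code: the class, the Token type and
both methods are hypothetical names made up for this example. It only models
the two behaviours described above, assuming that shingles starting at the
same word follow their unigram with a position increment of 0, and that
tokens sharing a position are collected into one term array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShinglePositionSketch {

    /** Hypothetical stand-in for a token and its position increment. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /**
     * Emits the unigram and all longer shingles for each start word.
     * Unigrams get the given increment; longer shingles always get 0,
     * so they share the position of the unigram they start with.
     */
    static List<Token> shingles(List<String> words, int maxShingleSize,
                                int unigramPositionIncrement) {
        List<Token> out = new ArrayList<>();
        for (int start = 0; start < words.size(); start++) {
            StringBuilder sb = new StringBuilder();
            for (int size = 1;
                 size <= maxShingleSize && start + size <= words.size();
                 size++) {
                if (size > 1) {
                    sb.append(' ');
                }
                sb.append(words.get(start + size - 1));
                int inc = (size == 1) ? unigramPositionIncrement : 0;
                out.add(new Token(sb.toString(), inc));
            }
        }
        return out;
    }

    /**
     * Groups tokens the way the query side is assumed to: a token with a
     * positive increment opens a new term array, a token with increment 0
     * joins the current one.
     */
    static List<List<String>> termArrays(List<Token> tokens) {
        List<List<String>> arrays = new ArrayList<>();
        for (Token t : tokens) {
            if (arrays.isEmpty() || t.positionIncrement > 0) {
                arrays.add(new ArrayList<>());
            }
            arrays.get(arrays.size() - 1).add(t.term);
        }
        return arrays;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("abcd", "efgh", "ijkl");
        // Increment 1: three term arrays, one per start word.
        System.out.println(termArrays(shingles(words, 3, 1)));
        // Increment 0: a single term array holding all six shingles.
        System.out.println(termArrays(shingles(words, 3, 0)));
    }
}
```

With an increment of 1 the sketch yields three term arrays, which a phrase
query would require to match at consecutive positions. With an increment of
0 everything lands in one array, so any single shingle is enough for a hit.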

I see no way of configuring this behaviour, though.
If it is possible and someone can say how, that would be a real godsend.

Otherwise, would a patch to ShingleFilter that offers an option
"unigramPositionIncrement" (defaulting to 1) be likely to be accepted into
trunk?

~mck

-- 
"Between two evils, I always pick the one I never tried before." Mae
West 
| semb.wever.org | sesat.no | sesam.no |
