>In order to do this, you can't use the ShingleFilter during indexing  
>since a document like "one two three" and a query like "one two four"  
>will match since they have the shingle "one two" in common.
Hello Svein, nice to meet you in this place =)
I have tried with and without separate <analyzer type="index">
and <analyzer type="query"> sections.
I have also tried with and without outputUnigrams="true" for
both analyzer type=index and analyzer type=query.
And I have tried with and without outputUnigramIfNoNgram="true"
for analyzer type=index (only).
I am pretty sure I have tried every possible combination of
switching these on and off.
None of them has given exactly the expected result.

>You will get what you want, I think, if you don't tokenize during  
>indexing (some normalization will be required if your lists aren't  
>normalized to begin with) and apply the ShingleFilter only to the  
>queries.
I also think that this sounds like the most logical configuration,
but such a configuration doesn't give us the expected results.
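
For reference, a configuration along these lines might look roughly like
the sketch below. The fieldType name and maxShingleSize are assumptions
for illustration only, not values from the schema discussed here.

```xml
<!-- Sketch only: the fieldType name and maxShingleSize are assumptions,
     not taken from the schema discussed in this thread. -->
<fieldType name="text_shingle_query" class="solr.TextField">
  <!-- Index side: keep each list entry as one token, only lowercased -->
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Query side: split on whitespace, then build word n-grams -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory"
            maxShingleSize="3" outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

In principle, a query like "one two four" would then produce the tokens
"one", "one two", "one two four", "two", "two four" and "four", none of
which equals an indexed entry "one two three", while an indexed entry
"one two" would match.
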
(Un?=)fortunately I am leaving on a two-week vacation in one hour.
I would have loved to follow up on this in the coming days,
but Mick Semb Wever will be taking over this job for the next two weeks.

- Glenn-Erik Sandbakken
