On 27 Aug 2008, at 19:44, Glenn-Erik Sandbakken wrote:
At sesam.no we want to replace a FAST (fast.no) Query Matching Server
with a Solr index.
The index we are trying to replace is not a regular index, but is
specially configured to perform phrase (and sub-phrase) matches
against several large lists (like an index with only a 'title' field).
I'm not sure of a correct, or logical, name for the behavior we are
after, but it is like a combination of shingles and exact matching.
Some examples should explain it well.
To do this, you can't use the ShingleFilter during indexing: a
document like "one two three" and a query like "one two four" would
match, because they have the shingle "one two" in common.
You will get what you want, I think, if you don't tokenize during
indexing (some normalization will be required if your lists aren't
normalized to begin with) and apply the ShingleFilter only to the
queries.
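
To illustrate the idea outside of Solr, here is a rough sketch in plain
Python. It is not Solr configuration and not the ShingleFilter itself:
the function names (normalize, shingles, build_index, match), the
lowercase/whitespace normalization, and the max_size limit are all
assumptions made up for this example. Each list entry is stored as one
untokenized term, and only the query is expanded into shingles.

```python
def normalize(text):
    """Lowercase and collapse whitespace (a stand-in for real normalization)."""
    return " ".join(text.lower().split())


def shingles(tokens, max_size):
    """Return every contiguous run of 1..max_size words, joined by spaces."""
    out = []
    for n in range(1, max_size + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def build_index(entries):
    """Index side: keep each entry as a single, untokenized term."""
    return {normalize(entry) for entry in entries}


def match(index, query, max_size=3):
    """Query side: shingle the query and keep shingles that equal an entry."""
    tokens = normalize(query).split()
    return [s for s in shingles(tokens, max_size) if s in index]


# Hypothetical list entries, for illustration only.
index = build_index(["One Two Three", "two four"])

# "one two three" is NOT matched, even though it shares "one two" with the query.
print(match(index, "one two four"))           # ['two four']
print(match(index, "say one two three now"))  # ['one two three']
```

Because the entry "one two three" exists in the index only as a whole
term, the query shingle "one two" has nothing to hit, which is the
difference from shingling on both sides.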
Svein