At sesam.no we want to replace a FAST (fast.no) Query Matching Server with a Solr index.
The index we are trying to replace is not a regular index, but one specially configured to perform phrase (and sub-phrase) matches against several large lists (like an index with only a 'title' field). I'm not sure of a correct, or logical, name for the behavior we are after, but it is like a combination of shingles and exact matching. Some examples should explain it well.

Let's say we have the following list:

> one two three
> one two
> two three
> one
> two
> three
> three two
> two one
> one three
> three one

For the query "one two three", we need hits against, and only against:

> one two three
> one two
> two three
> one
> two
> three

For the query "one two", we need hits against, and only against:

> one two
> one
> two

For the query "one three four" (or "four one three"), we need hits against, and only against:

> one three
> one
> three

For the query "one two sesam three", we need hits against, and only against:

> one two
> one
> two
> three

We have been testing Solr with the ShingleFilter for this, but without luck. I am unsure whether the reason is a misconfiguration in schema.xml or that the ShingleFilter doesn't actually support this type of behavior.

Attached is our current schema.xml (it is different from when I posted to the solr-dev mailing list; the shingle "fieldType" is of class "solr.StrField"). Also attached are screenshots of solr/admin/analysis.jsp run against this configuration.

I'd like to know whether the ShingleFilter is at all able to do what we want. If it is: how can I configure schema.xml? If not: are there any other existing solutions that we can incorporate into Solr which will give us this behavior?

If there is no existing solution to this, we will probably end up writing our own methods for it, extending the ShingleFilter, and gladly contributing them to the Solr project =)

Thanks for a great product,
Glenn-Erik
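
P.S. To pin down the matching rule implied by the four example queries, here is a minimal sketch in plain Python. It is not Solr configuration and says nothing about how the ShingleFilter works; it only restates the expected behavior: a list entry is a hit if, and only if, it is exactly equal to some contiguous run of words in the query. The function names `sub_phrases` and `match` are hypothetical, and the sketch assumes simple whitespace tokenization and case-sensitive comparison.

```python
# Reference model of the desired behavior (illustration only, not Solr code).

ENTRIES = [
    "one two three", "one two", "two three",
    "one", "two", "three",
    "three two", "two one", "one three", "three one",
]


def sub_phrases(query):
    """Return every contiguous word sequence (shingle) of the query,
    from single words up to the whole query."""
    words = query.split()
    return {
        " ".join(words[start:end])
        for start in range(len(words))
        for end in range(start + 1, len(words) + 1)
    }


def match(query, entries):
    """Return the entries that exactly equal some sub-phrase of the query,
    in the order they appear in `entries`."""
    wanted = sub_phrases(query)
    return [entry for entry in entries if entry in wanted]


if __name__ == "__main__":
    print(match("one two sesam three", ENTRIES))
    # prints ['one two', 'one', 'two', 'three']
```

In other words, the query side is expanded into shingles of every length, while each list entry must be matched as a whole, untokenized value.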
schema.xml
Description: XML document