Hello all,

We are having problems with extremely slow phrase queries when the phrase contains common words. We are reluctant to just use stop words, because they cause false hits and make some things impossible to search for. (For example "to be or not to be", "the who", "man in the moon" vs "man on the moon", etc.)
The approach to this problem used by Nutch looks promising. Has anyone ported the Nutch CommonGrams filter to Solr?

"Construct n-grams for frequently occurring terms and phrases while indexing. Optimize phrase queries to use the n-grams. Single terms are still indexed too, with n-grams overlaid."

http://lucene.apache.org/nutch/apidocs-0.8.x/org/apache/nutch/analysis/CommonGrams.html

Tom

Tom Burton-West
Information Retrieval Programmer
Digital Library Production Services
University of Michigan Library
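
P.S. In case the idea is unfamiliar, here is a rough Python sketch of the approach as I read the description quoted above. It is not the Nutch code. The word list, the underscore separator, the choice of bigrams only, and the function names index_tokens and query_tokens are all made up for illustration, and the query side ignores token positions for simplicity.

```python
# Hypothetical common-word list; a real one would come from term statistics.
COMMON_WORDS = {"the", "in", "on", "to", "be", "or", "not", "of"}


def index_tokens(text, common=COMMON_WORDS):
    """Index side: emit every single term, plus a bigram overlaid at the
    first word's position whenever either word of the pair is common."""
    words = text.lower().split()
    out = []
    for pos, word in enumerate(words):
        out.append((pos, word))
        if pos + 1 < len(words):
            nxt = words[pos + 1]
            if word in common or nxt in common:
                out.append((pos, word + "_" + nxt))
    return out


def query_tokens(text, common=COMMON_WORDS):
    """Query side: use the bigram wherever one applies, and keep a single
    term only when no bigram already covers it."""
    words = text.lower().split()
    out = []
    covered = False  # True if the current word ended the previous bigram
    for pos, word in enumerate(words):
        nxt = words[pos + 1] if pos + 1 < len(words) else None
        if nxt is not None and (word in common or nxt in common):
            out.append(word + "_" + nxt)
            covered = True
        else:
            if not covered:
                out.append(word)
            covered = False
    return out


if __name__ == "__main__":
    print(index_tokens("the who"))
    print(query_tokens("man in the moon"))
    print(query_tokens("man on the moon"))
```

The point is that the phrase query then matches on rarer terms such as "man_in" and "the_moon" instead of walking the very long postings lists for "in" and "the", while "man in the moon" and "man on the moon" still produce different queries.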