We're performing fuzzy searches on a field with a large number of unique terms. With a required minimum similarity of 0.7, a query takes 13-15 seconds to execute, compared with our average query time of 40ms.
We suspect the slowdown comes from enumerating every unique term in the index. The Lucene documentation for FuzzyQuery supports this theory with the following warning: "*Warning:* this query is not very scalable with its default prefix length of 0 - in this case, *every* term will be enumerated and cause an edit score calculation." We would therefore like to set the prefix length to one or two, so that the first character or two must match exactly, which should substantially reduce the number of terms enumerated. Is this possible with Solr? If so, we haven't yet found a way to do it. Any help would be greatly appreciated.
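To make the effect we're hoping for concrete, here is a minimal, self-contained sketch of how a required prefix narrows the set of terms that need an edit-distance calculation. It is not Lucene's or Solr's actual implementation: the class `PrefixFuzzySketch`, the method `search`, and the similarity formula (1 minus the edit distance divided by the shorter term's length) are all assumptions made for illustration, and the sorted `TreeSet` merely stands in for an index's term dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Lucene or Solr.
class PrefixFuzzySketch {

    // Holds the matching terms and how many terms had to be scored.
    static class Result {
        final List<String> matches = new ArrayList<>();
        int scored = 0;
    }

    // Classic Levenshtein edit distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed similarity measure: 1 - distance / length of the shorter string.
    static double similarity(String a, String b) {
        int minLen = Math.min(a.length(), b.length());
        if (minLen == 0) return a.equals(b) ? 1.0 : 0.0;
        return 1.0 - (double) editDistance(a, b) / minLen;
    }

    // Scores only the terms that share the first prefixLength characters of the query.
    static Result search(NavigableSet<String> terms, String query,
                         int prefixLength, double minSimilarity) {
        int len = Math.min(Math.max(prefixLength, 0), query.length());
        String prefix = query.substring(0, len);
        // With no prefix, every term is a candidate; otherwise only a sorted sub-range is.
        NavigableSet<String> candidates = len == 0
                ? terms
                : terms.subSet(prefix, true, prefix + Character.MAX_VALUE, false);
        Result result = new Result();
        for (String term : candidates) {
            result.scored++;
            if (similarity(query, term) >= minSimilarity) result.matches.add(term);
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(
                Arrays.asList("apple", "apply", "ample", "maple", "angle", "zebra"));
        for (int prefixLength = 0; prefixLength <= 2; prefixLength++) {
            Result r = search(terms, "apple", prefixLength, 0.7);
            System.out.println("prefix=" + prefixLength
                    + " scored=" + r.scored + " matches=" + r.matches);
        }
    }
}
```

The sketch also shows the trade-off we would be accepting: with a prefix length of two, a term such as "ample" is never scored against "apple", so it can no longer match even though it is only one edit away.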