Manipulating a Fuzzy Query's Prefix Length

Kyle Lee Wed, 20 Jul 2011 08:09:34 -0700

We're performing fuzzy searches on a field with a large number of
unique terms. Specifying a minimum similarity of 0.7 results in a
query execution time of 13-15 seconds, in stark contrast to our
average query time of 40ms.


We suspect that the performance problem stems from the enumeration of
every unique term in the index. The Lucene documentation for FuzzyQuery
supports this theory with the following warning:

"*Warning:* this query is not very scalable with its default prefix length
of 0 - in this case, *every* term will be enumerated and cause an edit score
calculation."

We would therefore like to set the prefix length to one or two, requiring
that the first one or two characters match and thereby substantially
reducing the number of terms enumerated. Is this possible with Solr? If so,
I haven't yet found a way to do it. Any help would be greatly appreciated.
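
To illustrate the effect we're after, here is a rough, self-contained sketch
in plain Java. It is not Lucene's or Solr's actual code: the class and method
names (PrefixFuzzySketch, candidates, fuzzyMatch) are made up for this
example, and it uses a simple maximum edit count in place of a minimum
similarity score. It only shows how a non-zero prefix length shrinks the set
of terms that need an edit distance calculation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene or Solr.
class PrefixFuzzySketch {

    // Standard Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Terms that must be scored: those sharing the first prefixLength
    // characters with the query. A prefix length of 0 keeps every term.
    static List<String> candidates(List<String> terms, String query, int prefixLength) {
        String prefix = query.substring(0, Math.min(prefixLength, query.length()));
        List<String> result = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                result.add(term);
            }
        }
        return result;
    }

    // Scores only the candidates and keeps those within maxEdits
    // (an assumed stand-in for the minimum similarity threshold).
    static List<String> fuzzyMatch(List<String> terms, String query,
                                   int prefixLength, int maxEdits) {
        List<String> hits = new ArrayList<>();
        for (String term : candidates(terms, query, prefixLength)) {
            if (editDistance(term, query) <= maxEdits) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "lucerne", "lunch",
                                           "solr", "search", "query", "luxury");
        System.out.println(candidates(terms, "lucene", 0).size()); // 8
        System.out.println(candidates(terms, "lucene", 2).size()); // 5
        System.out.println(fuzzyMatch(terms, "lucene", 2, 1));
    }
}
```

With a prefix length of 2, only the five terms starting with "lu" are scored
instead of all eight, and the matches within one edit are unchanged. The
trade-off is that a term differing from the query in its first two characters
can no longer match.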
